Vector Search best practices

Use these practices to get better relevance, recall, and precision from Vector Search and to troubleshoot when results are poor.

Relevance

  • Right collection — Query the collection that contains the document set for your use case (contracts, invoices, policies). Avoid querying a single “everything” collection without filters.
  • Metadata filtering — Use filters (source, date, document type) to scope results. See Filtering.
  • Chunk quality — Smart bites should be meaningful units (e.g. a section, a clause). Poor chunking (e.g. mid-sentence splits) hurts relevance. Tune extraction and chunking in your pipeline. See Extraction overview and Partitioning & chunking.
  • Freshness — Keep the catalog updated with workflow orchestration so new and updated documents are searchable. Stale data leads to missing or outdated context. See Vector Catalog: Lineage.
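The metadata-filtering practice above can be sketched in a few lines. This is a minimal illustration only: the `build_filters` helper, the field names, and the `client.search` call shown in the comment are hypothetical and should be adapted to your Vector Search API's actual filter syntax.

```python
def build_filters(doc_type=None, source=None, date_from=None):
    """Assemble a metadata filter dict, skipping unset fields.

    Field names (document_type, source, date) are illustrative;
    use the metadata keys defined in your collection schema.
    """
    filters = {}
    if doc_type:
        filters["document_type"] = doc_type
    if source:
        filters["source"] = source
    if date_from:
        # Range filter: only documents dated on/after date_from.
        filters["date"] = {"gte": date_from}
    return filters

filters = build_filters(doc_type="invoice", date_from="2024-01-01")
# Hypothetical usage: client.search("payment terms",
#                                   collection="invoices",
#                                   filters=filters)
```

Building filters in one place makes it easy to relax them later when troubleshooting recall (see below): drop the date range first, then the document type, and compare results.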

Recall (finding the right docs)

  • Increase k — Retrieve more candidates (e.g. top 10 or 20) and then re-rank or filter in your app if the answer isn’t in the top 3.
  • Broaden filters — If you over-filter (e.g. narrow date range), relax filters to see if the right document exists in the collection.
  • Check indexing — Ensure the document was successfully extracted and ingested. Use source lineage and run history to confirm. See Extraction runs.
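The "increase k, then re-rank in your app" pattern can look like the sketch below. The hit shape (dicts with `text` and `score`) and the keyword-overlap boost are assumptions for illustration; in practice you might re-rank with a cross-encoder or your own business rules.

```python
def rerank(hits, query_terms, top_n=3):
    """Over-retrieve (e.g. k=10 or 20), then keep the best top_n
    after boosting hits that literally contain the query terms."""
    def boosted(hit):
        overlap = sum(t.lower() in hit["text"].lower() for t in query_terms)
        return hit["score"] + 0.1 * overlap  # small, illustrative boost
    return sorted(hits, key=boosted, reverse=True)[:top_n]

# Example candidate set, as if retrieved with a larger k:
hits = [
    {"text": "Net 30 payment terms apply.", "score": 0.71},
    {"text": "Unrelated shipping clause.", "score": 0.74},
    {"text": "Payment due within 30 days of invoice.", "score": 0.69},
]
best = rerank(hits, ["payment", "terms"])
# The shipping clause, despite the highest raw score, drops below
# the two hits that actually mention payment terms.
```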

Precision (avoiding wrong docs)

  • Tighten filters — Restrict by document type, source, or date so irrelevant docs are excluded.
  • Confidence — Filter out or down-rank bites with low extraction confidence if they introduce wrong context. See Extraction best practices.
  • Surface lineage — Show source lineage in the UI so users can ignore or report bad sources. See Agents grounding.
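The confidence practice above can be sketched as a post-processing step. The `confidence` and `score` fields and the threshold value are hypothetical; use whatever confidence signal your extraction pipeline actually emits.

```python
def downrank_low_confidence(hits, min_confidence=0.8, penalty=0.5):
    """Halve the score of hits below the confidence threshold
    rather than dropping them, so they stay available as fallback."""
    adjusted = []
    for hit in hits:
        conf = hit.get("confidence", 1.0)  # assume high if not reported
        if conf < min_confidence:
            hit = {**hit, "score": hit["score"] * penalty}
        adjusted.append(hit)
    return sorted(adjusted, key=lambda h: h["score"], reverse=True)
```

Down-ranking (rather than hard-filtering) is a deliberate trade-off: it keeps precision high in the top results without silently hiding documents that might still be the only match.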

Troubleshooting poor recall or precision

  1. Confirm the document is in the collection — Check run history and lineage; verify the collection and schema.
  2. Inspect chunks — Look at the smart bites for that document. If chunking is wrong (e.g. too small or too large), adjust extraction/chunking settings.
  3. Try the query — Run the same query in the Platform or via the API and inspect the returned chunks. Add or relax filters and re-run.
  4. Check embedding model — Ensure query and index use the same model; confirm the model fits your language and domain.
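Step 4 can be automated as a small consistency check. The config shape below (`embedding_model`, `embedding_dim` keys) is an assumption; compare whichever settings your index and query pipeline actually record.

```python
def check_embedding_config(index_config, query_config):
    """Return a list of mismatches between index-time and
    query-time embedding settings; empty list means consistent."""
    problems = []
    for key in ("embedding_model", "embedding_dim"):
        if index_config.get(key) != query_config.get(key):
            problems.append(
                f"{key}: index={index_config.get(key)!r} "
                f"query={query_config.get(key)!r}"
            )
    return problems
```

A mismatch here (different model, or even a different dimension of the same family) typically produces results that look random rather than merely imprecise, so it is worth ruling out early.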

RAG and agents

  • Pass enough context — Send enough retrieved chunks to the LLM so the answer can be grounded in the right passage. Balance token limits and relevance.
  • Cite sources — Always attach source lineage to the response so users can verify grounded answers. See Grounding from catalog.
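Both practices in this section can be combined in one packing step: fill the prompt with the most relevant chunks up to a token budget, and carry each chunk's lineage along for citation. The `text` and `lineage` fields and the crude 4-characters-per-token estimate are assumptions; use your model's real tokenizer for an accurate count.

```python
def build_context(hits, max_tokens=2000):
    """Pack relevance-sorted hits into a context string under a token
    budget, returning the context plus matching lineage citations."""
    context, citations, used = [], [], 0
    for hit in hits:
        cost = len(hit["text"]) // 4 + 1  # rough token estimate
        if used + cost > max_tokens:
            break  # budget exhausted; remaining hits are less relevant
        context.append(hit["text"])
        citations.append(hit["lineage"])  # e.g. source document id/page
        used += cost
    return "\n\n".join(context), citations
```

Returning citations alongside the context makes it straightforward to attach source lineage to the final answer, so users can verify where each grounded claim came from.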

Next steps