Vector Search best practices

Use these practices to get better relevance, recall, and precision from Vector Search and to troubleshoot when results are poor.

Relevance

  • Right collection — Query the collection that contains the document set for your use case (contracts, invoices, policies). Avoid querying a single “everything” collection without filters.
  • Metadata filtering — Use filters (source, date, document type) to scope results. See Filtering.
  • Chunk quality — Smart bites should be meaningful units (e.g. a section, a clause). Poor chunking (e.g. mid-sentence splits) hurts relevance. Tune extraction and chunking in your pipeline. See Extraction overview and Partitioning & chunking.
  • Freshness — Keep the catalog updated with workflow orchestration so new and updated documents are searchable. Stale data leads to missing or outdated context. See Vector Catalog: Lineage.
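The metadata-filtering practice above can be sketched in a few lines. This is a minimal illustration only: the `build_filters` helper, the field names, and the `client.search` call shown in the comment are hypothetical and should be adapted to your Vector Search API's actual filter syntax.

```python
def build_filters(doc_type=None, source=None, date_from=None):
    """Assemble a metadata filter dict, skipping unset fields.

    Field names (document_type, source, date) are illustrative;
    use the metadata keys defined in your collection schema.
    """
    filters = {}
    if doc_type:
        filters["document_type"] = doc_type
    if source:
        filters["source"] = source
    if date_from:
        # Range filter: only documents dated on/after date_from.
        filters["date"] = {"gte": date_from}
    return filters

filters = build_filters(doc_type="invoice", date_from="2024-01-01")
# Hypothetical usage: client.search("payment terms",
#                                   collection="invoices",
#                                   filters=filters)
```

Building filters in one place makes it easy to relax them later when troubleshooting recall (see below): drop the date range first, then the document type, and compare results.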

Recall (finding the right docs)

  • Increase k — Retrieve more candidates (e.g. top 10 or 20) and then re-rank or filter in your app if the answer isn’t in the top 3.
  • Broaden filters — If you over-filter (e.g. narrow date range), relax filters to see if the right document exists in the collection.
  • Check indexing — Ensure the document was successfully extracted and ingested. Use source lineage and run history to confirm. See Extraction runs.
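The "increase k, then re-rank in your app" pattern can look like the sketch below. The hit shape (dicts with `text` and `score`) and the keyword-overlap boost are assumptions for illustration; in practice you might re-rank with a cross-encoder or your own business rules.

```python
def rerank(hits, query_terms, top_n=3):
    """Over-retrieve (e.g. k=10 or 20), then keep the best top_n
    after boosting hits that literally contain the query terms."""
    def boosted(hit):
        overlap = sum(t.lower() in hit["text"].lower() for t in query_terms)
        return hit["score"] + 0.1 * overlap  # small, illustrative boost
    return sorted(hits, key=boosted, reverse=True)[:top_n]

# Example candidate set, as if retrieved with a larger k:
hits = [
    {"text": "Net 30 payment terms apply.", "score": 0.71},
    {"text": "Unrelated shipping clause.", "score": 0.74},
    {"text": "Payment due within 30 days of invoice.", "score": 0.69},
]
best = rerank(hits, ["payment", "terms"])
# The shipping clause, despite the highest raw score, drops below
# the two hits that actually mention payment terms.
```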

Precision (avoiding wrong docs)

  • Tighten filters — Restrict by document type, source, or date so irrelevant docs are excluded.
  • Confidence — Filter out or down-rank bites with low extraction confidence if they introduce wrong context. See Extraction best practices.
  • Surface lineage — Show source lineage in the UI so users can ignore or report bad sources. See Agents grounding.
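The confidence practice above can be sketched as a post-processing step. The `confidence` and `score` fields and the threshold value are hypothetical; use whatever confidence signal your extraction pipeline actually emits.

```python
def downrank_low_confidence(hits, min_confidence=0.8, penalty=0.5):
    """Halve the score of hits below the confidence threshold
    rather than dropping them, so they stay available as fallback."""
    adjusted = []
    for hit in hits:
        conf = hit.get("confidence", 1.0)  # assume high if not reported
        if conf < min_confidence:
            hit = {**hit, "score": hit["score"] * penalty}
        adjusted.append(hit)
    return sorted(adjusted, key=lambda h: h["score"], reverse=True)
```

Down-ranking (rather than hard-filtering) is a deliberate trade-off: it keeps precision high in the top results without silently hiding documents that might still be the only match.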

Troubleshooting poor recall or precision

  1. Confirm the document is in the collection — Check run history and lineage; verify the collection and schema.
  2. Inspect chunks — Look at the smart bites for that document. If chunking is wrong (e.g. too small or too large), adjust extraction/chunking settings.
  3. Try the query — Run the same query in the Platform or via the API and inspect the returned chunks. Add or relax filters and re-run.
  4. Check embedding model — Ensure query and index use the same model; confirm the model fits your language and domain.
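Step 4 can be automated as a small consistency check. The config shape below (`embedding_model`, `embedding_dim` keys) is an assumption; compare whichever settings your index and query pipeline actually record.

```python
def check_embedding_config(index_config, query_config):
    """Return a list of mismatches between index-time and
    query-time embedding settings; empty list means consistent."""
    problems = []
    for key in ("embedding_model", "embedding_dim"):
        if index_config.get(key) != query_config.get(key):
            problems.append(
                f"{key}: index={index_config.get(key)!r} "
                f"query={query_config.get(key)!r}"
            )
    return problems
```

A mismatch here (different model, or even a different dimension of the same family) typically produces results that look random rather than merely imprecise, so it is worth ruling out early.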

RAG and agents

  • Pass enough context — Send enough retrieved chunks to the LLM so the answer can be grounded in the right passage. Balance token limits and relevance.
  • Cite sources — Always attach source lineage to the response so users can verify grounded answers. See Grounding from catalog.
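Both practices in this section can be combined in one packing step: fill the prompt with the most relevant chunks up to a token budget, and carry each chunk's lineage along for citation. The `text` and `lineage` fields and the crude 4-characters-per-token estimate are assumptions; use your model's real tokenizer for an accurate count.

```python
def build_context(hits, max_tokens=2000):
    """Pack relevance-sorted hits into a context string under a token
    budget, returning the context plus matching lineage citations."""
    context, citations, used = [], [], 0
    for hit in hits:
        cost = len(hit["text"]) // 4 + 1  # rough token estimate
        if used + cost > max_tokens:
            break  # budget exhausted; remaining hits are less relevant
        context.append(hit["text"])
        citations.append(hit["lineage"])  # e.g. source document id/page
        used += cost
    return "\n\n".join(context), citations
```

Returning citations alongside the context makes it straightforward to attach source lineage to the final answer, so users can verify where each grounded claim came from.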

Next steps