Vector Search best practices
Use these practices to improve relevance, recall, and precision in Vector Search, and to troubleshoot when results are poor.
Relevance
- Right collection — Query the collection that contains the document set for your use case (contracts, invoices, policies). Avoid querying a single “everything” collection without filters.
- Metadata filtering — Use filters (source, date, document type) to scope results. See Filtering.
- Chunk quality — Smart bites should be meaningful units (e.g. a section, a clause). Poor chunking (e.g. mid-sentence splits) hurts relevance. Tune extraction and chunking in your pipeline. See Extraction overview and Partitioning & chunking.
- Freshness — Keep the catalog updated with workflow orchestration so new and updated documents are searchable. Stale data leads to missing or outdated context. See Vector Catalog: Lineage.
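Chunk quality is the practice teams most often get wrong. As a rough illustration of "meaningful units", here is a minimal chunker that splits on paragraph boundaries instead of at arbitrary character offsets, so a chunk never starts mid-sentence. This is a sketch, not the platform's chunking implementation; the `max_chars` budget is an assumed knob.

```python
def chunk_by_paragraph(text: str, max_chars: int = 800) -> list[str]:
    """Split text on blank lines so each chunk is a meaningful unit
    (a paragraph, section, or clause) rather than a mid-sentence slice.
    Adjacent paragraphs are packed together until max_chars is reached."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk if adding this paragraph would overflow the budget.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

A fixed-width splitter would cut "the party shall indemnify…" in half; splitting on paragraph boundaries keeps each clause retrievable as a whole.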
Recall (finding the right docs)
- Increase k — Retrieve more candidates (e.g. the top 10 or 20) and re-rank or filter them in your app if the answer isn't in the top 3.
- Broaden filters — If you over-filter (e.g. narrow date range), relax filters to see if the right document exists in the collection.
- Check indexing — Ensure the document was successfully extracted and ingested. Use source lineage and run history to confirm. See Extraction runs.
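The "increase k, then re-rank" pattern above can be sketched in a few lines. The candidate shape (`{"text": ...}`) and the lexical-overlap scorer are illustrative assumptions; in practice you would over-fetch from the index and apply whatever re-ranker (cross-encoder, business rules) fits your app.

```python
def rerank(candidates: list[dict], query_terms: list[str]) -> list[dict]:
    """Cheap lexical re-rank of an over-fetched candidate list:
    fetch e.g. the top 20 from the index, then promote chunks that
    contain more of the query's terms."""
    def overlap(chunk: dict) -> int:
        text = chunk["text"].lower()
        return sum(term.lower() in text for term in query_terms)
    # Python's sort is stable, so ties keep the index's original order.
    return sorted(candidates, key=overlap, reverse=True)
```

Because the sort is stable, chunks with equal term overlap keep the vector index's ranking, so the re-rank only reorders where the lexical signal actually disagrees.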
Precision (avoiding wrong docs)
- Tighten filters — Restrict by document type, source, or date so irrelevant docs are excluded.
- Confidence — Filter out or down-rank bites with low extraction confidence if they introduce wrong context. See Extraction best practices.
- Surface lineage — Show source lineage in the UI so users can ignore or report bad sources. See Agents grounding.
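One way to combine the confidence bullet with down-ranking is a post-filter over search hits. The field names (`score`, `extraction_confidence`) and the blending formula are assumptions for illustration, not the platform's schema.

```python
def apply_confidence(hits: list[dict], floor: float = 0.5,
                     penalty: float = 0.5) -> list[dict]:
    """Drop bites below an extraction-confidence floor and down-weight
    the rest, so shaky extractions rank lower instead of polluting
    the top results."""
    kept = []
    for h in hits:
        conf = h.get("extraction_confidence", 1.0)
        if conf < floor:
            continue  # hard filter: confidence too low to trust
        # Soft down-rank: scale the similarity score by confidence.
        adjusted = h["score"] * (penalty + (1 - penalty) * conf)
        kept.append({**h, "score": adjusted})
    return sorted(kept, key=lambda h: h["score"], reverse=True)
```

Tune `floor` and `penalty` against your own data: too aggressive a floor hurts recall, too lenient a penalty hurts precision.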
Troubleshooting poor recall or precision
- Confirm the document is in the collection — Check run history and lineage; verify the collection and schema.
- Inspect chunks — Look at the smart bites for that document. If chunking is wrong (e.g. too small or too large), adjust extraction/chunking settings.
- Try the query — Run the same query in the Platform or via the API and inspect the returned chunks. Add or relax filters and re-run.
- Check embedding model — Ensure query and index use the same model; confirm the model fits your language and domain.
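The embedding-model check in the last bullet is worth automating, because a query embedded with a different model (or dimension) than the index fails silently: you get results, just bad ones. A guard like the following can run before every query. The metadata keys (`embedding_model`, `dimension`) are illustrative, not a real API.

```python
def check_query(index_meta: dict, query_model: str,
                query_vec: list[float]) -> bool:
    """Fail fast if the query embedding does not match the index:
    a model or dimension mismatch produces plausible-looking but
    meaningless similarity scores."""
    if query_model != index_meta["embedding_model"]:
        raise ValueError(
            f"query model {query_model!r} != index model "
            f"{index_meta['embedding_model']!r}"
        )
    if len(query_vec) != index_meta["dimension"]:
        raise ValueError(
            f"query vector has {len(query_vec)} dims, "
            f"index expects {index_meta['dimension']}"
        )
    return True
```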
RAG and agents
- Pass enough context — Send enough retrieved chunks to the LLM so the answer can be grounded in the right passage. Balance token limits and relevance.
- Cite sources — Always attach source lineage to the response so users can verify grounded answers. See Grounding from catalog.
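The two RAG bullets above amount to: pack ranked chunks into the prompt until the token budget runs out, keeping lineage attached so every passage is citable. A minimal sketch, assuming chunks carry `source` and `text` fields and approximating tokens as words (swap in your model's tokenizer):

```python
def build_context(chunks: list[dict], max_tokens: int = 2000) -> str:
    """Pack the highest-ranked chunks into the prompt until a rough
    token budget is hit. Each passage is prefixed with its source
    lineage so the LLM's answer can cite it."""
    parts, used = [], 0
    for c in chunks:
        cost = len(c["text"].split())  # crude token estimate
        if used + cost > max_tokens:
            break  # chunks are ranked, so stop at the budget
        parts.append(f'[{c["source"]}] {c["text"]}')
        used += cost
    return "\n\n".join(parts)
```

Because the loop stops at the budget rather than truncating mid-chunk, the model never sees a half-passage, which is what makes grounding verifiable.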
Next steps
- Semantic retrieval — How retrieval works.
- Filtering — Metadata filters.
- Extraction best practices — Quality and confidence.