Semantic retrieval
Semantic retrieval in Vector Search finds smart bites by meaning, not just keyword overlap. The user’s query (e.g. “What is the refund policy?”) is converted into an embedding and compared against the embeddings of indexed chunks. The closest matches are returned, so RAG pipelines and agents get context relevant to the question even when its wording differs from the document’s.
How it works
- Indexing — During ingestion, each smart bite is embedded using the collection’s embedding model. Embeddings are stored in the Vector Catalog with metadata and source lineage.
- Query — The user or agent sends a natural-language query (or a pre-computed embedding). The query is embedded with the same model.
- Similarity — The system finds the indexed embeddings most similar to the query embedding (e.g. cosine similarity or another distance metric). Results are ranked by relevance.
- Return — Top-k chunks are returned with metadata and lineage so the app or LLM can use them for grounded answers.
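The similarity and ranking steps above can be sketched in plain Python. This is a minimal illustration, not the product’s implementation: the toy 3-dimensional vectors stand in for real model embeddings, which typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, index, k=2):
    # Score every indexed chunk against the query embedding,
    # then return the k highest-scoring (chunk_id, score) pairs.
    scored = [
        (chunk_id, cosine_similarity(query_emb, emb))
        for chunk_id, emb in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy index: chunk id -> embedding.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-terms": [0.1, 0.8, 0.1],
    "privacy-notice": [0.0, 0.2, 0.9],
}

# Toy embedding of "What is the refund policy?".
query = [0.85, 0.15, 0.05]
print(top_k(query, index))  # "refund-policy" ranks first
```

Note that the query never has to share words with a chunk: ranking depends only on where the vectors sit relative to each other, which is what makes the search semantic.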
When to use semantic retrieval
- RAG — Retrieve context for an LLM so answers are based on your documents. Semantic search improves over keyword search when users ask in natural language.
- Agents — Ground agent responses with retrieved chunks; semantic retrieval helps pull in the right documents even when the agent’s query is phrased loosely.
- Search UX — Let users ask questions in their own words over contracts, policies, or operational docs.
Choosing a collection
- Run semantic search per collection (or over collections that share the same embedding model and schema). Choose the collection that contains the document set relevant to the query (e.g. contracts, invoices, policy docs).
- Use filtering (metadata) to narrow results within a collection. See Filtering.
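Metadata filtering can be sketched as a pre-pass over chunk records before similarity ranking. The field names below (`source`, `updated`) are illustrative, not the actual catalog schema:

```python
from datetime import date

# Toy chunk records: in a vector catalog each chunk carries
# metadata alongside its embedding.
chunks = [
    {"id": "c1", "source": "contracts", "updated": date(2024, 5, 1)},
    {"id": "c2", "source": "policies",  "updated": date(2023, 1, 10)},
    {"id": "c3", "source": "policies",  "updated": date(2024, 8, 3)},
]

def filter_chunks(chunks, source=None, updated_after=None):
    # Apply metadata filters before similarity ranking, so chunks
    # from the wrong source or date range never compete for top-k slots.
    out = chunks
    if source is not None:
        out = [c for c in out if c["source"] == source]
    if updated_after is not None:
        out = [c for c in out if c["updated"] > updated_after]
    return out

print(filter_chunks(chunks, source="policies",
                    updated_after=date(2024, 1, 1)))  # keeps only c3
```

Filtering first is a design choice: it shrinks the candidate set, which both speeds up ranking and removes plausible-looking but out-of-scope matches.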
Embedding model and language
- The embedding model is configured at the collection level. Use a model that matches your language and domain.
- Same model for indexing and querying — Query embeddings must use the same model as the collection; Bundata handles this when you use the search API.
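The same-model requirement exists because vectors from different embedding models live in different spaces, so similarity scores between them are meaningless. A minimal sketch of the guard a collection might enforce (class and method names are hypothetical, not Bundata’s API — as noted above, the search API applies this check for you):

```python
class Collection:
    def __init__(self, name, embedding_model):
        self.name = name
        # The embedding model is fixed at the collection level.
        self.embedding_model = embedding_model

    def search(self, query_embedding, query_model):
        # Reject query vectors produced by a different model:
        # comparing across embedding spaces yields garbage rankings,
        # so failing loudly is safer than returning bad results.
        if query_model != self.embedding_model:
            raise ValueError(
                f"query embedded with {query_model!r}, "
                f"but collection uses {self.embedding_model!r}"
            )
        return []  # similarity search over indexed chunks would run here
```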
Common pitfalls
- Empty or stale collection — Run extraction and ingestion first; keep the catalog updated via workflow orchestration.
- Wrong collection — Querying the wrong collection returns irrelevant results. Scope by use case and document type.
- No metadata filtering — For large collections, add filters (e.g. date, source) to improve relevance and reduce noise.
Next steps
- Vector Search overview — Concepts and use cases.
- Filtering — Metadata filters.
- Best practices — Recall, precision, troubleshooting.