Vector Search overview

Vector Search lets you query the Vector Catalog with natural language or embeddings to retrieve relevant smart bites (the indexed chunks of your documents). Typical uses are retrieval-augmented generation (RAG), agent grounding, and semantic search over your document base.

What Vector Search does

  • Semantic retrieval — Find content by meaning, not just keywords. Queries are embedded and matched against indexed smart bites.
  • Filtering — Restrict results by metadata (source, date, document type, etc.).
  • Grounded answers — Returned chunks include source lineage so you can cite and trace answers.

When to use Vector Search

  • You are building RAG — Retrieve context for an LLM from the Vector Catalog.
  • You are building agents — Ground agent responses with catalog results.
  • You need semantic search — Let users ask questions in natural language over your docs.
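
As a rough illustration of the semantic retrieval, filtering, and source lineage behavior described under "What Vector Search does", here is a minimal in-memory sketch. It is not the Bundata API: the SmartBite class, the search function, the metadata keys, and the 3-dimensional toy vectors are all hypothetical stand-ins, and cosine similarity is assumed as the matching metric.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SmartBite:
    """Hypothetical indexed chunk with the metadata used for filtering and lineage."""
    text: str
    vector: list[float]
    metadata: dict = field(default_factory=dict)  # e.g. source, doc_type


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vector, filters=None, top_k=3):
    """Return up to top_k (score, bite) pairs that pass the metadata filters."""
    filters = filters or {}
    # Apply metadata filters first so only matching bites are scored.
    candidates = [
        b for b in index
        if all(b.metadata.get(k) == v for k, v in filters.items())
    ]
    scored = [(cosine(query_vector, b.vector), b) for b in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors stand in for real embeddings (assumption for illustration).
index = [
    SmartBite("Refunds are issued within 14 days.", [0.9, 0.1, 0.0],
              {"source": "policy.pdf", "doc_type": "policy"}),
    SmartBite("Invoices are sent monthly.", [0.1, 0.9, 0.0],
              {"source": "billing.pdf", "doc_type": "policy"}),
    SmartBite("Refund requests go through the portal.", [0.8, 0.2, 0.1],
              {"source": "faq.html", "doc_type": "faq"}),
]

results = search(index, [1.0, 0.0, 0.0], filters={"doc_type": "policy"}, top_k=1)
for score, bite in results:
    # Surface the source alongside the text so the answer can be traced.
    print(f"{score:.2f}  {bite.text}  (source: {bite.metadata['source']})")
```

The filter is applied before scoring, so the FAQ entry is never ranked even though its vector is close to the query. In a real deployment the embedding and matching happen inside the service; the sketch only shows the shape of the idea.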

How it fits with the rest of Bundata

  • Vector Catalog — Search runs over collections in the catalog. Ensure extraction and ingestion pipelines have populated the catalog.
  • Schemas — Metadata and structure from your schema are available for filtering and display.
  • Agents — Agents use Vector Search (or equivalent API) to fetch context before generating answers.
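
The step where an agent fetches context before generating an answer could look something like the sketch below. The build_grounded_prompt function and the result shape (dicts with "text" and "source" keys) are assumptions made for illustration; the actual fields returned by Vector Search may differ.

```python
def build_grounded_prompt(question: str, results: list[dict]) -> str:
    """Format retrieved results as numbered, cited context for an LLM prompt."""
    context_lines = [
        f"[{i}] {r['text']} (source: {r['source']})"
        for i, r in enumerate(results, start=1)
    ]
    context = "\n".join(context_lines)
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical results, shaped as a search call might return them.
results = [
    {"text": "Refunds are issued within 14 days.", "source": "policy.pdf"},
    {"text": "Refund requests go through the portal.", "source": "faq.html"},
]

prompt = build_grounded_prompt("How long do refunds take?", results)
print(prompt)
```

Numbering each context entry and carrying its source into the prompt lets the model cite by number and lets you map each citation back to a document.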

Choosing a collection

Search is scoped to collections in the Vector Catalog. Each collection typically corresponds to a schema or use case. Choose the collection that matches the documents and query intent you need.

Common mistakes

  • Searching an empty or stale collection — Run ingestion and extraction first; ensure workflows keep the catalog up to date.
  • No filtering — Use metadata filters to narrow results and improve relevance.
  • Ignoring source lineage — Always surface source and document in UI or agent responses for trust and debugging.
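
One way to guard against the first mistake is to check a collection before searching it. The sketch below assumes you can obtain a document count and a last-ingestion timestamp for a collection; how you get those values, the collection_problem helper, and the one-day freshness threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def collection_problem(
    doc_count: int,
    last_ingested_at: Optional[datetime],
    max_age: timedelta = timedelta(days=1),
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Return a description of the problem, or None if the collection looks searchable."""
    now = now or datetime.now(timezone.utc)
    if doc_count == 0 or last_ingested_at is None:
        return "collection is empty; run ingestion and extraction first"
    if now - last_ingested_at > max_age:
        return f"collection is stale; last ingested {last_ingested_at.isoformat()}"
    return None


# Fixed timestamps keep the example deterministic.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(collection_problem(0, None, now=now))
print(collection_problem(120, datetime(2024, 1, 1, tzinfo=timezone.utc), now=now))
print(collection_problem(120, datetime(2024, 1, 9, 12, tzinfo=timezone.utc), now=now))
```

The right freshness threshold depends on how often your workflows run; one day is only a placeholder.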

Troubleshooting

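  • Poor recall or irrelevant results — Check that the collection is populated and up to date. If results are irrelevant, tighten metadata filters; if expected results are missing, check that the filters are not too restrictive. See Best practices.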
  • Slow queries — Reduce result set size or scope to fewer collections; check Limits and quotas.

Next steps