Knowledge assistants
Knowledge assistants are AI copilots that answer questions and perform tasks using your documents—contracts, policies, procurement records, legal filings. Bundata supplies the document intelligence layer: context-aware extraction, smart bites in the Vector Catalog, and Vector Search so the assistant can retrieve relevant context and deliver grounded answers with source lineage.
What you need from Bundata
- Extraction — Documents turned into schema-aware chunks with metadata and optional enrichment (NER, table description). See Extraction overview.
- Vector Catalog — Indexed smart bites and embeddings. See Vector Catalog overview.
- Vector Search — Semantic retrieval so the assistant gets the right chunks for each user question. See Vector Search overview and Grounding from catalog.
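To make the flow concrete, here is a minimal, self-contained sketch of the extract → index → search pipeline. It does not use Bundata's actual client API (those names are not shown on this page); the chunk fields and the bag-of-words "embedding" are stand-ins so the sketch runs on its own, while a real catalog stores dense embeddings from an embedding model.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words vector so the sketch runs without a model;
    # in practice the Vector Catalog holds dense model embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Smart bites": schema-aware chunks with metadata and source lineage.
# Field names here are illustrative, not Bundata's schema.
catalog = [
    {"text": "Termination of this agreement requires 30 days written notice.",
     "source": "msa_vendor_y.pdf", "section": "12. Termination"},
    {"text": "Refunds are issued within 14 days of an approved return.",
     "source": "refund_policy.md", "section": "Refunds"},
]
for bite in catalog:
    bite["embedding"] = embed(bite["text"])

def vector_search(question: str, k: int = 1) -> list[dict]:
    q = embed(question)
    return sorted(catalog, key=lambda b: cosine(q, b["embedding"]), reverse=True)[:k]

hits = vector_search("What is the termination clause?")
print(hits[0]["source"], "-", hits[0]["section"])  # → msa_vendor_y.pdf - 12. Termination
```

Note that each retrieved bite carries its `source` and `section` along with the text, which is what lets the assistant attach lineage to its answers later.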
Use cases
- Contract assistant — “What’s the termination clause in this agreement?” Query the catalog for the contract, retrieve relevant sections, and ground the answer with citations.
- Policy assistant — “What is our refund policy?” Search policy docs, return matching sections, and let the LLM summarize with source links.
- Legal / compliance — “Which contracts mention liability cap X?” Filter by metadata and search; return a list with source lineage for review.
- Procurement / ops — “What did we agree with vendor Y?” Search contracts and invoices; ground answers in extracted terms and line items.
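The legal/compliance case above combines a metadata filter with a text match. A hedged in-memory sketch (the document fields and `find_mentions` helper are hypothetical, not part of Bundata's API):

```python
documents = [
    {"text": "Liability is capped at the fees paid in the prior 12 months.",
     "doc_type": "contract", "source": "msa_vendor_y.pdf", "section": "9. Liability"},
    {"text": "Liability cap discussions are summarized in the appendix.",
     "doc_type": "memo", "source": "notes_2024.md", "section": "Appendix"},
    {"text": "Payment is due within 30 days of invoice.",
     "doc_type": "contract", "source": "order_form.pdf", "section": "4. Payment"},
]

def find_mentions(term: str, doc_type: str) -> list[dict]:
    # Apply the cheap metadata filter first, then match the text; each hit
    # keeps its source lineage so reviewers can open the exact section.
    return [
        {"source": d["source"], "section": d["section"]}
        for d in documents
        if d["doc_type"] == doc_type and term.lower() in d["text"].lower()
    ]

hits = find_mentions("liability", doc_type="contract")
# → [{"source": "msa_vendor_y.pdf", "section": "9. Liability"}]
```

The memo that also mentions a liability cap is excluded by the `doc_type` filter, which is exactly why strong metadata matters: it keeps the review list scoped to contracts.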
Design patterns
- Index by use case — Use separate collections (or strong metadata) for contracts, policies, and invoices so the assistant can scope queries. See Collections.
- Retrieve then generate — Run Vector Search with the user’s question, pass top-k smart bites to the LLM, and generate an answer. Always attach source lineage to the response. See Grounding from catalog.
- Filter when possible — Use metadata (date, document type, source) so retrieval is relevant. See Vector Search: Filtering.
- Confidence and quality — Prefer bites with high extraction confidence; filter out or flag low-confidence content so the assistant doesn’t ground on unreliable text. See Extraction best practices.
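The retrieve-then-generate and confidence patterns above can be sketched together: filter low-confidence bites, assemble a prompt with numbered citations, and return the lineage alongside it. The bite shape, the `confidence` scale, and the threshold are assumptions for illustration, not Bundata's response format.

```python
MIN_CONFIDENCE = 0.8  # assumed 0–1 extraction-confidence scale

bites = [  # top-k results as a vector search might return them (shape assumed)
    {"text": "Termination requires 30 days written notice.",
     "source": "msa_vendor_y.pdf", "section": "12. Termination", "confidence": 0.95},
    {"text": "Ternimation reqires 3O days...",  # garbled low-confidence OCR text
     "source": "scan_004.pdf", "section": "?", "confidence": 0.41},
]

def build_prompt(question: str, bites: list[dict]) -> tuple[str, list[dict]]:
    # Drop unreliable bites, then number the rest so the LLM can cite [n].
    usable = [b for b in bites if b["confidence"] >= MIN_CONFIDENCE]
    context = "\n".join(
        f'[{i + 1}] ({b["source"]}, {b["section"]}) {b["text"]}'
        for i, b in enumerate(usable)
    )
    prompt = (
        "Answer using only the sources below. Cite sources as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # Source lineage travels with the response, never just the answer text.
    lineage = [{"source": b["source"], "section": b["section"]} for b in usable]
    return prompt, lineage

prompt, lineage = build_prompt("What is the termination clause?", bites)
# `prompt` goes to the LLM; `lineage` is attached to the assistant's answer.
```

Returning `lineage` as structured data (rather than hoping the LLM echoes it) is what makes the citations in the final answer verifiable.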
Common pitfalls
- Stale catalog — Schedule ingestion with workflow orchestration so new and updated docs stay searchable.
- No citations — Always show which document and (if possible) section each part of the answer came from.
- One-size-fits-all collection — Use multiple collections or strong metadata so the assistant doesn’t mix unrelated doc types.
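On the stale-catalog pitfall: the usual fix is to key catalog entries by document id and upsert on re-ingestion, so an updated document replaces its stale entry instead of coexisting with it. Bundata's workflow orchestration handles this for you; the sketch below only illustrates the idempotent-upsert idea with hypothetical fields.

```python
catalog: dict[str, dict] = {}  # entries keyed by document id

def upsert(doc_id: str, version: int, text: str, collection: str) -> None:
    # Replace the stale entry rather than appending a duplicate, so
    # search never surfaces superseded content.
    current = catalog.get(doc_id)
    if current and current["version"] >= version:
        return  # already up to date; out-of-order re-ingestion is a no-op
    catalog[doc_id] = {"version": version, "text": text, "collection": collection}

upsert("msa_vendor_y", 1, "Termination requires 60 days notice.", "contracts")
upsert("msa_vendor_y", 2, "Termination requires 30 days notice.", "contracts")

print(len(catalog), catalog["msa_vendor_y"]["version"])  # → 1 2
```

The version guard makes re-runs safe: replaying an old ingestion event cannot clobber a newer document.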
Next steps
- Grounding from catalog — Retrieve and pass context to the LLM.
- First agent — Build your first agent.
- Vector Search best practices — Relevance and troubleshooting.