SDK overview

Bundata provides official SDKs for popular languages so you can call the Bundata API from your code without building HTTP requests by hand. Each SDK handles authentication, request formatting, response parsing, and, where applicable, retries and pagination. Use an SDK for extraction, Vector Catalog search, and workflow triggers in your document intelligence pipelines.

What the SDK gives you

  • Typed clients — Methods for extraction, search, schemas, and workflows with clear parameters and return types.
  • Authentication — Set the API key once (e.g. via constructor or environment variable); the SDK attaches it to every request. See Authentication.
  • Errors — Exceptions or error types that map to API status and error codes, so you can branch on RATE_LIMITED, INVALID_SCHEMA, etc. See Error handling.
  • Convenience — Upload a file path or stream, pass a schema ID, and get back smart bites or run IDs without manually building multipart or JSON bodies.
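
To make the error-handling bullet concrete, here is a minimal sketch of branching on error codes. The `BundataError` class, its `status` and `code` attributes, and the 422 status used in the demo are assumptions made for illustration; the real SDK's error types may be named and shaped differently, so check Error handling for the actual names.

```python
class BundataError(Exception):
    """Hypothetical stand-in for the SDK's error type (not the real class)."""

    def __init__(self, status: int, code: str, message: str = ""):
        super().__init__(message or f"{status} {code}")
        self.status = status  # HTTP status returned by the API
        self.code = code      # API error code such as RATE_LIMITED


def next_action(err: BundataError) -> str:
    """Map an API error to what the caller should do next."""
    if err.code == "RATE_LIMITED" or err.status == 429:
        return "retry_later"   # back off, then try again
    if err.code == "INVALID_SCHEMA":
        return "fix_schema"    # retrying the same request will not help
    if 500 <= err.status < 600:
        return "retry_later"   # server-side failures are usually transient
    return "fail"


try:
    # Simulated failure; a real call would raise from inside the SDK.
    raise BundataError(422, "INVALID_SCHEMA", "unknown field type")
except BundataError as err:
    action = next_action(err)
```

Keeping the decision in one function means every call site reacts to the same error code in the same way.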

When to use the SDK vs the REST API

  • Use the SDK when you’re building an application in a supported language (e.g. Python, Node) and want the fastest path to extraction, search, and workflows. Ideal for scripts, backends, and automation.
  • Use the REST API when you’re in an unsupported language, need full control over requests (e.g. custom retry logic), or integrate via a generic HTTP client. See REST API overview.
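
For comparison, the sketch below shows roughly what building a request by hand involves when you use a generic HTTP client instead of the SDK. The host, the `/v1/search` path, the Bearer authorization scheme, and the JSON field names are all placeholders assumed for illustration, not the documented API; see REST API overview for the real values. The request is only constructed, never sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real API host


def build_search_request(api_key: str, collection: str, query: str) -> urllib.request.Request:
    """Assemble a POST request by hand: URL, JSON body, and headers."""
    body = json.dumps({"collection": collection, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/search",  # hypothetical path
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Read the key from the environment; the fallback is a dummy for local experiments.
api_key = os.environ.get("BUNDATA_API_KEY", "dummy-key")
request = build_search_request(api_key, "contracts", "termination notice period")
# Sending it would be: urllib.request.urlopen(request, timeout=30)
```

With the SDK, the URL, body encoding, and headers are handled for you; with raw HTTP you own all of them, which is the trade-off for full control.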

Supported languages and installation

  • Python — Install via pip (e.g. pip install bundata). Import the client, set the API key, and call client.extract.run(...), client.search.query(...), etc. Package name and version are in the product docs; see SDK quickstart.
  • Node / TypeScript — Install via npm (e.g. npm install @bundata/sdk). Same pattern: create client, set key, call methods. Check the repo or npm for the exact package name.

Other languages may be available; check the Bundata website or contact support for the current list.

Core operations

  • Extraction — Submit a document (file path, buffer, or URL) and schema ID; get a run ID or (for sync-style APIs) wait for smart bites. Use for one-off runs or batch jobs. See Extraction runs.
  • Search — Query a collection with natural language or an embedding; get ranked results with metadata and source lineage. Use in RAG or agent code to retrieve context for grounded answers. See Vector Search overview.
  • Workflows — Trigger a workflow run by ID (and optional params). Poll or use webhooks for completion. See Workflows overview.

Schemas and connectors may also be manageable through the SDK; see the SDK docs for your language.
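
When an extraction or workflow call returns a run ID rather than a result, one option mentioned above is to poll for completion. The following is a minimal sketch of such a loop. The status names (`succeeded`, `failed`) and the `get_status` callable are assumptions; in real code `get_status` would wrap whatever status lookup the SDK for your language provides.

```python
import time
from typing import Callable

TERMINAL_STATES = {"succeeded", "failed"}  # assumed status names


def wait_for_run(
    get_status: Callable[[str], str],
    run_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status(run_id) until the run reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(run_id)
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout} seconds")
        sleep(interval)


# Demo with a fake status source that finishes on the third poll.
_statuses = iter(["queued", "running", "succeeded"])
final_status = wait_for_run(lambda run_id: next(_statuses), "run_demo", sleep=lambda s: None)
```

Polling is the simpler option for scripts; for long-running jobs, webhooks avoid holding a process open while it waits.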

Configuration and best practices

  • API key — Set via environment variable (e.g. BUNDATA_API_KEY) or config; never hardcode. See Authentication.
  • Base URL — Override if you use an in-VPC or dedicated endpoint. Default is the standard API host for your plan.
  • Timeouts and retries — The SDK may expose timeout and retry options. Retry on 429 and 5xx responses, and respect rate limits. See Rate limits and Error handling.
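
The three settings above can be sketched as follows, for cases where you manage them yourself rather than through SDK options. `BUNDATA_API_KEY` comes from this page; `BUNDATA_BASE_URL`, `BUNDATA_TIMEOUT`, the `ApiError` class, and the backoff parameters are hypothetical names and values chosen for illustration.

```python
import os
import random
import time


class ApiError(Exception):
    """Hypothetical error carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def load_settings(env=os.environ) -> dict:
    """Read configuration from the environment instead of hardcoding it."""
    api_key = env.get("BUNDATA_API_KEY")
    if not api_key:
        raise RuntimeError("BUNDATA_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("BUNDATA_BASE_URL"),  # None means: use the default host
        "timeout": float(env.get("BUNDATA_TIMEOUT", "30")),
    }


def is_retryable(status: int) -> bool:
    """Retry only on rate limiting (429) and server errors (5xx)."""
    return status == 429 or 500 <= status < 600


def call_with_retries(fn, max_attempts: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Call fn(), retrying retryable ApiErrors with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ApiError as err:
            if not is_retryable(err.status) or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

If the SDK already retries internally, avoid wrapping its calls in a second retry loop, since stacked retries multiply the request volume against your rate limit.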

Example workflow with the SDK

  1. Initialize client with API key (from env or secrets).
  2. Run extraction on a contract PDF with your schema ID; get run ID or wait for result.
  3. Optionally trigger a workflow that ingests the results into the Vector Catalog, or ingest directly through the API or SDK if an ingest method is available.
  4. Search the collection with the user’s question; pass top results to your LLM and return a grounded answer with source lineage.

See SDK quickstart for a minimal runnable example.
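
As a rough illustration of how the four steps fit together, the sketch below runs them against a stand-in client so it works without network access or an installed SDK. The method names `extract.run` and `search.query` follow the pattern mentioned on this page; `workflows.trigger`, every parameter name, and the shape of the returned dictionaries are assumptions, and the stub's data is made up for the demo.

```python
from types import SimpleNamespace


def build_grounded_prompt(question: str, hits: list) -> str:
    """Number each hit so the LLM can cite its source lineage."""
    context = "\n".join(
        f"[{i}] {hit['text']} (source: {hit['source']})"
        for i, hit in enumerate(hits, start=1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


def run_pipeline(client, pdf_path, schema_id, workflow_id, collection, question, llm) -> dict:
    # Step 2: extract from the contract with a schema ID; get a run ID back.
    run = client.extract.run(file=pdf_path, schema_id=schema_id)
    # Step 3 (optional): trigger a workflow that ingests into the Vector Catalog.
    client.workflows.trigger(workflow_id, params={"run_id": run["run_id"]})
    # Step 4: search, then hand the top results to the LLM.
    hits = client.search.query(collection=collection, query=question, top_k=3)
    answer = llm(build_grounded_prompt(question, hits))
    return {"answer": answer, "sources": [hit["source"] for hit in hits]}


def make_stub_client():
    """Stand-in for step 1; a real client would be built from the API key."""
    hits = [{"text": "Either party may terminate with 30 days notice.",
             "source": "contract.pdf#page=4"}]
    return SimpleNamespace(
        extract=SimpleNamespace(run=lambda file, schema_id: {"run_id": "run_demo"}),
        workflows=SimpleNamespace(trigger=lambda workflow_id, params=None: {"status": "queued"}),
        search=SimpleNamespace(query=lambda collection, query, top_k=3: hits),
    )


result = run_pipeline(
    make_stub_client(), "contract.pdf", "schema_demo", "wf_demo", "contracts",
    "What is the notice period?", llm=lambda prompt: "30 days [1]",
)
```

Passing the client and the LLM in as arguments keeps the pipeline testable: the same function can run against the stub here and against a real client in production.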

Next steps