Source connectors

Source connectors pull documents from your storage and apps into Bundata for context-aware extraction. Bundata supports 35+ source connectors, including cloud storage, collaboration tools, and business applications. Configure them in the Platform or via the API, then attach them to workflows for orchestration and continuous ingestion.

Cloud storage

Connector             Use for
Amazon S3             Buckets and prefixes; IAM or key-based auth. See AWS.
Azure Blob Storage    Containers; key or managed identity. See Azure.
Google Cloud Storage  Buckets; service account or key. See GCP.

Use for contracts, invoices, and policy docs stored in object storage. Trigger workflows on a schedule or (where supported) on new object events.

Collaboration and content

Connector     Use for
SharePoint    Sites, libraries, folders. OAuth or app registration.
Confluence    Spaces and pages. API token or OAuth.
Google Drive  My Drive and shared drives. OAuth.
Box           Folders and files. OAuth.

Use for internal policies, playbooks, and shared documents. Sync on a schedule or via events where supported.

Business apps

Connector   Use for
Salesforce  Attachments, content versions. OAuth.
Zendesk     Tickets, articles. API or OAuth.

Use for support tickets, contract attachments, and knowledge-base content. See Reference: Connectors for the full list and parameters.

Configuration

  • Credentials — Each connector has an auth method (API key, OAuth, IAM). Store credentials securely; use least-privilege access. See Authentication.
  • Scope — Configure which folders, buckets, or objects to read. Limit scope to reduce cost and improve security.
  • File types — Connectors typically respect supported file types. Unsupported files are skipped or reported; check run logs.
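
The three settings above can be sketched as a small, self-contained model. This is an illustration only: the class, field names, and the `partition` helper are hypothetical and are not Bundata's API or configuration schema (see Reference: Connectors for the real parameters). It shows one way to reason about scope and file-type filtering before a run.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical config shape; field names are illustrative, not Bundata's API.
VALID_AUTH_METHODS = {"api_key", "oauth", "iam"}


@dataclass
class SourceConnectorConfig:
    connector: str
    auth_method: str
    scope_prefixes: list
    allowed_extensions: set = field(default_factory=lambda: {".pdf", ".docx"})

    def validate(self):
        if self.auth_method not in VALID_AUTH_METHODS:
            raise ValueError(f"unknown auth method: {self.auth_method}")
        # Reject an empty scope or a bare root, which would read everything.
        if not self.scope_prefixes or any(not p.strip("/") for p in self.scope_prefixes):
            raise ValueError("scope must list at least one non-root prefix")

    def in_scope(self, key):
        return any(key.startswith(p) for p in self.scope_prefixes)

    def is_supported(self, key):
        return PurePosixPath(key).suffix.lower() in self.allowed_extensions


def partition(config, keys):
    """Return (to_ingest, skipped); out-of-scope keys are never read."""
    to_ingest, skipped = [], []
    for key in keys:
        if not config.in_scope(key):
            continue
        (to_ingest if config.is_supported(key) else skipped).append(key)
    return to_ingest, skipped


config = SourceConnectorConfig("s3", "iam", ["contracts/2024/"])
config.validate()
to_ingest, skipped = partition(
    config,
    ["contracts/2024/msa.pdf", "contracts/2024/setup.exe", "hr/handbook.pdf"],
)
print(to_ingest)  # ['contracts/2024/msa.pdf']
print(skipped)    # ['contracts/2024/setup.exe']
```

Rejecting a root-level scope at validation time is a design choice in this sketch: it makes the "limit scope" guidance a hard check rather than a convention.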

Workflows

  • Attach a source connector to a workflow so extraction runs on new or updated documents. See Triggers & scheduling.
  • Ensure source lineage (document ID, path) is preserved so smart bites in the Vector Catalog are traceable. See Vector Catalog: Lineage.
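
As a rough illustration of what preserving lineage means, the sketch below attaches the same source record (connector, document ID, path) to every chunk cut from a document. The `Lineage` class and `chunk_with_lineage` function are hypothetical names, and the fixed-size split is a stand-in for whatever chunking the platform actually performs; see Vector Catalog: Lineage for the real fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lineage:
    # Hypothetical lineage record; real field names may differ.
    connector: str
    document_id: str
    path: str


def chunk_with_lineage(text, lineage, size=200):
    """Split text into fixed-size chunks, each carrying its source lineage."""
    return [
        {"chunk_index": n, "text": text[start:start + size], "lineage": lineage}
        for n, start in enumerate(range(0, len(text), size))
    ]


source = Lineage("sharepoint", "doc-001", "/sites/legal/policies/travel.docx")
chunks = chunk_with_lineage("abcdefghij", source, size=4)

for chunk in chunks:
    print(chunk["chunk_index"], chunk["text"], chunk["lineage"].path)
```

Because the record is frozen and shared by every chunk, any chunk can be traced back to its source document without a separate lookup.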

Common pitfalls

  • Expired or invalid credentials — Connector runs fail; rotate keys and update configuration. Monitor run history. See Monitoring.
  • Too broad scope — Ingesting entire tenants or buckets can be slow and expensive. Start with a folder or prefix and expand as needed.
  • Unsupported file types — Filter at source or handle “skipped” files in run results; see Supported file types.
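
To make these pitfalls easier to spot, run results can be summarized after each sync. The sketch below assumes a hypothetical result format (a list of dicts with `path`, `status`, and an optional `error` code); the actual structure of Bundata run results may differ, so treat the keys as placeholders.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize_run(results):
    """Count outcomes, group skipped files by extension, and flag auth errors."""
    statuses = Counter(r["status"] for r in results)
    skipped_by_extension = Counter(
        PurePosixPath(r["path"]).suffix.lower() or "(none)"
        for r in results
        if r["status"] == "skipped"
    )
    return {
        "processed": statuses.get("processed", 0),
        "skipped": statuses.get("skipped", 0),
        "failed": statuses.get("failed", 0),
        "skipped_by_extension": dict(skipped_by_extension),
        # An auth error usually means credentials expired or were revoked.
        "auth_failure": any(
            r["status"] == "failed" and r.get("error") == "auth" for r in results
        ),
    }


# Hypothetical run results for illustration.
results = [
    {"path": "a/msa.pdf", "status": "processed"},
    {"path": "a/logo.psd", "status": "skipped"},
    {"path": "a/notes", "status": "skipped"},
    {"path": "b/", "status": "failed", "error": "auth"},
]
summary = summarize_run(results)
print(summary)
```

A rising count for one extension suggests filtering that type at the source, while `auth_failure` is a prompt to rotate keys and update the connector configuration.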

Next steps