Triggers & scheduling
Workflow orchestration in Bundata lets you run pipelines on a schedule, when source events occur (e.g. new files), or on demand. This page covers how to configure triggers and scheduling so extraction and delivery stay up to date without manual runs.
Schedule-based triggers
- Cron or interval — Run a workflow at a fixed time (e.g. nightly at 2 AM) or at an interval (e.g. every 6 hours). Use for batch processing of documents that arrive in a known location (S3 bucket, SharePoint folder).
- Use case — “Process all new contracts uploaded to S3 every night and push smart bites to the Vector Catalog.” Keeps search and agents fresh. See Workflows overview.
Configure the schedule in the Platform workflow editor or via API. Specify timezone and recurrence (daily, weekly, custom cron).
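As a sketch of what a schedule definition might look like when created via API, here is a hypothetical payload for the nightly-contracts use case. The field names (`workflow_id`, `trigger`, `cron`, `timezone`) are illustrative assumptions, not Bundata's actual schema; check the Platform workflow editor or API reference for the real fields.

```python
# Hypothetical schedule config: run a workflow every night at 2 AM Eastern.
# All field names below are illustrative, not Bundata's actual schema.
schedule = {
    "workflow_id": "contracts-nightly",
    "trigger": {
        "type": "schedule",
        "cron": "0 2 * * *",          # standard cron: minute 0, hour 2, every day
        "timezone": "America/New_York",
    },
}
```

The `timezone` field matters for daylight-saving transitions: "2 AM nightly" in a local timezone is not the same as a fixed UTC offset.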
Event-based triggers
- Source events — When a connector supports it, trigger a workflow when new or updated documents appear (e.g. new file in a folder, new message in a channel). Each event can run the workflow for the affected document(s).
- Use case — “As soon as a contract is uploaded to SharePoint, run extraction and update the catalog.” Enables near-real-time search and grounded answers on the latest docs.
Event triggers depend on connector support. See Integrations: Source connectors and your connector docs.
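To illustrate "each event can run the workflow for the affected document(s)", here is a minimal sketch of expanding connector events into per-document run requests, de-duplicating so a create followed by an update of the same document triggers one run. The `SourceEvent` shape and `runs_for_events` helper are assumptions for illustration, not part of the Bundata API.

```python
from dataclasses import dataclass

@dataclass
class SourceEvent:
    """Hypothetical connector event: a document was created or updated."""
    connector: str
    document_id: str
    kind: str  # "created" or "updated"

def runs_for_events(events):
    """Expand source events into one workflow-run request per affected
    document, de-duplicating repeated events for the same document."""
    seen = set()
    runs = []
    for ev in events:
        key = (ev.connector, ev.document_id)
        if key not in seen:
            seen.add(key)
            runs.append({"connector": ev.connector, "document_id": ev.document_id})
    return runs
```

De-duplicating at this layer keeps a burst of edits to one document from queuing redundant runs.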
On-demand runs
- Manual trigger — From the Platform UI or API, start a workflow run for a selected source or document set. Use for one-off backfills, tests, or reprocessing after a schema change.
- API trigger — Your app can call the workflow API to start a run (e.g. after a user uploads a file in your app). See REST API overview.
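As a sketch of an API-triggered run, the snippet below builds (but does not send) an HTTP request to start a workflow run after a user upload. The endpoint path and body fields are assumptions for illustration; see the REST API overview for the actual route, authentication, and request schema.

```python
import json
import urllib.request

def build_run_request(base_url: str, api_key: str, workflow_id: str, document_ids: list):
    """Build a POST request that starts a workflow run for specific documents.
    Endpoint path and body fields are hypothetical, not Bundata's real schema."""
    body = json.dumps({"document_ids": document_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/workflows/{workflow_id}/runs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_run_request("https://api.example.com", "MY_KEY", "contracts-nightly", ["doc-42"])
```

In a real app you would send this with `urllib.request.urlopen(req)` (or any HTTP client) from your upload handler.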
Choosing a trigger type
| Need | Trigger type |
|---|---|
| Batch sync (e.g. nightly) | Schedule |
| Near-real-time updates | Event (if the connector supports it) |
| Backfill or test | On-demand |
| User-driven (e.g. “process this file”) | API / on-demand |
Best practices
- Match schedule to data velocity — Don’t run hourly if documents arrive weekly; over-frequent runs waste capacity and can hit rate limits. See Rate limits.
- Idempotency — Design workflows so re-running for the same document doesn’t duplicate or corrupt data (e.g. upsert by document ID in the Vector Catalog).
- Monitoring — Use Monitoring to alert on failed runs, then fix the trigger or connector behind them.
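The idempotency practice above can be sketched with a simple upsert keyed by document ID. The `upsert_bites` helper and dict-backed catalog are stand-ins for illustration, not the Vector Catalog API: the point is that re-running a workflow for the same document replaces its entries instead of appending duplicates.

```python
def upsert_bites(catalog: dict, document_id: str, bites: list) -> None:
    """Idempotent write: keying by document ID means a re-run for the same
    document replaces its smart bites rather than duplicating them."""
    catalog[document_id] = bites

catalog = {}
upsert_bites(catalog, "doc-1", ["bite-a"])
upsert_bites(catalog, "doc-1", ["bite-a", "bite-b"])  # re-run after reprocessing
```

With append-only writes, the second run would leave three entries for one document; with the upsert, the catalog always reflects the latest extraction.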
Next steps
- Workflows overview — Concepts and DAG.
- Monitoring — Run history and alerts.
- Extraction runs — What runs inside a workflow step.