Application architecture - BonData Documentation

BonData is a master data management (MDM) and AI automation platform. Users connect their business systems as integrations, let AI match and unify records across them with bonds, build agents that automate data work across those systems, and either run agents on demand or schedule them. An agent is a graph of nodes (data fetch, filter, transform, AI enrichment, code execution, action) that the platform executes in order, reading from the connected integrations and writing the result back as a notification, an export, or an update to a record in one of the source systems. The platform that runs all of this is the same in every deployment: multi-tenant Cloud SaaS, single-tenant Dedicated Cloud in a BonData-operated account, or single-tenant Cloud-Prem inside a customer’s AWS account. This page describes the application itself: its user-facing surfaces, the services that execute work, and the data layer underneath. For how the platform is arranged in each deployment, see Deployment scenarios.

Workflow overview

A user signs in through one of the access-layer surfaces, typically the webapp, where they build and run agents, browse the connected data warehouse, and chat with the AI assistant. The webapp talks to the user-facing API. The API authenticates the request, resolves the caller to a tenant, and either serves a synchronous response from the operational database or enqueues an asynchronous job on the message broker. When an agent runs, a workflow runner picks up the job and walks the agent’s node graph in topological order. Data nodes pull records from connected integrations; transform and filter nodes run in-memory; enrichment nodes call AI providers or run generated code in a sandboxed microVM; action nodes write back to integrations or deliver notifications. Each step is logged with a tenant identifier and a request ID so the entire run can be traced end-to-end. The same workflow engine powers the chat agent, which uses the MCP server to expose BonData’s tool surface to an LLM, so a user can ask the assistant to run an agent, query the data warehouse, or take an action, and the same authorization and tenancy rules apply.
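The topological walk described above can be illustrated with a minimal sketch. It is not BonData's workflow runner: the graph, node names, handler functions, and log fields below are all hypothetical, and the only library used is Python's standard `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical agent: node id -> set of upstream node ids it depends on.
AGENT_GRAPH = {
    "fetch_accounts": set(),
    "filter_active": {"fetch_accounts"},
    "enrich_with_ai": {"filter_active"},
    "notify_owner": {"enrich_with_ai"},
}

# Hypothetical handlers: each receives the outputs of its upstream nodes.
HANDLERS = {
    "fetch_accounts": lambda up: [{"id": 1, "active": True}, {"id": 2, "active": False}],
    "filter_active": lambda up: [r for r in up["fetch_accounts"] if r["active"]],
    "enrich_with_ai": lambda up: [{**r, "segment": "placeholder"} for r in up["filter_active"]],
    "notify_owner": lambda up: f"{len(up['enrich_with_ai'])} record(s) processed",
}


def run_agent(graph, handlers, tenant_id, request_id):
    """Run every node after all of its upstream nodes; return outputs and a step log."""
    outputs = {}
    step_log = []
    # static_order() yields each node only after all of its dependencies.
    for node_id in TopologicalSorter(graph).static_order():
        upstream = {dep: outputs[dep] for dep in graph.get(node_id, ())}
        outputs[node_id] = handlers[node_id](upstream)
        # Each step is recorded with the tenant and request ID for tracing.
        step_log.append({"tenant_id": tenant_id, "request_id": request_id, "node": node_id})
    return outputs, step_log


outputs, step_log = run_agent(AGENT_GRAPH, HANDLERS, "tenant-a", "req-123")
print([step["node"] for step in step_log])
```

Because every step carries the same `request_id`, the log entries for one run can be grouped afterwards, which is the property the tracing description relies on.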

Chat agent guardrails

The chat agent inherits the calling user’s identity end-to-end: every tool invocation runs under the user’s tenant, role, and integration scopes, and is subject to the same authorization checks as a direct API call. The agent cannot access data or invoke integrations the user is not already entitled to. Sensitive actions surface a confirmation step in the chat UI before they execute. All agent activity is logged with the same request ID, tenant, and user attribution as the rest of the platform (see Audit logging).
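As a rough illustration of these guardrails, the sketch below checks a tool call against the calling user's tenant and integration scopes and asks for confirmation on sensitive actions. The `Caller` and `ToolCall` types, the `authorize` function, and the three decision strings are assumptions made for this example, not BonData's API; role checks are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    """Identity the chat agent inherits from the signed-in user (hypothetical shape)."""
    user_id: str
    tenant_id: str
    integration_scopes: frozenset


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation requested by the LLM (hypothetical shape)."""
    name: str
    tenant_id: str
    integration: str
    sensitive: bool = False


def authorize(caller, call, confirmed=False):
    """Return 'allow', 'deny', or 'confirm' for a tool call made on behalf of caller."""
    # The call must stay inside the caller's tenant.
    if call.tenant_id != caller.tenant_id:
        return "deny"
    # The caller must already be entitled to the integration.
    if call.integration not in caller.integration_scopes:
        return "deny"
    # Sensitive actions wait for an explicit confirmation in the chat UI.
    if call.sensitive and not confirmed:
        return "confirm"
    return "allow"


caller = Caller("user-1", "tenant-a", frozenset({"crm"}))
print(authorize(caller, ToolCall("update_record", "tenant-a", "crm", sensitive=True)))
```

The point of the design is that the agent gets no identity of its own: the same decision function would apply to a direct API call from the same user.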

Core components

BonData is organized into three layers: an access layer, an application layer, and a data and messaging layer.

Access layer

External traffic reaches BonData through Cloudflare, which provides DNS, TLS termination, web-application firewall rules, and DDoS protection at the edge. From Cloudflare, traffic is forwarded to an AWS Application Load Balancer, which terminates a second TLS hop using a certificate from AWS Certificate Manager and routes to services running in the cluster. For deployments that cannot accept inbound traffic at all, the Cloudflare Tunnel controller dials outbound from the cluster and Cloudflare brings traffic over the tunnel.

Application layer

The application layer runs on Amazon EKS with workloads in private subnets across three availability zones. It is composed of:

API services. A user-facing FastAPI service, a separate management API, and an MCP server that exposes BonData’s tool surface to AI clients over the Model Context Protocol with Server-Sent Events for streaming.
Web applications. A user-facing React webapp built with Vite, an administrative webapp, and a chat-agent service that uses the Anthropic SDK and acts as an MCP client.
Workflow engine. Workflows are graphs of nodes (data fetch, filter, transform, enrichment, action). When a workflow runs, the API enqueues a job on the broker and a workflow runner picks it up.
Asynchronous runners. Two kinds of pod do most of the heavy work. Queue consumers subscribe to the broker and execute workflow jobs, integration syncs, and data refinement. Scheduled runners run on intervals, refreshing integration credentials, monitoring integration health, delivering notifications, aggregating data quality, and reconciling state.
Sandboxed code execution. When a workflow needs to run generated code, the code-execution node sends the generated code and the input variables the workflow node passes (not the full workflow state) to e2b Code Interpreter, where it executes in an ephemeral Firecracker microVM hosted in the United States. Each sandbox is kernel-isolated and destroyed at the end of the execution. Network access from inside the sandbox depends on the workflow’s configuration. e2b maintains its own compliance posture at trust.e2b.dev, which customers can review as part of their vendor due diligence.
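The input-scoping rule for sandboxed code execution (only the variables a node passes, not the full workflow state) can be sketched as follows. The function name, request shape, and state keys are hypothetical, and the sketch deliberately stops before any call to the sandbox provider.

```python
def build_sandbox_request(generated_code, workflow_state, input_names):
    """Build the payload for the sandbox from only the declared inputs.

    Everything else in workflow_state stays behind; a missing input is an error.
    """
    missing = [name for name in input_names if name not in workflow_state]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    return {
        "code": generated_code,
        "inputs": {name: workflow_state[name] for name in input_names},
    }


# Hypothetical workflow state: only 'rows' and 'threshold' are passed on.
state = {"rows": [1, 2, 3], "threshold": 2, "internal_notes": "not sent"}
request = build_sandbox_request(
    "result = [r for r in rows if r > threshold]",
    state,
    ["rows", "threshold"],
)
print(sorted(request["inputs"]))
```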

Authentication, authorization, and tenant resolution happen at the API tier on every request; see Authentication and Access control.

Data and messaging layer

State is held in a small set of managed AWS services and one in-cluster cache:

Layer	Service and contents
Operational database	Amazon RDS for PostgreSQL with the `pgvector` extension. Holds users, tenants, agent definitions, integration metadata, audit records, and vector embeddings.
Data lake	Apache Iceberg tables on Amazon S3, queried with AWS Athena, catalogued with AWS Glue. Holds raw and refined integration data and large analytic datasets.
Analytics	Amazon Redshift Serverless. Loaded by the data pipeline; not on the user request path.
Cache	Clustered in-cluster Redis. Deduplication, session caching, runtime caches.
Object storage (internal)	Private Amazon S3 bucket for internal application artifacts. SSE-S3 at rest, versioning enabled.
Object storage (public)	Amazon S3 bucket fronted by Amazon CloudFront for user-published outputs. Only artifacts a user has explicitly published are exposed, and only via signed CloudFront URLs.
Messaging	Amazon MQ for RabbitMQ in a multi-AZ clustered configuration in production. AMQPS only, port 5671.

Job payloads on the broker are typically small references (record IDs, S3 keys) rather than the records themselves; the records are read on demand from RDS or S3.
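A reference-style job payload might look like the sketch below. The `WorkflowJob` fields and the JSON encoding are assumptions for illustration; the source states only that payloads are typically small references such as record IDs and S3 keys.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WorkflowJob:
    """Hypothetical broker message: references only, never the records themselves."""
    tenant_id: str
    request_id: str
    agent_id: str
    record_ids: tuple = ()
    s3_keys: tuple = ()


def encode_job(job):
    """Serialize the job to compact JSON bytes for the message body."""
    return json.dumps(asdict(job), separators=(",", ":")).encode("utf-8")


def decode_job(body):
    """Rebuild the job on the consumer side; JSON lists become tuples again."""
    data = json.loads(body)
    data["record_ids"] = tuple(data["record_ids"])
    data["s3_keys"] = tuple(data["s3_keys"])
    return WorkflowJob(**data)


job = WorkflowJob(
    tenant_id="tenant-a",
    request_id="req-123",
    agent_id="agent-42",
    record_ids=("rec-1", "rec-2"),
    s3_keys=("refined/tenant-a/batch-0001.parquet",),
)
body = encode_job(job)
print(len(body), decode_job(body) == job)
```

A consumer would then use the decoded IDs and keys to read the actual records from RDS or S3, so message size stays independent of record size.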

AI providers

The default LLM is Anthropic Claude. Workflows can be configured to use Google Gemini or OpenAI models. All model calls are HTTPS egress; no model-provider component is deployed in the cluster. Anthropic, Google, and OpenAI API usage is governed by each provider’s commercial terms, which prohibit using customer inputs and outputs for model training.

Embeddings

Two embedding options are supported:

OpenAI text-embedding-3-small (default). Used in Cloud SaaS and as the default in Cloud-Prem. HTTPS egress to OpenAI; not used for training under OpenAI’s commercial API terms.
AWS Bedrock Titan Embeddings (enterprise packages only). Runs inside the customer’s AWS region in their Cloud-Prem deployment, so the embedding call never leaves the customer’s AWS account. Available on enterprise-tier engagements; selected at deployment time.

Identity

Identity is provided by Descope, with separate Descope projects for the user-facing application and the management application. SSO over SAML or OIDC, MFA, and session controls are configured in Descope. Applications validate Descope-issued JWTs locally on every request. See Authentication.

Secrets

All credentials and application-layer encryption keys are stored in AWS Secrets Manager and synced into the cluster by the External Secrets Operator. No long-lived AWS keys live in the cluster, in CI, or in the codebase. See Secrets management.

Observability

Signal	Tooling
Errors	Sentry
Logs	New Relic, shipped via Fluent Bit
Metrics	New Relic Kubernetes exporter and in-cluster Prometheus (`kube-prometheus-stack`)
Audit and request tracing	Structured JSON logs across all services, with a request ID that ties every downstream log line to the originating request

See Audit logging for what is captured and where it lives.
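One common way to attach a request ID to every log line is a context variable read by a JSON formatter. The sketch below uses only the Python standard library; the field names, logger name, and context-variable names are assumptions, not BonData's log schema.

```python
import contextvars
import io
import json
import logging

# Hypothetical per-request context, set once when the request or job starts.
request_id_var = contextvars.ContextVar("request_id", default="-")
tenant_id_var = contextvars.ContextVar("tenant_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the current request context."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
            "tenant_id": tenant_id_var.get(),
        })


def make_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


stream = io.StringIO()
log = make_logger("workflow-runner", stream)
request_id_var.set("req-123")
tenant_id_var.set("tenant-a")
log.info("node finished: %s", "filter_active")
print(stream.getvalue().strip())
```

Code that logs does not need to pass the request ID around explicitly; any log call made while the context variables are set picks them up.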

Ports and protocols

Service	Port	Protocol	Direction	Notes
Cloudflare → ALB	443	TCP / TLS 1.2+	Inbound	TLS re-terminated at the ALB.
ALB → API services	8000	HTTP	In-cluster	Internal, behind the ALB.
ALB → MCP server	8081	HTTP + SSE	In-cluster	Server-Sent Events for streaming tool calls.
API / runners → RDS PostgreSQL	5432	TCP	In-cluster	TLS in transit.
API / runners → Amazon MQ	5671	AMQPS	In-cluster	TLS in transit.
API / runners → Redis	6379	TCP	In-cluster
API / runners → AI providers	443	HTTPS	Outbound	Anthropic, OpenAI, Google, e2b.
API / runners → integration endpoints	443	HTTPS	Outbound	Per-integration SaaS or warehouse APIs.
External Secrets Operator → AWS Secrets Manager	443	HTTPS	Outbound	IRSA-bound, scoped to the BonData secret prefix.

High availability

Workloads are distributed across three availability zones. Amazon EKS manages the Kubernetes control plane with multi-AZ redundancy under AWS’s EKS service-level agreement. A NAT gateway is provisioned in each AZ so an AZ failure does not sever cluster egress. The API tier runs with multiple replicas across AZs behind a cross-zone-balanced Application Load Balancer; unhealthy targets are removed automatically via ALB health checks. Amazon RDS runs Multi-AZ with synchronous standby replication and automated failover. Amazon MQ (RabbitMQ) runs in a clustered multi-AZ configuration with mirrored durable queues so in-flight messages survive a broker failover.
