If your RDS database is in a private VPC with no public access, there are several ways to securely connect it to BonData. The right approach depends on your security requirements, data volume, and infrastructure.
Using a different cloud? See GCP Cloud SQL or Azure SQL.

  • S3 + Lambda: self-service setup with Terraform
  • Tunnel Agent: lightweight agent in your VPC
  • AWS PrivateLink: private endpoint, no public internet
  • VPC Peering: direct network link between VPCs
  • Site-to-Site VPN: encrypted tunnel over the internet
  • Direct Connect: dedicated physical connection
Not sure which option is right for you? The S3 + Lambda approach works for most teams and you can set it up entirely on your own. For all other options, reach out to our team — we’ll help you evaluate your setup and find the best path forward.

Option 1: Export to S3 via Lambda

Recommended · Self-service
A Lambda function runs inside your VPC on a schedule, queries RDS, converts results to Parquet, and writes them to S3. BonData reads from S3 via its native S3 integration.
RDS (private) ──▶ Lambda (your VPC) ──▶ S3 bucket ──▶ BonData
                        ▲
                 EventBridge (schedule)
Why this approach works best for most teams:
  • No firewall changes — Lambda runs inside your VPC
  • Zero DB performance impact — queries run on your schedule
  • Database credentials never leave your AWS account
  • Fully self-service — no coordination with BonData needed

Deploy with Terraform

Create a bondata-rds-export.tf file and fill in the variables at the top. This provisions the S3 bucket, Lambda function, IAM role, EventBridge schedule, and networking in one apply.
Store db_password in Terraform Cloud or pass it via TF_VAR_db_password to avoid committing secrets.
# ──────────────────────────────────────────────
# Variables — fill these in
# ──────────────────────────────────────────────
variable "aws_region"   { default = "us-east-1" }
variable "vpc_id"       { description = "VPC where RDS lives" }
variable "subnet_ids" {
  description = "Private subnets that can reach RDS"
  type        = list(string)
}
variable "rds_sg_id"    { description = "Security group of your RDS instance" }
variable "db_host"      { description = "RDS endpoint" }
variable "db_port"      { default = "5432" }
variable "db_name"      { description = "Database name" }
variable "db_user"      { description = "Database user" }
variable "db_password"  { sensitive = true }
variable "tables" {
  description = "Comma-separated tables"
  default     = "public.users,public.orders"
}
variable "schedule" {
  description = "EventBridge schedule expression"
  default     = "rate(1 hour)"
}
variable "bucket_name"  { default = "bondata-rds-exports" } # S3 bucket names are globally unique; choose your own

provider "aws" { region = var.aws_region }

# ──────────────────────────────────────────────
# S3 bucket
# ──────────────────────────────────────────────
resource "aws_s3_bucket" "export" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_public_access_block" "export" {
  bucket                  = aws_s3_bucket.export.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# ──────────────────────────────────────────────
# VPC endpoint for S3 (so Lambda can reach S3)
# ──────────────────────────────────────────────
data "aws_route_tables" "private" {
  vpc_id = var.vpc_id
}

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = var.vpc_id
  service_name = "com.amazonaws.${var.aws_region}.s3"
  route_table_ids = data.aws_route_tables.private.ids
}

# ──────────────────────────────────────────────
# Security group — allows Lambda to reach RDS
# ──────────────────────────────────────────────
resource "aws_security_group" "lambda" {
  name_prefix = "bondata-export-lambda-"
  vpc_id      = var.vpc_id
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_vpc_security_group_ingress_rule" "rds_from_lambda" {
  security_group_id            = var.rds_sg_id
  referenced_security_group_id = aws_security_group.lambda.id
  from_port                    = var.db_port
  to_port                      = var.db_port
  ip_protocol                  = "tcp"
}

# ──────────────────────────────────────────────
# IAM role for Lambda
# ──────────────────────────────────────────────
resource "aws_iam_role" "lambda" {
  name_prefix = "bondata-export-"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{ Effect = "Allow", Principal = { Service = "lambda.amazonaws.com" }, Action = "sts:AssumeRole" }]
  })
}

resource "aws_iam_role_policy" "lambda" {
  role = aws_iam_role.lambda.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      { Effect = "Allow", Action = ["s3:PutObject"], Resource = "${aws_s3_bucket.export.arn}/*" },
      { Effect = "Allow", Action = ["s3:ListBucket"], Resource = aws_s3_bucket.export.arn },
      { Effect = "Allow", Action = ["logs:CreateLogGroup","logs:CreateLogStream","logs:PutLogEvents"], Resource = "arn:aws:logs:*:*:*" },
      { Effect = "Allow", Action = ["ec2:CreateNetworkInterface","ec2:DescribeNetworkInterfaces","ec2:DeleteNetworkInterface"], Resource = "*" }
    ]
  })
}

# ──────────────────────────────────────────────
# Lambda function
# ──────────────────────────────────────────────
data "archive_file" "lambda" {
  type        = "zip"
  output_path = "${path.module}/lambda.zip"
  source {
    content  = <<-PYTHON
import os, io, json, logging
from datetime import datetime, timezone
import boto3, psycopg2, pyarrow as pa, pyarrow.parquet as pq

logger = logging.getLogger()
logger.setLevel(logging.INFO)

DB = dict(host=os.environ["DB_HOST"], port=int(os.environ.get("DB_PORT","5432")),
          dbname=os.environ["DB_NAME"], user=os.environ["DB_USER"], password=os.environ["DB_PASSWORD"])
BUCKET  = os.environ["S3_BUCKET"]
PREFIX  = os.environ.get("S3_PREFIX", "rds-exports")
TABLES  = [t.strip() for t in os.environ["TABLES"].split(",")]
CHUNK   = int(os.environ.get("CHUNK_SIZE", "50000"))
s3 = boto3.client("s3")

def export_table(cur, table, ts):
    safe = table.replace('"','').replace('.','__')
    cur.execute(f"SELECT * FROM {table} LIMIT 0")
    cols = [d[0] for d in cur.description]
    cur.execute(f"DECLARE _c CURSOR FOR SELECT * FROM {table}")
    part, total = 0, 0
    while True:
        cur.execute(f"FETCH {CHUNK} FROM _c")
        rows = cur.fetchall()
        if not rows: break
        tbl = pa.table({c: [r[i] for r in rows] for i,c in enumerate(cols)})
        buf = io.BytesIO()
        pq.write_table(tbl, buf); buf.seek(0)
        s3.upload_fileobj(buf, BUCKET, f"{PREFIX}/{safe}/dt={ts}/part-{part:05d}.parquet")
        total += len(rows); part += 1
    cur.execute("CLOSE _c")
    return total

def handler(event, context):
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%SZ")
    conn = psycopg2.connect(**DB)
    try:
        conn.autocommit = False  # hold one transaction open for the server-side cursor
        cur = conn.cursor()
        res = {}
        for t in TABLES:
            try:
                res[t] = export_table(cur, t, ts)
            except Exception as e:
                logger.error(f"{t}: {e}")
                res[t] = str(e)
                conn.rollback()  # clear the aborted transaction so later tables can export
        conn.commit()
    finally:
        conn.close()
    logger.info(json.dumps(res))
    return {"statusCode": 200, "results": res}
    PYTHON
    filename = "index.py"
  }
}

resource "aws_lambda_function" "export" {
  function_name = "bondata-rds-export"
  role          = aws_iam_role.lambda.arn
  handler       = "index.handler"
  runtime       = "python3.12"
  timeout       = 300
  memory_size   = 512
  filename      = data.archive_file.lambda.output_path
  source_code_hash = data.archive_file.lambda.output_base64sha256

  vpc_config {
    subnet_ids         = var.subnet_ids
    security_group_ids = [aws_security_group.lambda.id]
  }

  environment {
    variables = {
      DB_HOST     = var.db_host
      DB_PORT     = var.db_port
      DB_NAME     = var.db_name
      DB_USER     = var.db_user
      DB_PASSWORD = var.db_password
      S3_BUCKET   = aws_s3_bucket.export.id
      TABLES      = var.tables
    }
  }

  layers = [] # Add your psycopg2 + pyarrow layer ARN here — see note below
}

# ──────────────────────────────────────────────
# EventBridge schedule
# ──────────────────────────────────────────────
resource "aws_cloudwatch_event_rule" "schedule" {
  name                = "bondata-rds-export"
  schedule_expression = var.schedule
}

resource "aws_cloudwatch_event_target" "lambda" {
  rule = aws_cloudwatch_event_rule.schedule.name
  arn  = aws_lambda_function.export.arn
}

resource "aws_lambda_permission" "eventbridge" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.export.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.schedule.arn
}

# ──────────────────────────────────────────────
# Outputs
# ──────────────────────────────────────────────
output "bucket"      { value = aws_s3_bucket.export.id }
output "lambda_name" { value = aws_lambda_function.export.function_name }
Lambda layer: The function requires psycopg2 and pyarrow. Build a layer or use a public one:
docker run --rm --platform linux/amd64 -v $(pwd):/out python:3.12 bash -c \
  "pip install psycopg2-binary==2.9.9 pyarrow==15.0.0 -t /out/python && cd /out && zip -r layer.zip python"

aws lambda publish-layer-version \
  --layer-name bondata-rds-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes python3.12
Then add the layer ARN to the layers list in the Terraform file.
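For example (the account ID and layer version below are placeholders; use the LayerVersionArn returned by publish-layer-version):

```terraform
# Placeholder ARN: substitute the LayerVersionArn from the publish-layer-version output.
layers = ["arn:aws:lambda:us-east-1:123456789012:layer:bondata-rds-deps:1"]
```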

Deploy

terraform init
terraform apply -var="vpc_id=vpc-XXX" \
  -var='subnet_ids=["subnet-AAA","subnet-BBB"]' \
  -var="rds_sg_id=sg-XXX" \
  -var="db_host=mydb.abc123.us-east-1.rds.amazonaws.com" \
  -var="db_name=production" \
  -var="db_user=bondata_user" \
  -var="db_password=CHANGEME" \
  -var="tables=public.users,public.orders"

Connect S3 to BonData

Once data is flowing, connect BonData to the bucket using the S3 integration:
  1. In BonData, go to Integrations → Add Integration → Amazon S3
  2. Enter your bucket name and the prefix (default: rds-exports)
  3. Provide IAM credentials with read-only access to the bucket
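Once the Lambda has run, objects land under one folder per table. A quick sketch of the key layout, mirroring the naming logic in export_table() above (values are illustrative):

```python
# Mirrors the Lambda's naming: quotes are stripped, dots in table names
# become "__", and each run writes date-partitioned Parquet parts.
def export_key(prefix: str, table: str, ts: str, part: int) -> str:
    safe = table.replace('"', '').replace('.', '__')
    return f"{prefix}/{safe}/dt={ts}/part-{part:05d}.parquet"

print(export_key("rds-exports", "public.users", "2024-06-01T120000Z", 0))
# → rds-exports/public__users/dt=2024-06-01T120000Z/part-00000.parquet
```

Point the BonData S3 integration at the prefix (rds-exports by default) and it will pick up each table's dt= partitions.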

Option 2: BonData Tunnel Agent

A lightweight Docker container that runs inside your VPC and creates a secure outbound tunnel to BonData. Once running, BonData can query your database directly through the encrypted connection — no inbound firewall rules, no VPN, no public exposure.
┌─────────────────────────────────────────┐
│              Your VPC                   │
│                                         │
│  ┌─────────┐       ┌────────────────┐  │
│  │   RDS   │◀──────│ BonData Tunnel │──┼──▶ BonData Cloud (port 443 outbound)
│  │(private)│       │    Agent       │  │
│  └─────────┘       └────────────────┘  │
│                                         │
└─────────────────────────────────────────┘
Best for: Teams that need real-time query access with minimal infrastructure changes. The agent only requires outbound HTTPS (port 443) and can run on any Docker host — EC2, ECS, EKS, or Fargate. Database credentials stay in your environment and all traffic is encrypted end-to-end.
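Deployment is typically a single container. A hypothetical invocation (the actual image name, environment variables, and token format are provided by the BonData team during onboarding):

```shell
# Hypothetical sketch only; real image name and variables come from BonData.
docker run -d --restart unless-stopped \
  -e BONDATA_TUNNEL_TOKEN="<token from BonData>" \
  -e DB_HOST="mydb.abc123.us-east-1.rds.amazonaws.com" \
  -e DB_PORT="5432" \
  bondata/tunnel-agent:latest
```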

Get started with the Tunnel Agent

Contact our team to provision your tunnel token and walk through deployment for your environment.

Option 3: AWS PrivateLink

AWS PrivateLink creates a private endpoint in your VPC that routes traffic to BonData without it ever crossing the public internet. Traffic stays entirely within the AWS network.

Best for: Organizations with strict compliance requirements (HIPAA, SOC 2) that prohibit any data traversal over the public internet, even when encrypted. PrivateLink provides the strongest network-level isolation without the complexity of VPC Peering or VPN.

How it works:
  • BonData exposes a VPC Endpoint Service in its AWS account
  • You create an Interface VPC Endpoint in your VPC pointing to that service
  • Your RDS traffic flows privately through the AWS backbone — no internet gateway, no NAT, no public IPs
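In Terraform, the interface endpoint looks roughly like this (the service name is a placeholder; BonData supplies the real one):

```terraform
# Sketch only: service_name is a placeholder provided by BonData.
resource "aws_vpc_endpoint" "bondata" {
  vpc_id             = var.vpc_id
  vpc_endpoint_type  = "Interface"
  service_name       = "com.amazonaws.vpce.us-east-1.vpce-svc-XXXXXXXXXXXX"
  subnet_ids         = var.subnet_ids
  security_group_ids = [var.rds_sg_id] # or a dedicated SG for the endpoint
}
```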

Set up PrivateLink

Contact our team to get BonData’s endpoint service name and configure PrivateLink for your account.

Option 4: VPC Peering

VPC Peering creates a direct network route between your VPC and BonData’s VPC, allowing private IP communication as if they were on the same network.

Best for: Teams that want a simple, low-cost network link with low latency. VPC Peering has no per-hour charge (you pay only for data transfer) and supports full-bandwidth communication between VPCs.

How it works:
  • A peering connection is established between your VPC and BonData’s VPC
  • Route tables on both sides are updated to direct traffic through the peering link
  • Your RDS security group is updated to allow inbound connections from BonData’s CIDR range
VPC Peering requires both VPCs to be in the same AWS region or use inter-region peering. CIDR ranges must not overlap.
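A hypothetical requester-side sketch of the peering connection and route (BonData’s VPC ID, account ID, and CIDR are placeholders exchanged during setup):

```terraform
# All peer values below are placeholders; BonData supplies the real ones.
resource "aws_vpc_peering_connection" "bondata" {
  vpc_id        = var.vpc_id
  peer_vpc_id   = "vpc-XXXXXXXX"   # BonData's VPC
  peer_owner_id = "123456789012"   # BonData's AWS account
}

# Route your private subnets to BonData's CIDR through the peering link.
resource "aws_route" "to_bondata" {
  route_table_id            = "rtb-XXXXXXXX"  # your private route table
  destination_cidr_block    = "10.99.0.0/16"  # BonData's CIDR (placeholder)
  vpc_peering_connection_id = aws_vpc_peering_connection.bondata.id
}
```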

Set up VPC Peering

Contact our team to exchange VPC details and coordinate the peering connection.

Option 5: Site-to-Site VPN

An AWS Site-to-Site VPN creates an encrypted IPsec tunnel over the public internet between your network and BonData’s infrastructure.

Best for: Organizations that already have VPN infrastructure or need to connect from on-premises networks (not just AWS). Also useful when VPC Peering isn’t possible due to overlapping CIDR ranges.

How it works:
  • A Virtual Private Gateway is attached to your VPC
  • An IPsec tunnel is established between your gateway and BonData’s endpoint
  • All traffic is encrypted and routed through the tunnel
  • Supports both static and dynamic (BGP) routing
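The first two steps can be sketched in Terraform like this (the gateway IP and ASN are placeholders exchanged with BonData):

```terraform
# Sketch only; the remote gateway IP and ASN come from BonData.
resource "aws_vpn_gateway" "this" {
  vpc_id = var.vpc_id
}

resource "aws_customer_gateway" "bondata" {
  bgp_asn    = 65000            # placeholder ASN
  ip_address = "203.0.113.10"   # BonData's gateway IP (placeholder)
  type       = "ipsec.1"
}

resource "aws_vpn_connection" "bondata" {
  vpn_gateway_id      = aws_vpn_gateway.this.id
  customer_gateway_id = aws_customer_gateway.bondata.id
  type                = "ipsec.1"
  static_routes_only  = true    # set false for dynamic (BGP) routing
}
```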

Set up a VPN connection

Contact our team to exchange gateway details and configure the VPN tunnel.

Option 6: AWS Direct Connect

AWS Direct Connect provides a dedicated physical network connection (1 Gbps or 10 Gbps) between your infrastructure and BonData, bypassing the public internet entirely.

Best for: Enterprise environments with very high data volumes, strict latency requirements, or regulatory mandates for dedicated connectivity. Direct Connect provides the most consistent throughput and lowest latency of any option.

How it works:
  • A physical cross-connect is established at an AWS Direct Connect location
  • A dedicated Virtual Interface (VIF) routes traffic between your network and BonData
  • Traffic never touches the public internet — ideal for large-scale, continuous data sync
Direct Connect typically takes 2-4 weeks to provision and involves coordination between your network team, AWS, and BonData.

Set up Direct Connect

Contact our team to discuss your throughput requirements and coordinate the connection.