Sovereign AI

The Anatomy of a Private GPT: Architecting for SOC2 in Banking

Why public chatbots fail audits. A deep dive into the AWS Bedrock + VPC Endpoint + Private Subnet topology that passes banking compliance.

#PrivateGPT #Architecture #SOC2 #Bedrock

Public chatbot APIs are a non-starter in regulated banking. The prompts your relationship managers type can contain material non-public information, and the model's responses can be read as bank-attributable advice. Sending either over the open internet to a multi-tenant inference endpoint is a SOC2 finding waiting to happen.

This post walks through the reference topology I deploy for Tier-1 banks that want the productivity gains of a GPT-style assistant without the audit failures.

The threat model auditors actually care about

Three questions every auditor asks:

  1. Where does the prompt go? If the answer involves "the public internet" or "a vendor's shared inference cluster", you're done.
  2. Where do the embeddings live? Vector databases are PII goldmines.
  3. Can you produce the access log for any inference? "We have CloudWatch on" is not an answer.

The topology

┌──────────────────────────────────────────────────┐
│ Private Subnet (no IGW, no NAT)                  │
│                                                  │
│   App ──► VPC Interface Endpoint ──► Bedrock     │
│          (com.amazonaws.<region>.bedrock-runtime)│
│                                                  │
│   App ──► VPC Gateway Endpoint    ──► S3 (KMS)   │
│   App ──► VPC Interface Endpoint  ──► KMS        │
└──────────────────────────────────────────────────┘

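Traffic to Bedrock stays on AWS's private network via PrivateLink. With no internet gateway and no NAT gateway, a compromised container has no routed path to the internet. That alone does not stop DNS exfiltration: queries sent to the VPC's Route 53 Resolver can still be resolved against attacker-controlled domains, so pair the subnet with Route 53 Resolver DNS Firewall rules that allow only the domains the application needs.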
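
One way to back the "no IGW, no NAT" claim with evidence is to scan the subnet's route tables for internet-bound targets. The sketch below is a minimal illustration, assuming input shaped like the `RouteTables` list returned by the EC2 DescribeRouteTables API. The function name and the sample IDs are hypothetical, and the check covers routes only, not DNS.

```python
from typing import Any


def find_internet_routes(route_tables: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (route_table_id, target_id) pairs for routes that leave the VPC.

    Assumes each entry looks like an item of "RouteTables" in a
    DescribeRouteTables response.
    """
    offenders: list[tuple[str, str]] = []
    for table in route_tables:
        table_id = table.get("RouteTableId", "unknown")
        for route in table.get("Routes", []):
            gateway = route.get("GatewayId", "")
            # "local" and "vpce-..." (gateway endpoints such as S3) are
            # expected in a private subnet; "igw-..." is not.
            if gateway.startswith("igw-"):
                offenders.append((table_id, gateway))
            # NAT and egress-only gateways are reported under their own keys.
            for key in ("NatGatewayId", "EgressOnlyInternetGatewayId"):
                if route.get(key):
                    offenders.append((table_id, route[key]))
    return offenders


# Hypothetical sample data: one clean table, one with a NAT route.
sample_tables = [
    {"RouteTableId": "rtb-private", "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationPrefixListId": "pl-example", "GatewayId": "vpce-s3example"},
    ]},
    {"RouteTableId": "rtb-leaky", "Routes": [
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-example"},
    ]},
]

print(find_internet_routes(sample_tables))
```

An empty result for the subnet's route tables is the kind of artifact that can be attached to an audit ticket, rather than a screenshot of the console.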

Guardrails are not optional

Bedrock Guardrails do three things every bank needs:

  • PII redaction before the prompt reaches the model.
  • Denied topics, to refuse prompts that ask for legal or medical advice.
  • Output filtering, such as regex filters that mask anything shaped like an account number, whether real or hallucinated.
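
The three controls above map roughly onto a single guardrail definition. The sketch below builds one as a plain dictionary, using the field names I understand the Bedrock `CreateGuardrail` API to accept. The guardrail name, topic wording, messages, and the account-number regex are all hypothetical placeholders, and the regex in particular would need to match the bank's real account formats.

```python
import re

# Hypothetical pattern: 8 to 12 consecutive digits. Replace with the
# bank's actual account-number format before relying on it.
ACCOUNT_NUMBER_PATTERN = r"\b\d{8,12}\b"


def build_guardrail_config(name: str) -> dict:
    """Build keyword arguments for a guardrail with PII, topic, and regex rules."""
    return {
        "name": name,
        "blockedInputMessaging": "This request cannot be processed.",
        "blockedOutputsMessaging": "The response was withheld by policy.",
        "sensitiveInformationPolicyConfig": {
            # Built-in PII entity types, masked or blocked outright.
            "piiEntitiesConfig": [
                {"type": "EMAIL", "action": "ANONYMIZE"},
                {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
            ],
            # Custom regex for values shaped like account numbers.
            "regexesConfig": [
                {
                    "name": "account-number",
                    "description": "Digits shaped like an account number",
                    "pattern": ACCOUNT_NUMBER_PATTERN,
                    "action": "ANONYMIZE",
                },
            ],
        },
        "topicPolicyConfig": {
            "topicsConfig": [
                {
                    "name": "legal-advice",
                    "definition": "Requests for legal opinions or advice.",
                    "type": "DENY",
                },
                {
                    "name": "medical-advice",
                    "definition": "Requests for medical diagnosis or treatment advice.",
                    "type": "DENY",
                },
            ],
        },
    }


config = build_guardrail_config("rm-assistant-guardrail")

# Assumed usage, not executed here:
#   boto3.client("bedrock").create_guardrail(**config)
print(sorted(config))
```

Keeping the definition in code means the guardrail can be reviewed and versioned like any other change, which is usually easier to show an auditor than console settings.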

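Pair them with CloudTrail records of Bedrock InvokeModel calls, which show who invoked which model and when, and with Bedrock model invocation logging, which captures the prompt and response bodies, and you have the audit log your CISO will sign off on.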
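
To make the CloudTrail side concrete, here is a minimal sketch that reduces one CloudTrail record to the fields an auditor tends to ask for. It assumes the standard CloudTrail record layout (`eventSource`, `eventName`, `userIdentity.arn`, `requestParameters.modelId`, `requestID`). It also assumes the application assumes its role with the session name set to the authenticated user, which is a design choice and not something AWS does for you. The sample record, role name, and model ID are hypothetical.

```python
from typing import Any, Optional

INVOKE_EVENTS = {"InvokeModel", "InvokeModelWithResponseStream"}


def summarize_invocation(record: dict[str, Any]) -> Optional[dict[str, str]]:
    """Return who/what/when for a Bedrock invocation record, else None."""
    if record.get("eventSource") != "bedrock.amazonaws.com":
        return None
    if record.get("eventName") not in INVOKE_EVENTS:
        return None

    arn = record.get("userIdentity", {}).get("arn", "")
    role, session = "", ""
    # Assumed-role ARNs end in "assumed-role/<role-name>/<session-name>".
    if ":assumed-role/" in arn:
        role, _, session = arn.split(":assumed-role/", 1)[1].partition("/")

    return {
        "when": record.get("eventTime", ""),
        "request_id": record.get("requestID", ""),
        "model_id": (record.get("requestParameters") or {}).get("modelId", ""),
        "role": role,
        "session": session,
    }


# Hypothetical record, trimmed to the fields used above.
sample_record = {
    "eventSource": "bedrock.amazonaws.com",
    "eventName": "InvokeModel",
    "eventTime": "2024-01-01T09:00:00Z",
    "requestID": "req-example",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::123456789012:assumed-role/rm-assistant-task/jane.doe",
    },
    "requestParameters": {"modelId": "example.model-v1"},
}

print(summarize_invocation(sample_record))
```

This covers only the who, when, and which-model part of the trail; the prompt text itself has to come from a separate content log joined on the request ID.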

What this replaces

  • Public ChatGPT Enterprise: fails residency, fails the "prove the prompt log" test.
  • Self-hosted on EC2 with public IP: solves residency, fails on patching cadence.
  • Fargate + VPC endpoints to Bedrock: solves both, and the auditor can see the IAM role that invoked each call.
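
For the third option, the Bedrock interface endpoint can also carry its own policy, so that only the Fargate task role can invoke only the approved models through it. The sketch below builds such a policy document; the role ARN, region, and model ID are hypothetical, and the action names are the Bedrock IAM actions as I understand them.

```python
import json


def build_endpoint_policy(task_role_arn: str, model_arns: list[str]) -> dict:
    """Build a VPC endpoint policy allowing one role to invoke listed models."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowInvokeFromTaskRoleOnly",
                "Effect": "Allow",
                "Principal": {"AWS": task_role_arn},
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": model_arns,
            }
        ],
    }


# Hypothetical identifiers for illustration only.
policy = build_endpoint_policy(
    task_role_arn="arn:aws:iam::123456789012:role/rm-assistant-task",
    model_arns=["arn:aws:bedrock:eu-west-1::foundation-model/example.model-v1"],
)

# The JSON string is what would be attached to the interface endpoint.
print(json.dumps(policy, indent=2))
```

Anything not explicitly allowed by an endpoint policy is denied at the endpoint, so a stolen credential for a different role gets nowhere through this path even if its IAM permissions are too broad.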

That last property is the entire point of Sovereign AI: who invoked which model with which prompt, attributed to a named human via Cognito.
