The Anatomy of a Private GPT: Architecting for SOC2 in Banking
Why public chatbots fail audits. A deep dive into the AWS Bedrock + VPC Endpoint + Private Subnet topology that passes banking compliance.
Public chatbot APIs are a non-starter in regulated banking. The prompts your relationship managers type are material non-public information. The model responses are bank-attributable advice. Sending either over the open internet to a multi-tenant inference endpoint is a SOC2 finding waiting to happen.
This post walks through the reference topology I deploy for Tier-1 banks that want the productivity gains of GPT without the audit failures.
The threat model auditors actually care about
Three questions every auditor asks:
- Where does the prompt go? If the answer involves "the public internet" or "a vendor's shared inference cluster", you're done.
- Where do the embeddings live? Vector databases are PII goldmines.
- Can you produce the access log for any inference? "We have CloudWatch on" is not an answer.
The topology
┌────────────────────────────────────────────────────┐
│ Private Subnet (no IGW, no NAT)                    │
│                                                    │
│ App ──► VPC Interface Endpoint ──► Bedrock         │
│         (com.amazonaws.<region>.bedrock-runtime)   │
│                                                    │
│ App ──► VPC Gateway Endpoint   ──► S3 (KMS)        │
│ App ──► VPC Interface Endpoint ──► KMS             │
└────────────────────────────────────────────────────┘
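Traffic to Bedrock travels over PrivateLink and never leaves AWS's private backbone. With no internet gateway and no NAT gateway, a compromised container has no routed path to an outside host. DNS is a separate channel: the VPC's Route 53 Resolver still answers queries for external names, so closing off DNS exfiltration takes Route 53 Resolver DNS Firewall rules (or an equivalent allowlist) in addition to removing the NAT.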
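One way to back the "no IGW, no NAT" claim with evidence is to scan the subnet's route tables for any route that targets an internet-facing gateway. The sketch below assumes its input has the shape of the `RouteTables` list returned by boto3's `ec2.describe_route_tables()`; the function name `internet_routes` and the sample IDs are hypothetical, and the check is not exhaustive (it ignores NAT instances, peering, and transit gateways).

```python
from typing import Any


def internet_routes(route_tables: list[dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (route_table_id, destination, target) for each route that
    targets an internet gateway, egress-only internet gateway, or NAT gateway."""
    findings = []
    for table in route_tables:
        for route in table.get("Routes", []):
            destination = (
                route.get("DestinationCidrBlock")
                or route.get("DestinationIpv6CidrBlock")
                or route.get("DestinationPrefixListId", "")
            )
            gateway_id = route.get("GatewayId", "")
            # Gateway endpoints (vpce-...) and the "local" route are expected.
            if gateway_id.startswith("igw-"):
                target = gateway_id
            elif route.get("EgressOnlyInternetGatewayId"):
                target = route["EgressOnlyInternetGatewayId"]
            elif route.get("NatGatewayId"):
                target = route["NatGatewayId"]
            else:
                continue
            findings.append((table["RouteTableId"], destination, target))
    return findings


# Hypothetical sample data: one clean table, one with a NAT default route.
sample_tables = [
    {
        "RouteTableId": "rtb-private",
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationPrefixListId": "pl-s3example", "GatewayId": "vpce-s3example"},
        ],
    },
    {
        "RouteTableId": "rtb-leaky",
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-example"},
        ],
    },
]

for finding in internet_routes(sample_tables):
    print("internet-facing route:", finding)
```

Run on a schedule against the live route tables, an empty result is a simple artifact to hand an auditor. Any finding means the subnet is no longer private in the sense the diagram claims.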
Guardrails are not optional
Bedrock Guardrails do three things every bank needs:
- PII redaction before the prompt reaches the model.
- Topic denylist to refuse "give legal/medical advice" prompts.
- Output filtering that masks or blocks anything shaped like an account number, hallucinated or real.
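Pair them with two logs. CloudTrail records each Bedrock InvokeModel call and the IAM principal behind it, but not the prompt text. Bedrock model invocation logging, delivered to a KMS-encrypted S3 bucket or CloudWatch Logs, captures the prompts and responses. Together they give you the audit log your CISO will sign off on.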
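The three guardrail functions listed above can be sketched as a single request body. The keys below follow the shape of boto3's `bedrock` client `create_guardrail` call as I understand it. The topic definition, the choice of PII types, the account-number pattern, and the messages are all illustrative assumptions, not a recommended policy.

```python
import re
from typing import Any


def build_guardrail_request(name: str) -> dict[str, Any]:
    """Build keyword arguments for a guardrail covering the three controls:
    PII handling, a denied topic, and a regex filter for account numbers."""
    return {
        "name": name,
        "description": "Illustrative guardrail for an internal banking assistant",
        "topicPolicyConfig": {
            "topicsConfig": [
                {
                    "name": "legal-or-medical-advice",
                    "definition": "Requests for legal or medical advice, "
                    "diagnosis, or recommendations.",
                    "examples": ["Can I sue my landlord over this?"],
                    "type": "DENY",
                }
            ]
        },
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                {"type": "NAME", "action": "ANONYMIZE"},
                {"type": "EMAIL", "action": "ANONYMIZE"},
                {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
            ],
            "regexesConfig": [
                {
                    "name": "account-number",
                    # Assumption: a hypothetical 10 to 12 digit account format.
                    "description": "Hypothetical internal account number format",
                    "pattern": r"\b\d{10,12}\b",
                    "action": "ANONYMIZE",
                }
            ],
        },
        "blockedInputMessaging": "This request cannot be processed.",
        "blockedOutputsMessaging": "The response was withheld by policy.",
    }


request = build_guardrail_request("bank-assistant")

# Sanity-check locally that the custom pattern compiles before deploying it.
re.compile(request["sensitiveInformationPolicyConfig"]["regexesConfig"][0]["pattern"])

# With credentials and the bedrock VPC endpoint in place, this would be sent as:
#   boto3.client("bedrock").create_guardrail(**request)
print(sorted(request))
```

Keeping the request in a pure function means the policy can be code-reviewed and unit-tested like any other change, which is itself useful audit evidence.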
What this replaces
- Public ChatGPT Enterprise: fails residency, fails the "prove the prompt log" test.
- Self-hosted on EC2 with public IP: solves residency, fails on patching cadence.
- Fargate + VPC endpoints to Bedrock: solves both, and the auditor can see the IAM role that invoked each call.
That last property — who invoked which model with which prompt, attributed to a named human via Cognito — is the entire point of Sovereign AI.
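As a rough sketch of that attribution, the function below pulls the caller, model, and network path out of a single CloudTrail record. The field names (`eventSource`, `eventName`, `userIdentity.arn`, `requestParameters.modelId`, `vpcEndpointId`) are standard CloudTrail record fields as far as I know. The assumption that the role session name carries the Cognito username is mine and depends on how your application assumes its role. CloudTrail does not hold the prompt text; that would be joined in from Bedrock model invocation logs.

```python
from typing import Any, Optional

# Runtime operations treated as inference calls for this sketch.
INFERENCE_EVENTS = {
    "InvokeModel",
    "InvokeModelWithResponseStream",
    "Converse",
    "ConverseStream",
}


def attribute_invocation(record: dict[str, Any]) -> Optional[dict[str, str]]:
    """Summarize who invoked which model, or return None for unrelated events."""
    if record.get("eventSource") != "bedrock.amazonaws.com":
        return None
    if record.get("eventName") not in INFERENCE_EVENTS:
        return None

    arn = record.get("userIdentity", {}).get("arn", "")
    # Assumed-role ARNs look like:
    #   arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
    resource = arn.split(":", 5)[-1] if arn else ""
    parts = resource.split("/")
    role, session = "", ""
    if len(parts) == 3 and parts[0] == "assumed-role":
        role, session = parts[1], parts[2]

    return {
        "time": record.get("eventTime", ""),
        "model": (record.get("requestParameters") or {}).get("modelId", ""),
        "role": role,
        # Assumption: the app sets the session name to the signed-in user.
        "session": session,
        "vpc_endpoint": record.get("vpcEndpointId", ""),
    }


# Hypothetical record, trimmed to the fields used above.
sample_record = {
    "eventTime": "2024-05-01T09:30:00Z",
    "eventSource": "bedrock.amazonaws.com",
    "eventName": "InvokeModel",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::123456789012:assumed-role/bank-assistant-task/jane.doe",
    },
    "requestParameters": {"modelId": "anthropic.claude-3-sonnet-20240229-v1:0"},
    "vpcEndpointId": "vpce-example",
}

print(attribute_invocation(sample_record))
```

A populated `vpc_endpoint` field on every record is also a quick check that no call bypassed the private path.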