The Anatomy of a Private GPT: Architecting for SOC2 in Banking

Why public chatbots fail audits. A deep dive into the AWS Bedrock + VPC Endpoint + Private Subnet topology that passes banking compliance.

April 10, 2026·2 min read·

#PrivateGPT#Architecture#SOC2#Bedrock

Public chatbot APIs are a non-starter in regulated banking. The prompts your relationship managers type are material non-public information. The model responses are bank-attributable advice. Sending either over the open internet to a multi-tenant inference endpoint is a SOC2 finding waiting to happen.

This post walks through the reference topology I deploy for Tier-1 banks who want the productivity gains of GPT without the audit failures.

The threat model auditors actually care about

Three questions every auditor asks:

Where does the prompt go? If the answer involves "the public internet" or "a vendor's shared inference cluster", you're done.
Where do the embeddings live? Vector databases are PII goldmines.
Can you produce the access log for any inference? "We have CloudWatch on" is not an answer.

The topology

┌──────────────────────────────────────────────────┐
│ Private Subnet (no IGW, no NAT)                  │
│                                                  │
│   App ──► VPC Interface Endpoint ──► Bedrock     │
│            (com.amazonaws.<region>.bedrock-runtime)
│                                                  │
│   App ──► VPC Gateway Endpoint    ──► S3 (KMS)   │
│   App ──► VPC Interface Endpoint  ──► KMS        │
└──────────────────────────────────────────────────┘

Traffic to Bedrock never leaves AWS's private backbone. The subnet has no NAT gateway, so even a compromised container cannot exfiltrate via DNS.

Guardrails are not optional

Private networking is only the first control. The production design also needs model access policies, prompt logging, KMS ownership, tenant boundaries, and evidence that every inference can be traced to an authenticated caller.

For Bedrock deployments, start with the Amazon Bedrock security documentation, then map each control to the audit question it answers. If a control cannot be explained to an auditor in one sentence, it probably is not ready for production.

Bedrock Guardrails do three things every bank needs:

PII redaction before the prompt reaches the model.
Topic denylist to refuse "give legal/medical advice" prompts.
Output filtering for hallucinated account numbers.

Pair them with CloudTrail data events on Bedrock InvokeModel and you have the audit log your CISO will sign off on.

What this replaces

Public ChatGPT Enterprise: fails residency, fails the "prove the prompt log" test.
Self-hosted on EC2 with public IP: solves residency, fails on patching cadence.
Fargate + VPC endpoints to Bedrock: solves both, and the auditor can see the IAM role that invoked each call.

That last property — who invoked which model with which prompt, attributed to a named human via Cognito — is the entire point of Sovereign AI.

Closing thought

A Private GPT is not a model choice. It is a control surface — identity, network, logging, key management, and prompt governance all wired together so that an auditor can trace one inference call back to one human in one role. Build that surface once, and every future model swap becomes a configuration change instead of a compliance re-review.

Ask AI About the Author

Open this query in ChatGPT, Claude, or Perplexity.

ChatGPT

Best for structured summaries.

Claude

Useful for concise synthesis.

Perplexity

Good for web-backed lookup.

Comments

Comments are open to confirmed email subscribers. Use the email you subscribed with. To edit a comment, delete it and post a new one.

Get new field notes by email

Field notes from someone who ships before they write about it. Sovereign AI, AI-SDLC, DevOps, and what 59 production deployments teach you. No spam. Unsubscribe anytime.

Related field notes

Sovereign AI·5 min read

The Anatomy of a Private GPT: Architecting for SOC2 in Banking

The threat model auditors actually care about

The topology

Guardrails are not optional

What this replaces

Closing thought

Further reading

Ask AI About the Author

Comments

Get new field notes by email

Related field notes

Sovereign AI Data Center: Definition, Architecture, and Compliance Blueprint

The Hidden Costs of AI: Preventing Token Shock in AWS Bedrock

From Prompt to Production: The Golden Path for Secure GenAI Apps

Sovereign AI on Metal: Air-Gapped LLM Stack with Ubuntu & vLLM

Why AI Product Development Still Needs Architecture Thinking