The AWS Security Posture for AI Workloads: What Bedrock Does Not Do for You

The assumption that lets things slip

A team picks Bedrock for their next AI feature because it is "the secure option." The data does not leave AWS. The model is managed by Amazon. The compliance team has heard of it. The first phase of the work is the prompt and the pipeline; the security review is scheduled for later.

Three months in, the security review starts asking questions the team did not prepare for. Who can invoke this model in production? How is the inference data encrypted at rest? What is the audit trail for every prompt that touched a customer record? If a developer rotated out of the project last week, can they still call the API? When the model returned a response that included another tenant's data — and yes, this happened — what stopped it from being logged in the wrong CloudWatch group?

Bedrock handles the model. It does not handle the security perimeter around the model. That perimeter is still your responsibility, and most of the gaps that bite production AI workloads are in the same five places every time. Here is the baseline we deploy.

The five controls that decide whether a Bedrock workload passes audit

Figure: Bedrock surrounded by five nested control layers (IAM, VPC endpoint, KMS, audit logging, Guardrails), mapped to NDPA, GDPR, HIPAA, PCI DSS, and CBN CSAT.

1. IAM: invocation as an explicit, scoped permission

The invocation policy most teams start with is too broad for any real workload. bedrock:InvokeModel on Resource: "*" lets a principal call every model the account can access, in any Region, unless a condition says otherwise. In a compliance setting, that is the same as saying "anyone on the team can invoke anything." It does not pass a CBN CSAT review and it should not pass yours.

The pattern that holds up:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
    "Resource": [
      "arn:aws:bedrock:eu-west-1::foundation-model/anthropic.claude-sonnet-4-6-v1:0",
      "arn:aws:bedrock:eu-west-1::foundation-model/cohere.embed-english-v3"
    ],
    "Condition": {
      "StringEquals": { "aws:RequestedRegion": "eu-west-1" },
      "StringLike": { "aws:PrincipalTag/workload": "production-ai-*" }
    }
  }]
}

The model ARNs are explicit. The region is locked. Only principals tagged with a workload value matching the production pattern can invoke. Development principals run against a separate policy that targets development models in a separate account.
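
A scoped policy only helps if broad ones do not creep back in. The sketch below is one way to check a policy document for invocation grants on wildcard resources before it is attached. The function name and the rule that any "*" in a resource counts as too broad are our assumptions for illustration, not an AWS tool, and the check ignores Deny statements and NotAction/NotResource.

```python
from fnmatch import fnmatchcase

# The two invocation actions used in the policy above.
INVOKE_ACTIONS = ("bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream")


def _as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def _grants_invoke(action_patterns):
    """True if any action pattern (including wildcards) covers an invoke action."""
    return any(
        fnmatchcase(invoke.lower(), pattern.lower())
        for pattern in action_patterns
        for invoke in INVOKE_ACTIONS
    )


def overbroad_invoke_statements(policy: dict) -> list[int]:
    """Return indexes of Allow statements granting invocation on a wildcard resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    flagged = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        # Assumption: any "*" in the resource string is treated as too broad.
        if _grants_invoke(actions) and any("*" in r for r in resources):
            flagged.append(index)
    return flagged
```

Run against the policy above, the function returns an empty list; run against a statement with Resource: "*", it returns that statement's index, which is enough to fail a CI check.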

This is also where you enforce identity-based segregation: the Lambda function that handles user-facing inference has one role. The batch processor that runs nightly summarization has another. The two roles cannot invoke each other's models, cannot read each other's S3 buckets, and cannot write to each other's logs. If one role is compromised, the blast radius stops there.

2. VPC endpoints: keep inference traffic off the public internet

A Lambda function calling Bedrock through the public AWS endpoint addresses a public IP, so from inside a VPC the request has to leave through a NAT gateway or internet gateway. AWS documents that traffic between its own services stays on the AWS network, and the traffic is TLS-encrypted, but the route is not what your compliance team thinks it is, and the internet egress path shows up in any data-flow diagram that an auditor draws.

A VPC interface endpoint for the Bedrock runtime fixes this. Create the endpoint in the VPC where your AI workloads run, attach an endpoint policy that mirrors the IAM scoping above, and enable private DNS on the endpoint so that VPC-attached Lambda functions or ECS tasks resolve the Bedrock hostname to it without code changes. The inference traffic now stays inside the AWS backbone, never traverses a NAT gateway, and shows up cleanly in VPC flow logs.
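
As a sketch of what that looks like in practice, the code below builds an endpoint policy and the request parameters for an interface endpoint. The parameter names follow the EC2 CreateVpcEndpoint API as exposed by boto3's create_vpc_endpoint; the helper names, the account-level condition, and every ID in the usage note are placeholders we chose for illustration. Nothing here calls AWS.

```python
import json

INVOKE_ACTIONS = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]


def endpoint_policy(model_arns, account_id):
    """Endpoint policy that mirrors the IAM scoping: named models, own account only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": INVOKE_ACTIONS,
            "Resource": list(model_arns),
            # Assumption: only principals from this account may use the endpoint.
            "Condition": {"StringEquals": {"aws:PrincipalAccount": account_id}},
        }],
    }


def endpoint_request(region, vpc_id, subnet_ids, security_group_ids, policy):
    """Keyword arguments for an EC2 create_vpc_endpoint call (not executed here)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.bedrock-runtime",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        "PrivateDnsEnabled": True,  # workloads keep using the standard hostname
        "PolicyDocument": json.dumps(policy),
    }
```

With boto3 installed and credentials configured, the result would be passed as ec2.create_vpc_endpoint(**request). Keeping the builder separate from the call makes the policy easy to review and diff against the IAM policy it is meant to mirror.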

For Nigerian and EU clients with strict data-residency requirements (NDPA, GDPR), the endpoint policy is also where you enforce region pinning at the network layer, not just at the IAM layer. Two locks on the same door.

3. KMS: customer-managed keys for the data Bedrock touches

Bedrock does not store your prompts or your responses by default. It is, however, usually surrounded by a pipeline that does: S3 buckets holding the documents you embed, DynamoDB tables holding pipeline state, OpenSearch indexes holding the vector store, CloudWatch logs holding the trace data. Nearly every one of those resources is encrypted at rest by AWS by default. The question is who controls the key.

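The default is usually an AWS-owned key (or, for some services, an AWS-managed key). The data is encrypted, but AWS controls the key: you cannot edit its policy, and use of an AWS-owned key does not appear in your CloudTrail. The audit team will accept this only for data classified as non-sensitive. The pattern for any workload touching customer data, PII, or financial information: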

A dedicated KMS Customer Managed Key (CMK) per environment (production, staging, development each get their own)
Key policies that grant decrypt access only to the specific IAM roles that need it
Automatic key rotation enabled
CloudTrail logging on every key usage event

For multi-tenant pipelines, take this further: one CMK per tenant, with the key policy granting decrypt access only to the role invoked on that tenant's behalf. If your pipeline ever tries to read one tenant's data while serving another tenant's request, the decrypt call fails at the KMS boundary before any data is rendered. This is the kind of defense that does not feel necessary until the day it stops a quiet incident from becoming a public one.
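
A per-tenant key policy can be generated rather than hand-written. The sketch below assumes a role naming scheme (tenant-<id>-inference), a separate key administration role, and an encryption context key called tenant_id; all three are hypothetical choices for illustration. The kms: actions and the kms:EncryptionContext condition key are standard KMS policy elements.

```python
def tenant_key_policy(account_id: str, tenant_id: str, admin_role: str) -> dict:
    """Key policy for one tenant's CMK: admins manage it, one role uses it."""
    tenant_role = f"arn:aws:iam::{account_id}:role/tenant-{tenant_id}-inference"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{admin_role}"},
                # Administration only: no Decrypt for the admin role.
                "Action": ["kms:DescribeKey", "kms:GetKeyPolicy",
                           "kms:PutKeyPolicy", "kms:EnableKeyRotation"],
                "Resource": "*",
            },
            {
                "Sid": "TenantDataAccess",
                "Effect": "Allow",
                "Principal": {"AWS": tenant_role},
                "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": "*",
                # Assumption: callers pass tenant_id as encryption context.
                "Condition": {"StringEquals": {
                    "kms:EncryptionContext:tenant_id": tenant_id}},
            },
        ],
    }


def principals_allowed(policy: dict, action: str) -> set:
    """Principals that an Allow statement in this policy grants the action to."""
    return {
        statement["Principal"]["AWS"]
        for statement in policy["Statement"]
        if statement["Effect"] == "Allow" and action in statement["Action"]
    }
```

Calling principals_allowed(policy, "kms:Decrypt") on each generated policy gives the access matrix directly, which is the artifact a key policy review needs.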

4. Logging: every invocation auditable, no prompts in the wrong place

Three categories of logging matter, and each has a place where it should and should not live.

CloudTrail captures every Bedrock API call: who invoked, when, from which IP, with which model ID, returning what status. It is the audit trail for the fact of an invocation. It does not contain the prompt or the response, which is the right design — those belong elsewhere. CloudTrail logs go to a dedicated S3 bucket in a separate logging account, with object lock enabled and a retention policy that satisfies your regulatory floor.
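
To show what "the fact of an invocation" looks like once the logs are in hand, here is a small sketch that reduces CloudTrail records to the who, when, where, and which-model fields an auditor asks for. The top-level field names are standard CloudTrail record fields; reading the model ID from requestParameters.modelId is our assumption about the record shape and should be checked against your own trail.

```python
INVOKE_EVENTS = {"InvokeModel", "InvokeModelWithResponseStream"}


def summarize_invocations(records):
    """Reduce CloudTrail records to one audit row per Bedrock invocation."""
    rows = []
    for record in records:
        if record.get("eventSource") != "bedrock.amazonaws.com":
            continue
        if record.get("eventName") not in INVOKE_EVENTS:
            continue
        # Assumption: the model ID is recorded under requestParameters.modelId.
        params = record.get("requestParameters") or {}
        rows.append({
            "who": record.get("userIdentity", {}).get("arn"),
            "when": record.get("eventTime"),
            "source_ip": record.get("sourceIPAddress"),
            "model_id": params.get("modelId"),
            # CloudTrail sets errorCode only when the call failed.
            "status": record.get("errorCode", "Success"),
        })
    return rows
```

Note that nothing in the output is a prompt or a response, which matches the separation described above.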

Application logs capture the prompt and the response (or whatever subset you decide to store). This is where teams routinely make mistakes. Logging the raw prompt is fine if the prompt does not contain customer PII. The moment it does, the application logs become a duplicate copy of regulated data, in a system that was not designed to hold regulated data. The fix is to log a redacted prompt: replace email addresses, account numbers, and other identifiers with keyed hashes before the log line is written (a plain hash of a short identifier such as an account number can be reversed by enumeration), and store the original prompt in a separate, KMS-encrypted, access-controlled store keyed by a hash that is also written to the log line. The trace is reconstructible. The data residency posture is not violated.
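
A minimal sketch of the redaction step, using only the standard library. The two patterns (a simple email regex and a 10-digit account number) and the token format are assumptions for illustration; a real workload needs patterns for its own identifiers. An HMAC is used instead of a bare hash so that tokens cannot be reversed by enumerating account numbers without the key.

```python
import hashlib
import hmac
import re

# Assumed identifier shapes: a simple email pattern and 10-digit account numbers.
IDENTIFIERS = re.compile(
    r"(?P<email>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
    r"|(?P<account>\b\d{10}\b)"
)


def token_for(value: str, key: bytes) -> str:
    """Stable keyed token: the same value and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def redact(prompt: str, key: bytes) -> str:
    """Replace each identifier with a typed token before the log line is written."""
    def _substitute(match):
        kind = match.lastgroup  # "email" or "account"
        return f"<{kind}:{token_for(match.group(0), key)}>"
    return IDENTIFIERS.sub(_substitute, prompt)


def prompt_reference(prompt: str, key: bytes) -> str:
    """Lookup key for the original prompt in the separate encrypted store."""
    return token_for(prompt, key)
```

The log line would carry redact(prompt, key) and prompt_reference(prompt, key); the original prompt goes to the restricted store under the same reference. The HMAC key itself belongs in a secrets manager, not in the application logs' account.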

Bedrock model invocation logging, configured at the account level, captures the full prompt and response and writes them to S3 or CloudWatch Logs. This is useful for audit and red-team replay, but the same caveats apply: if you turn it on, the destination must be encrypted with a CMK, restricted to a small set of investigators, and never exposed to the application accounts that called the model.

5. Guardrails and prompt-injection defense

Bedrock Guardrails is the layer that screens prompts and responses against a configurable policy: PII detection, denied topic lists, harmful content filtering, and contextual grounding checks. Most teams skip it on the assumption that "we control the prompts." That assumption breaks the first time a user-facing pipeline accepts a query, document, or transcript that contains injected instructions.

The minimum we deploy on every user-facing AI workload:

A Guardrail with PII detection enabled, blocking the response if the model output contains an unredacted email, phone number, or account number that did not appear in the input
A denied topics list specific to the workload — for a customer-support agent, that includes any topic outside the agent's mandate
Contextual grounding checks that fail the response if it makes claims not supported by the retrieved context (the retrieval pipeline pairs naturally with this)
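
The "did not appear in the input" condition in the first item is a comparison between input and output, and one place to implement it is in the application, alongside the Guardrail. The sketch below is that application-side check only; it does not call Bedrock Guardrails, and the patterns are deliberately simple assumptions (emails, and digit runs long enough to be phone or account numbers).

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: 8+ digits, optionally grouped with spaces, dots, dashes, or parentheses.
NUMBER = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def identifiers(text: str) -> set:
    """Normalized identifiers found in text: lowercased emails, digits-only numbers."""
    emails = {match.lower() for match in EMAIL.findall(text)}
    numbers = {re.sub(r"\D", "", match) for match in NUMBER.findall(text)}
    return emails | numbers


def new_identifiers(model_input: str, model_output: str) -> set:
    """Identifiers present in the output that never appeared in the input."""
    return identifiers(model_output) - identifiers(model_input)


def should_block(model_input: str, model_output: str) -> bool:
    """Block the response if the model introduced an identifier of its own."""
    return bool(new_identifiers(model_input, model_output))
```

Here model_input should include the retrieved context as well as the user's query; otherwise every identifier quoted from a retrieved document looks new.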

For workloads that accept user-uploaded documents (a RAG pipeline over customer-submitted PDFs, for instance), the guardrail is the difference between an AI feature that occasionally answers a strange question and an AI feature that quietly executes a prompt injection embedded in a malicious PDF.

Multi-tenant isolation: the architecture decision that prevents the worst incidents

If your AI workload serves more than one customer, the data isolation pattern decides whether a bug becomes an incident or a breach.

The pattern we deploy:

One IAM role per tenant, assumed via STS at request time based on the authenticated principal. The role's policies scope every resource access to that tenant's data.
Tenant-keyed encryption, where each tenant's data is encrypted with a tenant-specific KMS key. A pipeline running under tenant B's role that somehow pulls a document belonging to tenant A cannot decrypt it: the call fails at the KMS layer rather than at a more visible (and more painful) layer downstream.
Tenant ID in every log line, every CloudWatch metric dimension, and every DynamoDB partition key. When you investigate a cross-tenant event, you do not have to reconstruct which tenant was involved — the data is already structured around it.
Tenant-specific Bedrock invocation logging (when enabled), so the model audit trail can be filtered to a single tenant for a compliance request or a customer-initiated subject access request.
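
The first and third items can be sketched together: build the STS request for the tenant's role, and make the tenant ID a required field of every log line. The parameter names match the STS AssumeRole API as exposed by boto3's assume_role; the role naming scheme, the tenant ID format, and the log shape are hypothetical. Passing Tags also requires sts:TagSession in the role's trust policy.

```python
import json
import re

# Assumed tenant ID format; rejecting anything else keeps IDs out of ARN trickery.
TENANT_ID = re.compile(r"^[a-z0-9-]{1,32}$")


def assume_role_request(account_id: str, tenant_id: str, request_id: str) -> dict:
    """Keyword arguments for an STS assume_role call (not executed here)."""
    if not TENANT_ID.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return {
        "RoleArn": f"arn:aws:iam::{account_id}:role/tenant-{tenant_id}-inference",
        # Session names are limited to 64 characters.
        "RoleSessionName": f"{tenant_id}-{request_id}"[:64],
        "DurationSeconds": 900,  # shortest session STS allows
        "Tags": [{"Key": "tenant_id", "Value": tenant_id}],
    }


def log_line(tenant_id: str, event: str, **fields) -> str:
    """Structured log line; tenant_id is positional so it cannot be omitted."""
    if not TENANT_ID.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return json.dumps({"tenant_id": tenant_id, "event": event, **fields},
                      sort_keys=True)
```

Because the session name starts with the tenant ID, it also appears in the CloudTrail userIdentity for every call made with those credentials, which ties the API audit trail to the tenant without extra work.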

Most pipelines we audit have some of these. Few have all of them. The first time a customer asks "prove that no other customer's data was processed by the model in response to my queries," the gap is visible.

Each compliance regime has its own requirements; the controls above satisfy the majority of them at once.

NDPA (Nigeria) asks for data residency, access controls, encryption at rest and in transit, breach detection, and the ability to honor subject access and erasure requests. Region pinning + IAM scoping + CMK encryption + structured logging satisfies the first four; the tenant-keyed architecture makes the fifth feasible.
GDPR (EU) asks for the same plus a lawful basis assessment and a data processing record. The architecture does not write the DPIA for you, but the audit trail is already in the form a DPIA references.
CBN CSAT (Nigerian financial institutions) explicitly calls out cloud workload controls, third-party risk, and incident detection capability. The Bedrock posture documented above maps cleanly onto the CSAT control set; what most banks lack is the documentation linking the AWS controls to the CSAT control IDs.
PCI DSS does not yet have AI-specific guidance, but the data-handling controls for cardholder data apply to AI workloads that touch CHD the same way they apply to any other workload. The principle: cardholder data does not enter the prompt at all, and the application redacts it before the inference call.
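
For the PCI principle, a sketch of the redaction step that runs before the inference call. It finds 13 to 19 digit sequences, optionally grouped with spaces or hyphens, and redacts those that pass the Luhn checksum. The candidate pattern and the placeholder text are our assumptions; this reduces the chance of card numbers reaching the prompt but is not a substitute for a PCI-scoped data discovery tool.

```python
import re

# 13 to 19 digits, each optionally followed by a single space or hyphen.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_card_numbers(text: str) -> str:
    """Replace Luhn-valid card number candidates; leave other digit runs alone."""
    def _substitute(match):
        digits = re.sub(r"\D", "", match.group(0))
        return "[REDACTED-PAN]" if luhn_valid(digits) else match.group(0)
    return CANDIDATE.sub(_substitute, text)
```

The Luhn check keeps order numbers and other long digit runs intact, so redaction does not degrade prompts that legitimately contain them.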

We map each control above to the relevant control ID in the client's primary compliance regime when we deploy. The work of an audit is not having the controls — it is being able to point to them in a form the auditor recognizes.

What this costs

The IAM and VPC endpoint work is roughly half a day on a new AWS account. The KMS architecture is another day, mostly spent on the key policy reviews and the access matrix. Guardrails configuration is workload-specific — generally a half-day per pipeline. Multi-tenant isolation is a longer engagement; depending on how the data layer is currently structured, it is anywhere from a week (greenfield) to a multi-sprint refactor (retrofit).

The cost is small compared to the cost of doing it after a compliance audit fails or after the first incident. The bill we have actually seen from skipping this work — for a single client whose audit went sideways — was a six-week delay on a production launch plus the consulting fees to fix the architecture under deadline. None of that is in the AWS console; all of it is in the conversation with the security team.

The point of running AI on AWS is not just that the model is managed. It is that the surrounding controls are available, well-understood, and well-tested. Use them. Bedrock will not.