From Bedrock Posture to Agent Posture: Securing the Layer Between the Model and Your Systems

cmdev · 9 min read

The moment the posture review misses half the system

A security team finishes a Bedrock review. The IAM policies are scoped to specific model ARNs. VPC endpoints route inference traffic off the public internet. KMS customer-managed keys encrypt every bucket the pipeline touches. CloudTrail captures every invocation. Guardrails are configured with PII detection and a denied topics list. The review signs off. The workload is "secure for production."

Two weeks later, a developer demos the new agentic feature. The same model the team just signed off on now connects to an MCP server that reads from the CRM, another that sends email through Gmail, and a third that writes records back into a customer database. The user asks the agent to "follow up with overdue accounts." The agent reads the CRM, drafts personalized emails, and sends them. It works. It is also outside the boundary of the security review that just happened.

This is the gap. The earlier piece in this series, the model-layer treatment, covers what Bedrock does and does not do for you — the IAM, VPC, KMS, logging, and Guardrails baseline that every production AI workload needs. That stack is the foundation. It is also not sufficient the moment your model stops returning text and starts triggering actions on real systems. The boundary you defend moves from "what the model can see" to "what the agent can do." The attack surface changes. The audit trail changes. The compliance questions change.

What changes when the model gains hands

[Figure: side-by-side comparison. Model posture (IAM, VPC, KMS, logging, Guardrails) extends to agent posture (tool capabilities, tool-held credentials, cross-tool isolation, cross-tool audit, action approval gates).]

A chatbot's failure mode is a bad answer. An agent's failure mode is a bad action. The OpenClaw architecture piece walks through how MCP, RAG, and orchestration combine to let an agent take real-world actions across email, CRM, document stores, and databases. Each of those capabilities is also a new attack surface, and the model-layer controls do not see most of them.

Five new attack surfaces are specific to agentic systems. They are not theoretical. We have seen each of them surface in production reviews of pipelines we did not build.

Tool poisoning. An MCP server returns adversarial data the agent then acts on. The server might be one your team operates, compromised through a supply-chain vulnerability in its dependencies. It might be a third-party server you integrated with on trust. It might be a perfectly honest server returning content from a corpus that contains injected instructions. The Bedrock layer cannot distinguish between "the email body the agent retrieved" and "instructions the agent should follow" — both are just text in the context window. The defense lives in the tool layer, not the model layer.

Permission creep across tools. An agent with email access alone is bounded. An agent with email plus CRM access is more useful — and a different threat model. An injection that would be harmless against either tool in isolation becomes exploitable when the agent can chain them. Read a customer record, draft an email containing that record, send the email to an attacker-controlled address. Each step is a legitimate use of a tool the agent is allowed to call. The exploit lives in the composition.
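Because the exploit lives in the composition, a per-call check is not enough; the check needs session state. The following is a minimal Python sketch of that idea, not a reference implementation. The tool names (`crm.read_customer`, `gmail.send_mail`), the `ChainPolicy` class, and the `example.com` domain are all hypothetical placeholders.

```python
SENSITIVE_READS = {"crm.read_customer"}  # hypothetical tool names
INTERNAL_DOMAINS = {"example.com"}       # placeholder for the tenant's own domains


class ChainPolicy:
    """Tracks what one agent session has done and judges the next call in that light."""

    def __init__(self):
        self.sensitive_read_seen = False

    def evaluate(self, tool, params):
        """Return 'allow' or 'escalate' for the next tool call in this session."""
        if tool in SENSITIVE_READS:
            # Remember that customer data is now in the agent's context.
            self.sensitive_read_seen = True
            return "allow"
        if tool == "gmail.send_mail":
            domain = params["to"].rsplit("@", 1)[-1].lower()
            if self.sensitive_read_seen and domain not in INTERNAL_DOMAINS:
                # Each step is legitimate alone; the combination is not.
                return "escalate"
        return "allow"
```

The same external send is allowed before the CRM read and escalated after it, which is the point: the decision depends on the chain, not the individual call.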

Cross-tool data leaks. Sensitive data read from one tool flows into another tool's context, gets written to a document, gets sent in an email, gets logged to a system that was not designed to hold it. The agent does not understand data classification. The MCP servers do not know what the others have already touched. Unless an explicit boundary enforces it, data moves freely across the tools the agent has access to.
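One possible shape for that explicit boundary is to attach a classification label to data when a tool returns it and to check the label before any other tool receives it. The sketch below assumes a hypothetical three-level `Sensitivity` scale and hypothetical sink names; real classification schemes and tool names will differ.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Labeled:
    """A value plus the label assigned by the tool that produced it."""
    value: str
    sensitivity: Sensitivity
    source: str


# Highest sensitivity each sink may receive (hypothetical sinks and ceilings).
SINK_CEILING = {
    "crm.write_activity_log": Sensitivity.RESTRICTED,
    "docs.write": Sensitivity.INTERNAL,
    "gmail.send_mail": Sensitivity.INTERNAL,
}


class BoundaryViolation(Exception):
    """Raised when labeled data would flow into a sink that may not hold it."""


def check_flow(items, sink):
    # Unknown sinks get the least trust: only PUBLIC data may reach them.
    ceiling = SINK_CEILING.get(sink, Sensitivity.PUBLIC)
    for item in items:
        if item.sensitivity > ceiling:
            raise BoundaryViolation(
                f"{item.source} data ({item.sensitivity.name}) may not flow to {sink}"
            )
```

The orchestrator would call `check_flow` with every labeled item in the agent's context before dispatching a write or send. The agent still does not understand classification; the boundary does.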

Audit trail gaps. CloudTrail shows the Bedrock invocation. The model invocation log shows the prompt and response. Neither shows which downstream system the agent then called, with what parameters, returning what result. The audit team asks "who accessed this customer record at 14:32?" and the answer is technically "the Lambda function that runs the agent" — which is correct, useless, and unauditable at the level the regulator actually cares about.

Indirect injection through retrieved content. A user uploads a PDF. The agent reads it. Embedded in the PDF is text instructing the agent to email the user's contact list to an external address. A Bedrock Guardrail is not a dependable defense here: Guardrails evaluate the prompt and the response against the policies you configure, but they do not settle whether text inside a retrieved document is data or instruction, and they have no view of the tool calls that follow. The companion piece on prompt-injection defenses covers this category specifically. The point here is that it is an agent-layer concern, not a model-layer concern. The injection only matters because the agent has tools to act on it.

The architectural shift the agent layer requires

Defending an agent is not defending a model with more vigilance. It is a different architectural posture. Five controls define it.

Per-tool capability declarations. Every MCP server registered with an agent declares what it can do and what data it can touch. Not as documentation — as enforced metadata that the orchestration layer reads at request time. The Gmail server declares: "reads inbox, drafts and sends mail, can attach files from approved sources only." The CRM server declares: "reads customer records scoped by tenant, writes activity logs, cannot delete." The agent's tool manifest is the union of these declarations, and the orchestrator refuses to dispatch any tool call that falls outside its declared envelope. The model can produce any tool invocation it wants; the orchestrator decides which ones execute.
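A minimal Python sketch of that enforcement step follows. The `ToolDeclaration` schema and the `Orchestrator` class are hypothetical illustrations, not part of MCP or any Bedrock API, and the action names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDeclaration:
    """Enforced metadata a server registers with the orchestrator (hypothetical schema)."""
    server: str
    allowed_actions: frozenset


class CapabilityError(Exception):
    """Raised when a tool call falls outside the declared envelope."""


class Orchestrator:
    def __init__(self, declarations):
        # The manifest is the union of every registered server's declaration.
        self.manifest = {d.server: d for d in declarations}

    def dispatch(self, server, action, handler, **params):
        decl = self.manifest.get(server)
        if decl is None or action not in decl.allowed_actions:
            # The model may emit any call; only declared ones execute.
            raise CapabilityError(f"{server}.{action} is not declared")
        return handler(**params)


crm = ToolDeclaration("crm", frozenset({"read_customer", "write_activity_log"}))
gmail = ToolDeclaration("gmail", frozenset({"read_inbox", "draft_mail", "send_mail"}))
orchestrator = Orchestrator([crm, gmail])
```

With this manifest, a `crm.delete_customer` call raises `CapabilityError` before any handler runs, no matter how the model was persuaded to emit it.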

Tenant-scoped credentials inside the MCP server. The agent does not hold credentials. The MCP server does. When the agent calls send_email, it does not pass an API key; it passes a tenant context, and the server uses the credentials it holds for that tenant. This matters for two reasons. First, the agent's prompt and tool calls flow through systems that are auditable but not designed as secret stores, so keeping credentials out of the agent's context window eliminates a leak path. Second, it makes tenant isolation enforceable: the server can refuse to use tenant B's credentials in a request bound to tenant A, regardless of what the agent tried to do.
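As a rough illustration of the server side, the sketch below assumes a hypothetical `EmailServer` class whose `send_email` receives two tenant values: the one the runtime bound to the authenticated session, and the one named in the agent's tool arguments. The in-memory dict stands in for a real secret store, and `_deliver` is a placeholder for the provider call.

```python
class TenantMismatch(Exception):
    """Raised when a call asks for a tenant other than the one the session is bound to."""


class EmailServer:
    """Hypothetical MCP-style server that owns credentials per tenant."""

    def __init__(self, credentials_by_tenant):
        # In practice these would be fetched from a secret store, not held in a dict.
        self._credentials = dict(credentials_by_tenant)

    def send_email(self, session_tenant, requested_tenant, to, body):
        # session_tenant is set by the runtime; requested_tenant comes from the agent.
        if requested_tenant != session_tenant:
            raise TenantMismatch(
                f"session bound to {session_tenant}, call asked for {requested_tenant}"
            )
        api_key = self._credentials[session_tenant]
        return self._deliver(api_key, to, body)

    def _deliver(self, api_key, to, body):
        # Placeholder for the real provider call; the receipt never includes the key.
        return {"to": to, "status": "queued", "bytes": len(body)}
```

The credential is looked up and used entirely inside the server, so nothing the agent sees (arguments in, receipt out) contains it.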

Sandboxed execution. A tool call cannot affect anything outside its declared capability. The file-system MCP server runs in a sandbox that can only see the directories the tenant has approved. The database MCP server runs with a database user that has row-level security applied to it. The shell MCP server — if you have one at all, and you probably should not — runs in a container with no network and no filesystem persistence. The principle: assume the agent will, eventually, attempt something it should not. The blast radius is defined by the sandbox, not by the agent's good behavior.
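For the file-system case, one layer of that confinement can be sketched as a path check inside the server. This is an application-level guard only, assuming Python 3.9+ for `Path.is_relative_to`; it complements OS-level or container isolation and does not replace it. The function and exception names are hypothetical.

```python
from pathlib import Path


class SandboxViolation(Exception):
    """Raised when a requested path resolves outside the approved root."""


def resolve_in_sandbox(approved_root, requested):
    """Return the resolved path if it stays under approved_root, else raise."""
    root = Path(approved_root).resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise SandboxViolation(f"{requested!r} escapes {root}")
    return candidate
```

Both a `../` traversal and an absolute path such as `/etc/passwd` resolve outside the root and are refused, regardless of what the agent asked for.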

Cross-tool audit correlation. Every model invocation gets a trace ID. That trace ID propagates to every tool call the agent makes in response, every downstream API call those tools trigger, and every data record they touch. The audit query becomes: "Show me everything that happened under trace abc-123." The answer is the full chain, in order, with timing, results, and the data each step accessed. CloudTrail alone does not give you this. You build it: a tracing header on the agent runtime, propagation in every MCP server, structured log lines keyed by trace ID across every system the agent reaches.
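The pieces named above (a trace ID on the runtime, propagation, structured log lines keyed by that ID) can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `X-Trace-Id` header name, the in-memory `AUDIT_LOG` list standing in for a real log sink, and the function names.

```python
import contextvars
import json
import uuid

_trace_id = contextvars.ContextVar("trace_id", default=None)
AUDIT_LOG = []  # stand-in for a real, append-only log sink


def start_trace():
    """Call once per model invocation; every later event inherits the ID."""
    trace_id = str(uuid.uuid4())
    _trace_id.set(trace_id)
    return trace_id


def audit(system, event, **fields):
    """Emit one structured log line keyed by the current trace ID."""
    record = {"trace_id": _trace_id.get(), "system": system, "event": event, **fields}
    AUDIT_LOG.append(json.dumps(record, sort_keys=True))


def outbound_headers():
    """Headers a tool attaches to downstream calls so the ID keeps propagating."""
    return {"X-Trace-Id": _trace_id.get()}


def events_for(trace_id):
    """The audit query: everything that happened under one trace, in order."""
    return [r for r in map(json.loads, AUDIT_LOG) if r["trace_id"] == trace_id]
```

In a real deployment each MCP server would read the incoming header, set its own context from it, and write to a shared log store; the query stays the same.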

Action approval gates. High-impact actions — sending external email, modifying customer records, deleting documents, moving money — do not execute when the model emits the tool call. They queue, with the full context attached, and route to an approval surface. The approver can be a human, or a policy engine that evaluates the action against rules the client controls, or both depending on the impact tier. The dedicated piece on approval gates covers the pattern in detail. The short version: approval is not a UX nicety. It is the architectural firewall between "the agent decided to do something" and "the something happened." For any action with a real-world consequence, that firewall is the difference between a feature and a liability.
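A minimal sketch of the queue-then-decide flow, in Python. The `ApprovalGate` class, the tool names in `HIGH_IMPACT`, and the `executor` callback are hypothetical; a production gate would persist the queue and record who approved what.

```python
import itertools
from dataclasses import dataclass

# Hypothetical names for the high-impact tier.
HIGH_IMPACT = {"send_external_email", "modify_customer_record", "delete_document", "move_money"}


@dataclass
class PendingAction:
    action_id: int
    tool: str
    params: dict
    context: str  # the full context shown to the approver
    status: str = "pending"


class ApprovalGate:
    def __init__(self, executor):
        self._executor = executor  # callable(tool, params) that performs the action
        self._ids = itertools.count(1)
        self.queue = {}

    def submit(self, tool, params, context):
        """Low-impact calls run immediately; high-impact calls are queued."""
        if tool not in HIGH_IMPACT:
            return self._executor(tool, params)
        pending = PendingAction(next(self._ids), tool, params, context)
        self.queue[pending.action_id] = pending
        return pending

    def decide(self, action_id, approved):
        """Called by the approver (human or policy engine), never by the model."""
        pending = self.queue.pop(action_id)
        if not approved:
            pending.status = "rejected"
            return pending
        pending.status = "approved"
        return self._executor(pending.tool, pending.params)
```

The model's tool call only ever reaches `submit`. Execution of a high-impact action happens in `decide`, which sits on the approver's side of the boundary.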

The compliance overlap, restated for the agent layer

The compliance regimes we mapped in the model-layer piece do not draw a line between "the model processed this" and "the agent processed this." Both are processing. Both are subject to the same data-handling obligations. The compliance team is going to ask both sets of questions.

  • NDPA (Nigeria) requires lawful basis, purpose limitation, and the ability to honor subject access requests. An agent that reads, modifies, and writes customer data across multiple systems is processing on the same terms the model is — and the audit trail must show not just "the model was invoked" but "what the agent then did with the output." The cross-tool trace ID is what makes that question answerable.
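  • GDPR (EU) gives data subjects, under Article 22, the right not to be subject to decisions based solely on automated processing that produce legal or similarly significant effects. Where such decisions are permitted, it generally requires safeguards such as the right to obtain human intervention. An agent making consequential decisions about individuals without an approval gate is likely to fall inside the scope of that article. The gate is not just a security control; it is a compliance control.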
  • CBN CSAT (Nigerian financial institutions) explicitly addresses third-party processing risk. An MCP server is third-party processing — even when you wrote the server, because it is a separate system handling regulated data on the bank's behalf. The control set that applies to a payments vendor applies to your MCP layer. Most of the gaps we see in CSAT-aligned reviews of agent systems are in this category.
  • HIPAA (US healthcare) treats every system that touches PHI as in-scope. An agent that pulls a patient record into a context window and then writes a summary to a doc has touched PHI in two systems and through one intermediary. Each of those needs the audit log and the access control mapping, and a BAA wherever a third party operates it. The agent is not exempt because it is "just an AI."
  • PCI DSS still does not have AI-specific guidance, but the cardholder data scoping rules are unambiguous: any system that stores, processes, or transmits CHD is in scope. An agent's context window is a system. Keep CHD out of it.
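For the last point, keeping CHD out of the context window usually means filtering tool output before the agent sees it. The sketch below is one hedged way to do that in Python: it redacts digit runs of 13 to 19 digits that pass the Luhn check. The function names and the `[REDACTED PAN]` marker are hypothetical, and a filter like this reduces exposure; it is not a PCI scoping determination.

```python
import re

# 13 to 19 digits, each optionally followed by a single space or hyphen.
_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def _luhn_ok(digits):
    """Standard Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_pans(text):
    """Replace likely card numbers before the text enters the agent's context."""
    def _sub(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED PAN]" if _luhn_ok(digits) else match.group()

    return _CANDIDATE.sub(_sub, text)
```

Digit runs that fail the Luhn check, such as order or reference numbers, pass through unchanged, which keeps false positives down at the cost of missing malformed card data.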

The mapping from controls to regimes is the same exercise we run at the model layer. The difference is that the agent layer adds a set of controls that did not exist in the model-layer audit. Most teams have not done that work yet, because the patterns are newer and the auditor checklists have not caught up. The compliance question is coming anyway.

The work that defines the next year

The model-layer posture is well-understood. Most teams running Bedrock workloads now know what good looks like, even when they have not deployed it yet. The reference architectures exist. The IAM policies are close to standard. The auditors recognize the controls when they see them.

The agent-layer posture is not yet well-understood. The patterns are emerging in the teams that ship agentic systems in production and learn what breaks. The control set above is the one we deploy; it is not the only valid one. What is true everywhere is that the gap between model-layer security and agent-layer security is the gap most teams have right now, and it is the gap that will produce the first generation of public incidents involving production AI agents. The first incident is not going to be the model leaking its training data. It is going to be the agent doing something the team did not expect, across a chain of tools no audit trail reconstructed cleanly.

If you are running an AI workload on Bedrock today, the model-layer review is the starting point, not the finish. The next review covers the tools the agent calls, the credentials those tools hold, the boundaries between them, the trace IDs that connect them, and the gate that sits in front of every action with a consequence. That review is the one that defines whether the agent is a production capability or a quiet liability waiting for the first time a user uploads the wrong PDF.

The model is the easy part now. The agent is the part where the work is.

Tags: ai-security, aws, bedrock, openclaw, mcp, ai-agents, compliance