Case Study: Compliance Automator — Building Audit-Grade AI for Regulated Markets in Public

Mayowa A. · 9 min read

An operator-grade case study from the CreativeMinds Development (cmdev) AI engineering practice. The companion open-source repository is live: github.com/Samueladewole/compliance-automator.

The buying question this answers

The regulator's letter arrives on a Tuesday. The subject line is polite — "Request for cybersecurity governance evidence under NCPS 2021 CNII obligations, Q1 2026." The body asks for a list of specific evidence: privileged-access changes in production over the prior quarter, the approval trail for each, the IAM policy diffs that produced them, the audit log entries that confirm execution, the regulatory-control mapping. Response deadline: fourteen days.

The compliance team starts on Wednesday. By the end of week one, they have an extract from CloudTrail Lake covering the right time window. By the middle of week two, they have a draft mapping of events to NDPA Section 39 and CBN CSAT control IDs. By Friday afternoon, an analyst is hand-formatting the evidence pack into a PDF the regulator will accept. The team works the weekend. The pack ships on day twelve.

It will happen again next quarter. And the quarter after. Every regulated enterprise in Africa and the EU is now in a steady-state cadence of regulator queries — NDPA processing-record requests, CBN CSAT examination cycles, NMDPRA posture reviews, NIS2 Article 21 evidence requests, GDPR data-subject inquiries, sector-regulator one-off probes. The work is repetitive, time-pressured, expert-intensive, and a poor use of senior compliance time.

The compliance-automator is our open-source answer. A regulator's evidence query goes in. A structured, citation-rich, audit-grade evidence pack comes out — in twelve minutes, not twelve days. This piece is the case study: what we are building, the architecture, the repo, the build roadmap, and the engineering decisions a CISO needs to see before trusting the output.

What it actually does

A side-by-side comparison portal. Left pane: the regulator's query, or a draft policy / contract / operational document. Right pane: the AI agent's response — non-compliant clauses highlighted, deep-linked to the exact page and section of the official government regulation, with the evidence pack auto-generated underneath.

Three concrete examples that exercise the pipeline end-to-end:

  • "Show me all privileged-access changes in production for the past 90 days, with the approval trail." The agent queries CloudTrail Lake for IAM policy changes against production-scoped resources, joins them with the approval workflow records, maps each change to NDPA Section 39 / CBN CSAT controls, returns a PDF evidence pack with citations.
  • "Review this draft third-party vendor contract for NIS2 Article 21 supply-chain compliance." The agent retrieves the relevant NIS2 supply-chain clauses, highlights mismatches in the draft contract, deep-links each finding to the regulatory source.
  • "Produce the quarterly CSAT board-level evidence pack for Q1 2026." The agent runs the standing CSAT query set against the bank's evidence sources, formats the output to the regulator's preferred template, signs the artefact with KMS.

Each of these is a query type that real compliance officers have described to us as the thing that eats their week.
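To make the first example concrete, here is a minimal sketch of the join between privileged-access changes and approval records, with each change mapped to control identifiers. It is an illustration, not the repository's code: the record fields (`ticket`, `approval_id`), the `EvidenceItem` type, and the control labels in `CONTROL_MAP` are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from IAM event name to control labels.
# The labels are placeholders, not real clause or control numbers.
CONTROL_MAP = {
    "PutRolePolicy": ["NDPA-S39-PLACEHOLDER", "CSAT-PLACEHOLDER"],
    "AttachRolePolicy": ["NDPA-S39-PLACEHOLDER", "CSAT-PLACEHOLDER"],
}


@dataclass
class EvidenceItem:
    event_id: str
    event_name: str
    actor: str
    approval_id: Optional[str]
    controls: list

    @property
    def approved(self) -> bool:
        return self.approval_id is not None


def build_evidence(changes, approvals):
    """Left-join change events to approval records on the change ticket."""
    by_ticket = {a["ticket"]: a for a in approvals}
    items = []
    for change in changes:
        approval = by_ticket.get(change.get("ticket"))
        items.append(
            EvidenceItem(
                event_id=change["event_id"],
                event_name=change["event_name"],
                actor=change["actor"],
                approval_id=approval["approval_id"] if approval else None,
                controls=CONTROL_MAP.get(change["event_name"], []),
            )
        )
    return items


changes = [
    {"event_id": "e-1", "event_name": "PutRolePolicy", "actor": "alice", "ticket": "CHG-100"},
    {"event_id": "e-2", "event_name": "AttachRolePolicy", "actor": "bob", "ticket": None},
]
approvals = [{"ticket": "CHG-100", "approval_id": "APR-7"}]

items = build_evidence(changes, approvals)
# Changes with no approval trail are the ones a reviewer needs to see first.
unapproved = [item.event_id for item in items if not item.approved]
```

A left join is used deliberately: a change with no matching approval stays in the pack and is flagged, rather than silently dropping out of the evidence.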

The architecture

Compliance Automator architecture, following a query from input to output:

  • Entry: the user submits a regulator-style query through the comparison portal UI or the local CLI.
  • Runtime: the query enters a Strands agent running on AgentCore Runtime, deployed inside the customer's VPC per the air-gapped Bedrock pattern.
  • Models: Claude Haiku acts as a router that decides which evidence sources to query; Claude Sonnet handles the synthesis step.
  • Action tools: query_cloudtrail (CloudTrail Lake), query_security_lake, retrieve_regulation (Bedrock Knowledge Base over the regulatory corpus with Cohere Embed v3), generate_evidence_pack (Powertools Lambda emitting structured JSON), format_pdf (WeasyPrint via Lambda layer).
  • Guardrails: Bedrock Guardrails wrap every invocation with PII filters (NIN, BVN regex), denied-topic policies, and contextual grounding.
  • Human-in-the-loop: event.interrupt() gates evidence-pack generation for sign-off on the final output.
  • Output: structured evidence pack JSON + signed PDF + citation index.
  • Observability: CloudTrail data events, model invocation logs to KMS-encrypted S3 with Object Lock, cost tags per workflow.
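The control flow described above (route, gather evidence, gate on human sign-off) can be sketched in plain Python. This is a simplified stand-in under stated assumptions: the tool names mirror the article, but their bodies return canned data, the keyword-based `route` function replaces the router model, and the `approve` callback replaces the interrupt gate. No Bedrock, Strands, or AgentCore APIs are called.

```python
from typing import Callable

# Stub tools: the names come from the article, the bodies are placeholders.
def query_cloudtrail(query: str) -> list:
    return [{"source": "cloudtrail", "event_name": "PutRolePolicy"}]


def query_security_lake(query: str) -> list:
    return [{"source": "security_lake", "finding": "placeholder finding"}]


def retrieve_regulation(query: str) -> list:
    return [{"source": "regulation", "citation": "placeholder citation"}]


TOOLS = {
    "query_cloudtrail": query_cloudtrail,
    "query_security_lake": query_security_lake,
    "retrieve_regulation": retrieve_regulation,
}


def route(query: str) -> list:
    """Keyword stand-in for the router model: choose evidence sources."""
    q = query.lower()
    chosen = ["retrieve_regulation"]  # always ground in the regulatory corpus
    if "access" in q or "iam" in q:
        chosen.append("query_cloudtrail")
    if "finding" in q or "posture" in q:
        chosen.append("query_security_lake")
    return chosen


def run(query: str, approve: Callable[[dict], bool]) -> dict:
    """Gather evidence, then hold the draft until a human decides."""
    evidence = [row for name in route(query) for row in TOOLS[name](query)]
    draft = {"query": query, "evidence": evidence, "status": "pending_review"}
    # Human-in-the-loop gate: nothing is released without explicit sign-off.
    draft["status"] = "released" if approve(draft) else "rejected"
    return draft


pack = run("Show privileged-access changes in production", approve=lambda d: True)
held = run("Show privileged-access changes in production", approve=lambda d: False)
```

Keeping the router's decision as an explicit list of tool names makes it easy to log which sources were consulted for a given query, which is the kind of trail an auditor tends to ask for.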

Every component is documented in docs/architecture.md in the repo. The architecture composes pieces from the prior cmdev articles:

  • Air-gapped Bedrock deployment — the air-gapped pattern is the substrate. The agent never sends customer data outside the customer's VPC. PrivateLink endpoints to bedrock-runtime, bedrock-agent-runtime, KMS, S3, and Secrets. CMK encryption on every persistent artefact. CloudTrail data events forwarded to a separate Security OU account.
  • Evaluation harness: the eval-driven engineering pattern ships alongside the agent. 300-item golden set across the three regulatory regimes, LLM-as-judge calibration against human SME labels, drift detection in production.
  • Strands + AgentCore — the open-source agent harness from Part 3 of the Bedrock series. Hooks for audit, steering handlers for safety, event.interrupt() gates on the evidence-pack generation step.
  • Multi-model routing — Claude Haiku as the router that picks evidence sources; Claude Sonnet for synthesis; Cohere Embed v3 and Rerank v3 for retrieval. The Bedrock cost-optimization pattern keeps per-query cost predictable.
  • Security + observability — Guardrails wrap every invocation with the PII filters Part 6 documents, plus a custom denied-topic for "production-mutating-action-without-approval." Model invocation logs are the regulatory artefact.
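As an illustration of the NIN and BVN regex filters mentioned in the list above, here is a minimal sketch of pattern-based redaction. It assumes both identifiers are 11-digit numbers, so a single pattern covers both; the function name and placeholder string are hypothetical, and this is not the Guardrails configuration itself.

```python
import re

# Assumption: NIN and BVN are both 11-digit numeric identifiers, so one
# pattern covers both. The lookarounds avoid matching inside longer numbers.
ELEVEN_DIGITS = re.compile(r"(?<!\d)\d{11}(?!\d)")


def redact(text: str, placeholder: str = "[REDACTED-ID]"):
    """Return the redacted text and the number of substitutions made."""
    return ELEVEN_DIGITS.subn(placeholder, text)


clean, hits = redact("BVN 22345678901 approved by user 42")
# clean == "BVN [REDACTED-ID] approved by user 42", hits == 1
```

A pattern this broad also matches other 11-digit strings, such as some phone numbers. For a compliance pipeline that errs toward over-redaction this may be acceptable, but it is a trade-off worth stating explicitly.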

What's in the repo right now

The repository is live at github.com/Samueladewole/compliance-automator. Current shape:

git clone https://github.com/Samueladewole/compliance-automator
cd compliance-automator
make install
make run     # returns a structurally valid scaffold evidence pack

Shipped now:

  • README with quickstart, repo structure, and live status table
  • LICENSE (MIT)
  • Python project skeleton (pyproject.toml, Makefile, ruff + mypy + pytest configured)
  • agent/cli.py — runnable CLI returning a valid-shape scaffold evidence pack so the end-to-end path is exercisable from day one
  • docs/architecture.md — system overview, component map, ADR index
  • docs/local-aws-setup.md — Bedrock model access, IAM, region selection, cost expectations
  • terraform/README.md and cdk/README.md — parallel infrastructure-as-code roadmaps
  • Folder structure ready to populate (agent/tools/, agent/hooks/, agent/prompts/, eval/, data/regulations/, data/synthetic/)
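For readers wondering what a "valid-shape scaffold evidence pack" might look like, the sketch below shows one possible shape and a shape check. The key names and the `scaffold_pack` and `is_valid_shape` helpers are assumptions for illustration; the actual schema is whatever agent/cli.py in the repository emits.

```python
import json
from datetime import datetime, timezone

# Hypothetical top-level keys; the real schema lives in the repository.
REQUIRED_KEYS = {"query", "status", "generated_at", "findings", "citations"}


def scaffold_pack(query: str) -> dict:
    """Return an empty but structurally complete evidence pack."""
    return {
        "query": query,
        "status": "scaffold",
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "findings": [],
        "citations": [],
    }


def is_valid_shape(pack: dict) -> bool:
    """Check that required keys exist and list fields are lists."""
    return (
        REQUIRED_KEYS.issubset(pack)
        and isinstance(pack["findings"], list)
        and isinstance(pack["citations"], list)
    )


pack = scaffold_pack("Privileged-access changes, past 90 days")
print(json.dumps(pack, indent=2))
```

Marking the pack with an explicit "scaffold" status keeps placeholder output from being mistaken for real evidence while the tools are still being wired.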

Building toward:

  • Working Strands agent in agent/pipeline.py with the five action tools wired
  • Terraform modules and CDK constructs for the full air-gapped deployment
  • 300-item evaluation golden set with LLM-as-judge harness and signed monthly PDF for the regulator
  • Side-by-side comparison portal web UI (Next.js)
  • Public regulatory corpus: NDPA 2023, CBN CSAT extracts (where publicly available), EU NIS2 Article 21, NIST SP 800-53 subset
  • Synthetic CloudTrail + Security Lake data so end-to-end runs work without prod data
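The synthetic-data item in the list above could be approached along these lines: a seeded generator that emits CloudTrail-shaped IAM events. The field names (eventID, eventTime, eventSource, eventName, awsRegion, userIdentity) follow the CloudTrail record format; the account ID, role name, time window, and the `synthetic_events` function are made up for illustration.

```python
import random
import uuid
from datetime import datetime, timedelta, timezone

IAM_EVENTS = ["PutRolePolicy", "AttachRolePolicy", "CreatePolicyVersion"]


def synthetic_events(n: int, seed: int = 0, days: int = 90) -> list:
    """Generate n reproducible, CloudTrail-shaped IAM change events."""
    rng = random.Random(seed)  # seeded so test runs are repeatable
    end = datetime(2026, 3, 31, tzinfo=timezone.utc)  # arbitrary window end
    events = []
    for _ in range(n):
        when = end - timedelta(seconds=rng.randrange(days * 86400))
        events.append(
            {
                "eventID": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
                "eventTime": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
                "eventSource": "iam.amazonaws.com",
                "eventName": rng.choice(IAM_EVENTS),
                "awsRegion": "us-east-1",
                "userIdentity": {
                    "type": "AssumedRole",
                    # Fake account ID and role name, for synthetic data only.
                    "arn": "arn:aws:sts::000000000000:assumed-role/"
                    f"synthetic-admin/user{rng.randrange(5)}",
                },
            }
        )
    return events


sample = synthetic_events(5, seed=42)
```

Seeding the generator means an end-to-end run over synthetic data produces the same evidence pack every time, which makes regressions visible in a diff.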

Target ship date: end of June 2026. Watch the repo or creativeminds.dev/blog for the milestone announcements.

Why we are building it in public

Three reasons, all of which a CISO will recognise as the right shape of trust signal:

1. The architecture is the trust signal. A CISO does not buy a compliance system based on a vendor's marketing deck. They buy based on reading the architecture, asking whether the security properties hold under their threat model, and watching the deployment behave under real load. An open-source repo is the architecture, fully visible, immediately auditable. Buying decisions accelerate when the code is open.

2. The buyer's data never leaves the buyer's tenancy. The compliance-automator deploys inside the customer's AWS account using the air-gapped Bedrock pattern. There is no cmdev-hosted SaaS to send queries through. The customer's CloudTrail Lake, Security Lake, IAM events, and Knowledge Base of policies stay within their cryptographic boundary. The open-source architecture is what makes this credible — the customer can read every line of what touches their data.

3. Build-in-public compounds. Every commit, every ADR, every eval-result publication is a signal that cmdev is doing real engineering. The repo's commit history is a continuous credibility surface that no marketing campaign can match. For a consulting / project-delivery practice, this kind of asset compounds — by the time a buying conversation reaches a deal review, the buyer has already evaluated us on the work.

The four-week build roadmap

The work is sequenced to ship a working end-to-end agent by end of June 2026:

Week | Milestone | Repo signal
Week 1 (this week) | Repo bootstrap, architecture documented, sample regulatory corpus ingested | Scaffold + first ADRs land
Week 2 | Strands agent + Knowledge Base wired against real Bedrock + Cohere; CloudTrail-query and retrieve-regulation tools shipped | make run returns a real evidence pack against synthetic data
Week 3 | Terraform modules and CDK constructs deployable to a fresh AWS account; air-gapped pattern validated | terraform apply produces a working deployment in a clean test account
Week 4 | 300-item golden set + LLM-as-judge eval harness + drift detection; PDF evidence-pack template | make eval produces the audit-grade quality report

Each weekly milestone is a tagged release in the repo. Each ships with a short blog post in this case-study series — what we built, what surprised us, what we engineered past.
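The Week 4 milestone depends on calibrating the LLM judge against human SME labels. One common way to quantify that agreement is Cohen's kappa, which corrects raw agreement for chance. The sketch below is an assumption about how calibration could be measured, not a description of the harness the repo will ship; the labels are invented.

```python
from collections import Counter


def cohen_kappa(human: list, judge: list) -> float:
    """Chance-corrected agreement between two label sequences."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and equal length")
    n = len(human)
    observed = sum(h == j for h, j in zip(human, judge)) / n
    h_counts, j_counts = Counter(human), Counter(judge)
    labels = h_counts.keys() | j_counts.keys()
    # Agreement expected if both raters labelled independently at their
    # own base rates.
    expected = sum(h_counts[l] * j_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


human = ["pass", "pass", "fail", "fail"]
judge = ["pass", "pass", "fail", "pass"]
kappa = cohen_kappa(human, judge)
# observed = 0.75, expected = 0.5, so kappa = 0.5
```

Raw agreement of 75% looks healthy here, but kappa of 0.5 shows that half of it is what chance alone would produce. That gap is why a chance-corrected metric is a reasonable choice when the golden set is imbalanced.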

What this teaches us about enterprise scaling — so far

Three things have already surfaced in the bootstrap week that warrant flagging for the buying audience:

1. The repo structure matters more than the architecture diagram. A CISO evaluating an open-source compliance tool will spend their first ten minutes in the repository. The first ten minutes need to convey: clear quickstart, honest status table (what is shipped vs. what is not), runnable scaffold so the end-to-end path is exercisable from day one. We rewrote the README three times in week one to land the structure that doesn't waste those ten minutes.

2. Building parallel Terraform and CDK costs more than either alone — but the cost is small and the trust signal is large. Most teams have a strong preference. Shipping both means meeting them where they are. The cost shows up in maintaining two infrastructure expressions of the same system, which we are mitigating by treating Terraform as the canonical source and CDK as the synthesised equivalent (with terraform plan snapshots committed against the CDK synth output as a regression check).

3. The "case study" framing is itself the wrong product framing. The compliance-automator is not a one-off engagement we documented after the fact. It is a reference implementation that we and our customers can fork. The case-study article you are reading is a snapshot of an evolving product, not a retrospective. We are adjusting the cmdev publishing voice to reflect this — the reference architecture series and this case study compose as the front door of a working open-source practice, not as portfolio items.
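The Terraform and CDK regression check in point 2 could take several forms. One minimal sketch compares resource counts by type between a Terraform plan exported as JSON and a synthesised CloudFormation template. The `resource_changes` and `Resources` keys follow those two formats; the `TF_TO_CFN` mapping is a small hand-written assumption, and a real check would need a fuller mapping and would have to tolerate resources that one tool creates implicitly.

```python
from collections import Counter

# Hand-written, partial mapping from Terraform types to CloudFormation types.
TF_TO_CFN = {
    "aws_s3_bucket": "AWS::S3::Bucket",
    "aws_kms_key": "AWS::KMS::Key",
    "aws_vpc_endpoint": "AWS::EC2::VPCEndpoint",
}


def tf_counts(plan: dict) -> Counter:
    """Count resources the plan would create, keyed by CloudFormation type."""
    return Counter(
        TF_TO_CFN.get(rc["type"], rc["type"])
        for rc in plan.get("resource_changes", [])
        if "create" in rc["change"]["actions"]
    )


def cfn_counts(template: dict) -> Counter:
    """Count resources declared in a synthesised CloudFormation template."""
    return Counter(r["Type"] for r in template.get("Resources", {}).values())


def parity_diff(plan: dict, template: dict) -> dict:
    """Return {type: (terraform_count, cdk_count)} where the two disagree."""
    tf, cfn = tf_counts(plan), cfn_counts(template)
    return {t: (tf[t], cfn[t]) for t in tf.keys() | cfn.keys() if tf[t] != cfn[t]}


plan = {
    "resource_changes": [
        {"type": "aws_s3_bucket", "change": {"actions": ["create"]}},
        {"type": "aws_kms_key", "change": {"actions": ["create"]}},
        {"type": "aws_vpc_endpoint", "change": {"actions": ["create"]}},
        {"type": "aws_vpc_endpoint", "change": {"actions": ["create"]}},
    ]
}
template = {
    "Resources": {
        "Logs": {"Type": "AWS::S3::Bucket"},
        "Key": {"Type": "AWS::KMS::Key"},
        "BedrockEndpoint": {"Type": "AWS::EC2::VPCEndpoint"},
    }
}
diff = parity_diff(plan, template)
```

An empty diff does not prove the two stacks are equivalent, since it ignores resource properties, but a non-empty one is a cheap early signal that the two expressions have drifted.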

How to engage

Three concrete moves you can make if this is the shape of work your team needs:

Star the repo. github.com/Samueladewole/compliance-automator — the star count is the public signal of demand and helps the project compound. Watching gives you the milestone-release notifications without inbox noise.

Read the air-gapped Bedrock article and the eval harness article. They are the architectural substrate. If those resonate with your CISO and compliance leadership, the compliance-automator is the natural fit.

Email [email protected] for a deployment consultation. We engage with regulated enterprises in Africa and the EU on a four-phase model (diagnostic → foundation build → co-managed operations → optional MSSP). The compliance-automator is the substrate; the engagement is what makes it production at your scale. Direct, no sales fluff.


Mayowa A. is CTO of CreativeMinds Development. CreativeMinds Development (cmdev) ships production AI for regulated enterprises across Africa and the EU.

Tags: case-study, amazon-bedrock, strands, agentcore, claude, compliance, ndpa, cbn-csat, nis2, open-source, ai-agents, regulated-enterprise
