When Your AI Agent Needs a Security Clearance


I watched this pattern play out twenty years ago when I was a business relationship manager at Apple.

Executives wanted iPhones — wanted them badly enough that CTOs would come meet with me to figure out how to pry their senior leadership off BlackBerry. Two devices was one too many, and the consumer hardware was winning. Security teams scrambled to figure out how to let people use those phones without opening the network to every threat vector they’d spent a decade locking down. Meanwhile, at the other end of the building, developers were quietly running Linux boxes because the approved Windows environment couldn’t do what they needed.

None of that was really about the technology. It was about who was driving adoption — and whether the organization had the maturity to govern what people were already using.

The same thing is happening with agentic AI right now. Except this time, the thing sneaking into your environment isn’t a phone in someone’s pocket. It’s an autonomous agent spawning sub-agents in milliseconds, querying databases you didn’t know it could reach, making decisions faster than any audit trail can capture.

And just like BYOD, the governance gap here is a discipline problem dressed up as a tools problem.

The Sub-Agents You Don’t See

When a product manager can prompt a working prototype into existence in two days instead of six weeks, that feels transformational. When that prototype starts calling internal APIs and querying customer databases without going through procurement or security review, you’ve got shadow IT running at machine speed.

A coalition of Western governments — including the U.S. and Australia — released guidance in May 2026 warning that agentic AI systems are “capable of autonomously creating, or ‘spawning’, sub-agents to accomplish specific sub-tasks” without continuous human intervention.

Agents being able to do that isn’t surprising. What’s surprising is how few organizations know it’s happening until something breaks.

Traditional audit trails were built for predictable human behavior. You log in, you request access, you perform an action, the system records it. Something goes wrong, you trace back through the chain.

But when an agent spawns three sub-agents to handle compliance checks, transaction monitoring, and documentation drafting — all in the same second — what does the audit trail actually capture? Intent? Reasoning? The decision tree that led to those three actions instead of two or four?
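To make that concrete, here's a rough sketch of what a spawn event would have to record before those questions become answerable. The field names are hypothetical, not any vendor's schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SpawnEvent:
    """One audit record per sub-agent spawn; captures more than 'who touched what'."""
    parent_agent_id: str           # which agent made the delegation decision
    child_agent_id: str            # the sub-agent it created
    stated_intent: str             # the task the parent says it delegated
    reasoning_summary: str         # why this delegation, in the agent's own words
    delegation_chain: list[str]    # every ancestor back to the human who started it
    granted_scopes: list[str]      # permissions the child inherits
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# A trail of SpawnEvents lets you reconstruct, after the fact, why three
# sub-agents appeared in the same second instead of two or four.
```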

Government security experts note that increased autonomy creates situations where “agents may initiate secondary tasks, spawn sub-agents, or follow extended delegation chains in ways that are not always visible to operators.” Unexpected behavior in one component cascades through the system. Compromised behavior in one component does the same — only faster, and on purpose.

You can’t audit what you can’t see.

OAuth Wasn’t Built for This

A lot of organizations I’m talking to right now are trying to bolt agentic AI onto existing identity and access management systems. They’re using OAuth. They’re using OIDC. They’re defining scopes and permissions the same way they did for human users.

That approach is breaking down.

OAuth and OIDC were designed for predictable delegation. A human authorizes an app, the app requests specific permissions, the human grants them, the app operates within those bounds.

LLM agents don’t operate within those bounds. They generate query patterns that violate least-privilege assumptions by design. They don’t ask for permission to access a specific table — they ask for permission to “retrieve relevant compliance data,” and then they decide what’s relevant in real time.

Security researchers point out that the existing IAM stack wasn't built for this. Human-centric identity services lack support for ephemeral agent identities, for authorization at the protocol layer, and for end-to-end workflow traceability at scale.

Database security models assume you know what queries will be asked. Agentic AI makes that assumption obsolete. You can’t write a deny-list for queries that don’t exist yet, and you can’t scope permissions for actions the agent will invent tomorrow.
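Here's a minimal sketch of the shift that implies, with hypothetical table and agent names: instead of matching generated SQL against a deny-list written in advance, you inspect each query at the moment of execution and compare what it actually touches against the agent's grant.

```python
import re

AGENT_GRANTS = {
    # Hypothetical: the tables this agent may read, decided at provisioning time.
    "compliance-agent-7": {"kyc_checks", "transaction_flags"},
}

def tables_referenced(sql: str) -> set[str]:
    """Naive placeholder. A real system would parse the SQL properly."""
    return set(re.findall(r"(?:from|join)\s+([a-z_]+)", sql, flags=re.IGNORECASE))

def authorize(agent_id: str, sql: str) -> bool:
    """Decide per query, at execution time, not from a deny-list written in advance."""
    return tables_referenced(sql) <= AGENT_GRANTS.get(agent_id, set())

# The agent invented this query at runtime; nobody wrote a rule for it in advance.
query = "SELECT * FROM kyc_checks JOIN customer_pii ON kyc_checks.id = customer_pii.id"
assert not authorize("compliance-agent-7", query)  # customer_pii isn't in the grant
```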

The Governance Gap

When I work with clients in regulated industries where governance failures have immediate regulatory consequences, I see two adoption patterns running in parallel.

From the top down, executives see agentic AI as a way to eliminate busy work. They’re looking at compliance workflows that tie up entire teams and asking whether an agent could handle onboarding checks, transaction monitoring, escalation routing, and regulatory documentation faster and more accurately.

From the middle of the organization, starved development teams suddenly have access to tools that let them prototype in days instead of weeks. A product manager who’s been operating under budget constraints for years can now stand up a working model and iterate on ideas that would’ve been too expensive to test before.

Both patterns are happening faster than governance processes can absorb them.

Yale research found that only 21% of companies surveyed had a mature agentic AI governance model — even though 74% planned to deploy agentic AI moderately or more extensively within two years. The gap between deployment speed and governance maturity isn’t closing. It’s widening.

The capability is there. The discipline isn’t. Most organizations are treating this as a tools problem when the actual work is organizational.

You can sandbox your data. You can negotiate enterprise agreements with vendors. You can define scopes and permissions down to the field level. But if your people don’t understand the risks — if they’re operating from a scarcity mentality where shipping the deliverable is the only thing that matters — negative results are inevitable.

What Runtime Governance Looks Like in Practice

The organizations getting this right aren’t the ones with the most sophisticated tools. They’re the ones engaging their employees in the conversation.

They’re being open about what they hope to achieve and what’s working. They’re communicating their experiments and the results along the way. They’re trusting their people to use these tools responsibly — and then verifying that trust holds.

One pattern I'm seeing right now: companies mining their existing databases of public information (press releases, blog posts, social media) to build chatbots that answer queries using only that approved, already-published content. They're shrinking the chance the agent will hallucinate by constraining what it can access in the first place.
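Reduced to code, the pattern is simple. This is a toy sketch with made-up content and a placeholder keyword search standing in for a real retrieval index:

```python
# Toy corpus: doc id -> text that has already been approved and published.
APPROVED_CORPUS = {
    "pr-001": "Acme launched its payments API with support for recurring billing.",
    "blog-042": "Our uptime target for the payments API is 99.95 percent.",
}

def retrieve(question: str, k: int = 2) -> list[str]:
    """Placeholder keyword overlap; a real system would use a proper search index."""
    terms = set(question.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc)
              for doc in APPROVED_CORPUS.values()]
    return [doc for score, doc in sorted(scored, reverse=True) if score > 0][:k]

def answer(question: str, llm) -> str:
    """The model sees only approved passages and is told to stay inside them."""
    passages = retrieve(question)
    if not passages:
        return "I don't have approved content that answers that."
    prompt = ("Answer ONLY from the passages below. If they don't contain the "
              "answer, say you don't know.\n\n"
              + "\n---\n".join(passages) + f"\n\nQuestion: {question}")
    return llm(prompt)  # hypothetical LLM callable
```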

Even that takes discipline.

Security experts recommend a fundamental mindset shift — treat your LLM like untrusted user input. You can’t program all of its responses. You can’t perfectly predict what it’ll do next. Your security has to be built on the assumption that the model will eventually produce unexpected and potentially harmful outputs.
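In practice, "treat it like untrusted user input" means the same defensive parsing you'd apply to a web form. A minimal sketch, with a hypothetical allow-list of tools:

```python
import json

ALLOWED_TOOLS = {"search_docs", "draft_email"}  # hypothetical, deliberately short

def validate_action(raw_model_output: str) -> dict:
    """Parse the model's output the way you'd parse an untrusted form submission:
    defensively, against an allow-list, and never via eval/exec."""
    try:
        action = json.loads(raw_model_output)
    except json.JSONDecodeError:
        raise ValueError("Model output was not valid JSON; refusing to act.")
    if action.get("tool") not in ALLOWED_TOOLS:
        raise ValueError(f"Tool {action.get('tool')!r} is not on the allow-list.")
    if not isinstance(action.get("args"), dict):
        raise ValueError("Tool arguments must be a JSON object.")
    return action

# The model will eventually emit something like this. The gate has to hold.
try:
    validate_action('{"tool": "delete_table", "args": {"name": "customers"}}')
except ValueError as err:
    print(err)  # Tool 'delete_table' is not on the allow-list.
```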

Governance has to operate at runtime, not as a periodic review.

Agents make thousands of access decisions per minute. Quarterly audits can’t keep pace with that volume. You need enforcement at the moment of action, not at deployment or in hindsight.
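What that can look like in code: every tool call routed through a decision point that evaluates and logs it inline, at machine speed. A sketch with a made-up policy table:

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)

POLICY = {"read_ledger": "allow", "wire_funds": "deny"}  # hypothetical policy table

def enforced(action_name: str):
    """Wrap a tool so every single invocation is decided and logged at call time."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            decision = POLICY.get(action_name, "deny")  # unknown actions default-deny
            logging.info("action=%s decision=%s args=%r", action_name, decision, args)
            if decision != "allow":
                raise PermissionError(f"{action_name} denied by runtime policy")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@enforced("wire_funds")
def wire_funds(account: str, amount_cents: int) -> None:
    ...  # never reached unless the policy table says "allow"
```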

The organizations I’m watching succeed are doing a few things consistently. They’re asking agents to review their plans up front — using features in Claude and other agentic tools that let you stop, ask questions, work out a whole plan, and require the tool to request permission along the way. They’re role-playing scenarios before deployment. And they’re reinvesting some of the gains from AI adoption back into governance and testing.
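Stripped down to its skeleton, that plan-review loop looks something like this. It's a generic sketch, not Claude's actual API; the agent object and its methods are stand-ins:

```python
def run_with_plan_review(agent, task: str) -> None:
    """Surface the whole plan first, then require permission along the way."""
    plan = agent.propose_plan(task)  # hypothetical: returns an ordered list of steps
    print("Proposed plan:")
    for i, step in enumerate(plan, 1):
        print(f"  {i}. {step.description}")
    if input("Approve this plan? [y/N] ").strip().lower() != "y":
        return  # nothing executes until a human has seen the whole plan
    for step in plan:
        if step.is_destructive:  # e.g. writes, deletes, calls to external systems
            if input(f"Allow '{step.description}'? [y/N] ").strip().lower() != "y":
                break
        agent.execute(step)
```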

I used to think role-playing was a waste of time. Then I worked at Apple, where we used it constantly for exactly this reason. Walking through each step — asking someone to be deliberately difficult so you realize some things aren’t as easy as you envision — surfaces risks you wouldn’t catch otherwise.

If AI is letting you move faster, some of that speed needs to go back into making sure the product you ship is right.

The Human Cost

There’s another piece to this that doesn’t get enough attention — what happens to the workers themselves.

When workers stop checking AI outputs, whether because they're overwhelmed or because the tool sounds confident, they're risking errors, sure. But the bigger loss is the judgment they stop exercising.

I grew up working for my dad. He was a service technician for automotive alignment equipment. If you don't know what that is, it's the gear that tells you whether a car's wheels, or the frame itself, are bent or damaged. Left uncorrected, misalignment makes the car drift gradually to one side while you're trying to drive it.

You can have both hands on the wheel — doing everything you think you’re supposed to do. But if the frame is bent, you’re drifting toward the guardrail without realizing it.

That’s what’s at stake when people stop engaging with AI outputs. They lose the uniqueness of their contribution — not just the tasks they perform, but the pattern recognition that made their work valuable in the first place.

If you’re running a team that’s been operating threadbare — if your people are stuck in scarcity mode, just trying to clear the deliverable — bad things are going to happen. They might already be happening.

So as we adopt AI — letting it handle the mundane tasks, compress the research assignments, shrink the time it takes to get to a draft — we have to invest in the training and professional development of the people using it. They can’t just accept whatever the system produces as correct.

They have to expect errors.

Most AI tools have an indicator right in the interface: “this tool is experimental and may produce errors.” We have to take that seriously. We can’t wave it off because the output looks confident.

The Cautionary Tale We’re About to Write

I don’t have a ton of optimism that AI is going to be the tide that lifts all boats. At least not without some painful lessons first.

We're going to see at least one major calamity. Some agent is going to be wired into a system it shouldn't control, and customers are going to feel the impact. That may just be the nature of how humans learn: we may need to witness a bad result before we're motivated to build better guardrails.

How many cars went off the sides of roads before we started building actual guardrails? How many safety features are standard in cars now that weren’t included until we observed the negative impact of not having them?

We’re probably headed for the same kind of frustrating growth period with AI.

Banks in Asia are already increasing scrutiny of AI tools, driven by concerns that advanced models could let attackers identify and exploit vulnerabilities faster than defense teams can patch them. Singapore’s financial regulator is urging banks to plug the holes. South Korea’s government agencies have met to review how to respond to the risks.

They’re recognizing something fundamental — the attack surface isn’t just larger when you deploy agentic AI. It’s different in kind, because the vulnerability scanner is now also an autonomous decision-maker.

Your company doesn’t have to be the one that writes the cautionary tale.

Slow down enough to engage your teams in the conversation about what they hope to achieve and what’s working. Ask your agents to review their plans with you up front. Role-play scenarios before you deploy. Reinvest some of the speed you gain into the governance that keeps the speed from breaking things.

Treat governance as something that happens at runtime — not as an afterthought when something breaks.

The organizations that navigate this well won’t be the ones with the biggest tools budget. They’ll be the ones with the discipline to ask what could go wrong before they find out the hard way.


Need a steadier hand on AI adoption inside your organization? This is exactly the kind of work User Experience Consulting at Johns & Taylor was built for — Fractional Experience Strategy that brings governance, agile practice, and customer-experience discipline into transitions like this one. We step in long enough to establish habits that outlast the engagement and set up your permanent team for long-term success.