Four Fires, One Sprinkler System — A (Fictional) Government IT Director Discovers All Her AI Nightmares Share an Address

Marguerite "Mags" Tremblay runs IT for a state Department of Human Services. She has held the job for fifteen years, which she says qualifies as "a long, slow hostage situation with decent benefits." Before state government, she administered hospital systems. Now she oversees an agency that has deployed AI agents for benefits eligibility screening, citizen services chatbots, and internal workflow automation. She wrote her agency's AI governance plan after the governor issued an executive order modeled on federal guidance. She is now, by her own account, living with the consequences.

A note on Mags: she is not a real person. She is a composite character built from documented government IT leadership concerns, published federal and state AI governance challenges, and the anxiety patterns that surface when agencies move from AI planning to AI deployment. The concerns she raises, the advisories she references, and the institutional pressures she describes are all real. She is not. Her opinions about the vending machine on the fourth floor are entirely fictional.

You mentioned before we started recording that you've been "reading too much." What does that mean?

Mags: It means I have four CIS advisories, two GAO reports, and an OWASP document I printed out (printed out, like it's 2004) all sitting on my desk next to a governor's office memo. I wrote a 38-page AI governance plan last fall. Thirty-eight pages. And now I have agents actually running in production, and every morning I wake up thinking about a different way they could go wrong.

Which way this morning?

Mags: Hallucination. Because of that South Africa situation. A government published an entire national AI policy document, and at least six sources in the bibliography were completely fabricated. Fictitious journals. Then Deloitte got caught with fake references in a report for the Australian government.¹ These aren't hypotheticals anymore. These are governments that had to publicly withdraw documents because their AI invented citations.

I have an agent processing benefits eligibility. If that agent hallucinates a citizenship status or invents a policy reference, that's a person who doesn't get their medication. A real human being sitting in a waiting room wondering why they got denied.

That sounds like a model quality problem, though. Is your identity vendor going to fix hallucination?

Mags: No. And I'm not confused about that. The hallucination itself, the RAG tuning, the confidence scoring, the output validation, that's my AI vendor's problem and my model engineering team's problem. Full stop.

But here's where I lose sleep. Say my benefits agent hallucinates and generates a bogus data modification request based on a made-up output. What limits the scope of that action? OWASP actually names this. They call it Excessive Agency, and hallucination is listed as one of the causes of excessive agency damage, right alongside prompt injection and malicious plugins.² The fix they recommend isn't better hallucination detection. It's making sure the agent's identity permissions physically prevent it from acting beyond its authorized scope.

A benefits-processing agent that can only write to the specific dataset it's authorized to touch can't corrupt the entire eligibility database even if it hallucinates wildly. That's containment. The model might be wrong, but the damage stays in a box.
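The containment idea Mags describes can be sketched in a few lines. This is a minimal illustration, not any real identity product's API: `AgentIdentity`, `authorize`, `apply_request`, and the dataset names are all hypothetical, and the sketch assumes a deny-by-default check that runs before every write.

```python
from dataclasses import dataclass


class ScopeError(PermissionError):
    """Raised when an agent attempts an action outside its granted scope."""


@dataclass(frozen=True)
class AgentIdentity:
    # Hypothetical structure; not taken from any real IAM product.
    agent_id: str
    allowed: frozenset  # (action, dataset) pairs this agent may perform


def authorize(identity: AgentIdentity, action: str, dataset: str) -> None:
    """Deny by default: only explicitly granted (action, dataset) pairs pass."""
    if (action, dataset) not in identity.allowed:
        raise ScopeError(f"{identity.agent_id} may not {action} {dataset}")


def apply_request(identity, action, dataset, store, key, value):
    """Check the identity's scope before touching the data store."""
    authorize(identity, action, dataset)
    store.setdefault(dataset, {})[key] = value


# Hypothetical agent scoped to a single dataset.
benefits_agent = AgentIdentity(
    agent_id="benefits-screening-agent",
    allowed=frozenset({("write", "eligibility_screening_queue")}),
)

store = {"eligibility_master": {"case-1": "approved"}}

# An in-scope write succeeds.
apply_request(benefits_agent, "write", "eligibility_screening_queue",
              store, "case-1", "needs_review")

# A hallucinated request against another dataset is refused before any data changes.
try:
    apply_request(benefits_agent, "write", "eligibility_master",
                  store, "case-1", "denied")
    blocked = False
except ScopeError:
    blocked = True
```

The check knows nothing about why the request was generated. A hallucinated request and an injected one are refused the same way, which is the point of treating identity scope as the containment layer.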

You jumped from hallucination to identity permissions pretty fast. Most people don't make that turn.

Mags: (laughs) That's because I spent three months writing a governance plan that forced me to think about every risk category as one giant interconnected disaster. The governor's executive order was modeled on OMB guidance, and every agency had to address AI risk comprehensively.³ So I wrote sections on bias, transparency, security, accountability. And now my brain treats all of it as one undifferentiated threat blob.

Which is both a strength and a problem, as you're about to find out.

Let's find out. Your agency processes benefits, so fairness has to be near the top of the list.

Mags: Top three, easily. And look, algorithmic bias lives in the model and the training data. I'm not going to sit here and pretend that's an identity governance problem. If my model was trained on incomplete demographic data and it systematically miscategorizes applicants, that's a fairness audit problem. That's model governance.

But here's what nobody in the AI fairness conversation seems to talk about. When a bias incident surfaces, and these incidents surface late, sometimes years after deployment,⁴ the first question from the inspector general is never "why is your model biased?" The first question is: "Show me exactly what this agent did, which citizen records it touched, and who authorized it to make these decisions."

That's an accountability question. That's an audit trail question. The bias belongs to the model team. The investigation lands on my desk. And if I can't produce a clean identity-level record of every action that agent took, I'm the one explaining to the legislature why we can't reconstruct what happened.

What about observability? I hear that word constantly from government IT leaders.

Mags: And it means two completely different things depending on who's saying it. My data science team means model observability: replaying the reasoning path, understanding why the model produced a specific output. That's MLOps tooling. Arize, WhyLabs, that world.

When I say observability, I mean something much more boring and much more important. Can I tell a state auditor exactly what this agent accessed, when it accessed it, under whose authority, and with what scope?⁵ Every agent API call tied to a specific identity, a delegating user, a granted scope, and a timestamp.⁶

My data science team wants to understand why the model thought something. I need to prove what it did and who let it. Those are different questions answered by different systems, and conflating them has cost me about six weeks of confused vendor conversations this year alone.
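The "boring" kind of observability can be pictured as one record per agent API call. The sketch below is an assumption-laden illustration: the `AuditRecord` fields follow the four items Mags lists (identity, delegating user, scope, timestamp) plus the action and resource, while the function names, identifiers, and in-memory list standing in for an append-only log are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    # Hypothetical record layout, not a standard schema.
    agent_id: str          # which agent identity made the call
    delegating_user: str   # under whose authority it acted
    scope: str             # the scope granted for this call
    action: str            # what it did
    resource: str          # what it touched
    timestamp: str         # when, as ISO 8601 in UTC


def record_call(log, agent_id, delegating_user, scope, action, resource, now=None):
    """Append one audit record per agent API call, serialized as JSON."""
    now = now or datetime.now(timezone.utc)
    rec = AuditRecord(agent_id, delegating_user, scope, action, resource,
                      now.isoformat())
    log.append(json.dumps(asdict(rec), sort_keys=True))
    return rec


def actions_by_agent(log, agent_id):
    """Answer the auditor's question: everything this agent identity did."""
    return [r for r in map(json.loads, log) if r["agent_id"] == agent_id]


audit_log = []  # stand-in for an append-only store
record_call(
    audit_log,
    agent_id="benefits-screening-agent",
    delegating_user="caseworker-017",
    scope="read:eligibility_screening_queue",
    action="read",
    resource="case-1",
    now=datetime(2026, 1, 5, 14, 30, tzinfo=timezone.utc),
)
history = actions_by_agent(audit_log, "benefits-screening-agent")
```

Nothing here explains why the model produced an output. It only records what was done and under whose authority, which is the question the auditor asks.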

Your security team flagged the April CIS advisory on prompt injection. How worried are you?

Mags: Very worried, then less worried, then worried in a different way.

Very worried because CIS specifically warned that prompt injection attacks are a serious and growing threat: hidden instructions in documents, emails, websites that AI tools ingest.⁷ My agents process citizen-submitted documents. The attack surface is basically the entire intake pipeline.

Less worried because I've stopped treating prompt injection as a standalone catastrophe. The CIS advisory itself points to the same structural backstop: least-privilege identity controls as the containment layer when model-layer defenses fail.⁷ If an agent gets injected and follows a malicious instruction, the blast radius of that instruction is bounded by the agent's identity permissions. An agent scoped to read-only access on a specific dataset cannot exfiltrate a directory even if injected, because its identity physically prevents the action.

Worried in a different way because I realized I'd been having the same epiphany four times without noticing.

Say more about that.

Mags: (long pause)

I've been treating hallucination, bias, observability, and prompt injection as four separate fires. Four separate budget line items, four separate vendor conversations, four separate sections of my governance plan. But OWASP is explicit: hallucination, prompt injection, malicious plugins, and poorly performing models are all different causes of the same class of harm, which is an agent that does more than it should.² And the structural mitigation for that entire class of harm is least-privilege identity governance. One control that limits the blast radius regardless of which failure mode triggers it.

I spent three months being so comprehensive that I missed the convergence. The compliance process actually made me worse at understanding my own risk posture. I was so busy writing separate sections that I couldn't see they all shared a foundation.

That seems like it should have been obvious from the start.

Mags: Everything is obvious after someone else says it out loud. That's what "obvious" means. Try writing a 38-page governance plan under a deadline from the governor's office and see how much structural clarity you achieve. I was writing to survive an audit, not to understand my own architecture.

What would you tell a vendor walking into your office right now?

Mags: Don't tell me my four concerns are wrong. Don't dismiss any of them. Every single one is real and I will remember if you wave them away.

But help me organize them into two clean buckets: model-layer risks where identity is a downstream containment control, and identity-layer risks where identity is the primary control. Then show me how the blast radius argument makes the first bucket less catastrophic.

That's all I want. Someone who takes my concerns seriously and then helps me see the structure underneath them. Instead of whatever that last vendor did, which was basically a 45-minute demo of features I didn't ask about while I sat there thinking about my governance plan.

Last question. The vending machine on the fourth floor.

Mags: Fictional. Like me. But if it were real, it would be out of Diet Coke, because every vending machine in every government building in America is out of Diet Coke. That's not a hypothesis. That's infrastructure.

Marguerite "Mags" Tremblay is a composite character. The concerns, advisories, governance frameworks, and institutional pressures described in this interview are documented and sourced. The person is not real. The Diet Coke situation is universal.

Footnotes

¹ Rest of World, "Fact-check fail: When AI hallucinations derailing governments," May 2026. https://restofworld.org/2026/government-ai-hallucinations-south-africa-deloitte/
² OWASP Gen AI Security Project, "LLM08: Excessive Agency," official LLM Top 10. https://genai.owasp.org/llmrisk2023-24/llm08-excessive-agency/
³ Federation of American Scientists, "Who Governs Government AI?" March 2026, citing OMB M-25-21. https://fas.org/publication/who-governs-government-ai/
⁴ arXiv, "Fairness in AI and Its Long-Term Implications on Society," 2023. https://arxiv.org/pdf/2304.09826
⁵ Ian Loe, "Your AI Agent Needs an Audit Trail, Not Just a Guardrail," Medium, March 2026. https://ianloe.medium.com/your-ai-agent-needs-an-audit-trail-not-just-a-guardrail-6a41de67ae75
⁶ Tyk, "AI agent API governance: Auth, audit trails and zero trust," May 2026. https://tyk.io/learning-center/ai-agent-api-governance-auth-audit-trails-and-zero-trust/
⁷ Center for Internet Security, "New CIS Report Warns Prompt Injection Attacks Pose Growing Risk to Generative AI," April 2026. https://www.cisecurity.org/about-us/media/press-release/new-cis-report-warns-prompt-injection-attacks-pose-growing-risk-to-generative-ai
