The Intent Gap

Your AI-generated code is degrading, and the degradation isn't a tooling problem. It's a translation problem, and every step of the chain is lossy.

I've fought this degradation. I've swapped models. I've rerun implementation prompts. None of it helps, because the damage happens before the first line of code gets generated, in a chain of translations that no refactoring pass can reverse. This is a different problem from the one I wrote about in Risk Evaluation in the Age of AI-Aided Development, which is about deciding when AI acceleration is worth the technical debt. The Intent Gap is upstream of that decision.

I call the thing at the center of this the Intent Gap: the distance between what you meant and what the AI produced. The Gap is where everything fails. And once you see it, you can't unsee it.

The Lossy Translation Chain

Think about what actually happens when human intent turns into AI-generated code. Someone has an idea. That idea gets compressed into a JIRA ticket, which is rarely more than a paragraph. The ticket gets dropped into a prompt, or fed to an agent, or expanded into a PRD that nobody reviews carefully because the AI can "just figure it out." The agent produces code. The code gets merged, maybe without anyone reading it closely, because reviewing AI output at volume is tedious and the reviewer has their own work to do.

Every step in that chain is lossy. Every step strips context that the next step has to guess at.

It's a screenshot of a screenshot. You can still see the shape of the original, but the resolution keeps dropping. And the artifact everyone references later, the JIRA ticket, is the lowest-resolution version of the whole chain. It becomes the de facto history of the decision, because the git log only shows the output, and the original reasoning lives in someone's head or a Slack thread that got archived.

The code looks fine. The build is green. Tests pass. And the actual intent, the reason any of it exists, is nowhere in the system. It's been discarded at every hop.

Why the Usual Fixes Don't Work

The instinct, when you notice the degradation, is to refactor. Run another AI pass and clean it up. Add another evaluation step. Feed the output back through a different model. This is the same move, one more time.

Refactoring AI-generated code with AI is another lossy pass. You're not recovering the original intent. You're running the degraded artifact through the translation chain again, hoping the noise averages out. It doesn't. It compounds.
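A toy model shows why I say compounds rather than averages. This is a minimal sketch with made-up numbers, not a measurement: it assumes each pass keeps a fixed fraction of the intent it received, and the name surviving_intent and the 0.8 figure are purely illustrative.

```python
from math import prod


def surviving_intent(retention_per_pass):
    """Fraction of the original intent left after every pass has run.

    Each entry is the (assumed, illustrative) share of intent that one
    pass preserves. Losses multiply; a later pass cannot restore what an
    earlier pass already dropped.
    """
    return prod(retention_per_pass)


# Hypothetical figure: every pass keeps 80% of what it was handed.
one_pass = surviving_intent([0.8])
three_passes = surviving_intent([0.8, 0.8, 0.8])

print(f"after one pass:     {one_pass:.2f}")
print(f"after three passes: {three_passes:.2f}")
```

Under that assumption, adding passes can only shrink the product. A pass would need a retention above 1.0 to recover anything, which means it would need access to intent that is no longer in the artifact it was given.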

I've seen this play out in practice. Teams generate code, notice it's drifting from what they actually wanted, and respond by generating more code on top. The first pass was a rough translation of a vague ticket. The second pass is a refactor of the first translation, working from the code itself as the only available context. The third pass is a cleanup of the refactor. By the time anyone looks at the result, the connection to the original intent is a rumor.

Martin Fowler recently endorsed Margaret-Anne Storey's Triple Debt Model, which proposes three categories of debt in AI-augmented development: Technical Debt in the code, Cognitive Debt in the people, and Intent Debt in the artifacts. Storey's paper makes the paradox explicit: "Generative AI may reduce technical debt while simultaneously accelerating cognitive and intent debt."

That's the dynamic I'm describing. The code gets cleaner. The intent gets harder to recover. The refactor pass reduces Technical Debt by a measurable amount and adds Intent Debt in the same motion. You're moving debt from a column your tools can measure into a column they can't.

Gap and Debt Are Not the Same Thing

Storey's framework is diagnostic. It names the condition. Intent Debt is what accumulates when intent goes unmanaged across teams and time. It's the balance sheet version.

The Intent Gap is the unit that accumulates into Debt. It's individual and it's happening in real time, every time someone hands a vague ticket to an agent and accepts whatever comes back.

This distinction matters because Gap is actionable and Debt is retrospective. You can close a gap in the moment, by forcing the intent to be explicit before the generation starts. You can only diagnose debt after the fact, by auditing what's already in your codebase and trying to reconstruct why it's there. Gap is where the leverage is.

It also matters because the mechanism that creates the Gap has a name. Shaw and Nave's research on Cognitive Surrender found that AI tools "inflate confidence even when AI is wrong." The person with the original intent stops verifying because the output feels authoritative. They become an Armchair Architect, approving artifacts they haven't actually checked against what they meant. The Gap widens because the human on the other side quietly stopped holding the line.

The Old Practices Are Coming Back

Marshall McLuhan had a concept called retrieval: new media don't just create new practices, they make old practices viable again in new forms. I think that's exactly what's happening with the software development lifecycle right now.

Spec-driven development. Design by contract. Formal verification. All of these were considered too expensive for most teams, because the cost of writing and maintaining the spec exceeded the cost of just writing the code. When the cost of code generation drops to near zero, that equation inverts. The spec becomes the expensive, valuable artifact. The code becomes the cheap build output. I've argued elsewhere that teams need a clearer map of where AI can actually carry weight versus where it can't; The AI Adoption Ladder was my first pass at that map. The Intent Gap is what determines how high up the ladder you can safely climb.

This isn't speculation. Martin Fowler and Thoughtworks hosted a Future of Software Development retreat in February 2026 where the Triple Debt Model got formalized. Practitioners there also named something they're calling the Middle Loop: a new category of supervisory engineering work sitting between the inner loop of writing code and the outer loop of deploying it. That's exactly the space where Intent Gap management has to live. The people closest to this shift are all circling the same structural problem.

The old practices aren't coming back because they're traditional. They're coming back because the economics that killed them have flipped.

Solving the Gap Requires a Different Foundation

Here's the part I'm not going to resolve in this post, because resolving it requires a fundamentally different relationship between specs and code than most teams currently have.

If you want to close the Intent Gap, the spec has to become the source of truth and the code has to become a build artifact. Not the other way around. That means the intent gets captured in a machine-readable form, the agent generates code from that spec, and the verification layer holds deterministic gates the agent can't weaken, reinterpret, or route around.

That's a big swing. It changes how PRDs work, how code review works, how git history works, and how accountability for intent distributes across a team. I'm building toward that shape, and I'll write about the mechanics of it when I have more to show.
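Until then, here is a minimal sketch of the shape, not the design. It assumes intent can be written as a frozen spec with executable gates, and that generated code is only accepted when every gate passes. Spec, Gate, verify, and the discount rule are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A candidate is whatever the agent generated; here, a pricing function
# that takes an order total in cents and returns the amount to charge.
Candidate = Callable[[int], int]


@dataclass(frozen=True)
class Gate:
    """One deterministic check derived from the intent."""
    name: str
    check: Callable[[Candidate], bool]


@dataclass(frozen=True)
class Spec:
    """The source of truth: the intent in words plus the gates that encode it."""
    intent: str
    gates: Tuple[Gate, ...]


def verify(spec: Spec, candidate: Candidate) -> List[str]:
    """Return the names of the gates the candidate fails (empty means accepted)."""
    return [gate.name for gate in spec.gates if not gate.check(candidate)]


# Hypothetical intent, captured before any code is generated.
DISCOUNT_SPEC = Spec(
    intent="Orders of 100.00 or more get 10% off; smaller orders are unchanged.",
    gates=(
        Gate("discount applies at threshold", lambda f: f(10_000) == 9_000),
        Gate("no discount below threshold", lambda f: f(9_900) == 9_900),
    ),
)


def faithful(total_cents: int) -> int:
    """Generated code that matches the intent."""
    return total_cents * 90 // 100 if total_cents >= 10_000 else total_cents


def drifted(total_cents: int) -> int:
    """Generated code that looks clean but discounts every order."""
    return total_cents * 90 // 100


print(verify(DISCOUNT_SPEC, faithful))
print(verify(DISCOUNT_SPEC, drifted))
```

The frozen dataclasses only gesture at "can't weaken." In a real setup the spec and its gates would have to live somewhere the agent has no write access to, which is an assumption this sketch doesn't enforce.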

For now, the useful thing is to see the Gap clearly, because seeing it changes what you do next.

What to Do Tomorrow

Pick one AI-generated pull request that shipped in the last week. Open the JIRA ticket that kicked it off. Read the original ask. Then read the merged code.

Write down, in one sentence, the difference between what was asked for and what got shipped. Not the bugs. Not the style issues. The semantic difference between the intent and the artifact.

That's your Intent Gap for that change. It's probably larger than you expected, and it's sitting in your main branch right now, quietly becoming the system of record for a decision nobody is checking.

You can't refactor your way out of it. You can only close it at the source.

