Working in the Mud - The Mental Model That Keeps Engineering Teams Moving
Every engineering blog paints a picture of clean microservices, continuous deployment, and comprehensive observability. I've been in this industry for over a decade, and I've never experienced this ideal state across the board. I've seen glimmers. Teams that nail one dimension. But never everything at once.
That gap between the ideal and reality is what I call working in the mud.
The Concept
Working in the mud means knowing what the clean state looks like while accepting that you're not there. The key word is accepting. Not resigning yourself to it. Not pretending it's fine. Accepting the current reality so you can actually move forward.
Teams get stuck trying to do things the "right way." They hit analysis paralysis. Decision deadlock. They refuse to make necessary compromises because in an ideal world, those compromises shouldn't exist. But we don't live in an ideal world. We live in the mud.
You can either acknowledge where you are and make progress, or you can insist on perfection and stay stuck. I choose progress.
What This Is Not
I want to be clear about something. Working in the mud is not broken window syndrome.
Broken window syndrome is when you accept that things are bad and stop caring. A test fails intermittently and nobody investigates. A dashboard has outdated metrics that everyone ignores. Technical debt accumulates because "that's just how it is here."
Working in the mud is the opposite. You recognize your shortcomings honestly. You make pragmatic decisions in the short term. But you never stop working toward the clean state. The mud is temporary positioning, not permanent residence.
The distinction matters because it determines whether you're making strategic compromises or just giving up.
Example: Deployment Freezes
Charity Majors wrote a famous article arguing against Friday deployment freezes. In an ideal world, she's right. If you have robust CI/CD, comprehensive tests, and observability that catches issues in canary deployments before customers notice, there's no reason to avoid Friday deploys. Your safety nets handle the risk.
But most teams don't have those safety nets fully built. Google's DORA research shows the gap clearly: elite teams recover from failed deployments in less than one hour, while low performers take anywhere from one month to six months. That difference comes down to the investment in automation, testing, and observability that most teams haven't yet made.
If your testing is spotty, your rollback process is manual, and your monitoring has gaps, deploying on a Friday afternoon before a long weekend is irresponsible. You're creating risk that you can't manage.
Is a deployment freeze a crutch? Absolutely. It's compensating for deficiencies elsewhere in the system. The clean solution is to fix those deficiencies so you don't need the freeze.
But until you do that work, the freeze is the pragmatic call. It's working in the mud. You're not pretending your deployment process is something it isn't. You're acknowledging reality and making a decision that accounts for your actual capabilities.
The danger comes when the freeze becomes permanent, when nobody's working on the underlying issues because "we just don't deploy on Fridays." That's broken windows. Working in the mud means you keep the freeze while actively investing in the testing, observability, and deployment automation that will eventually make it unnecessary. Your deployment strategy is ultimately constrained by foundational architectural decisions you've made, which is why that investment matters.
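One way to keep a freeze from quietly becoming permanent is to tie it to the gaps it compensates for. Here is a minimal sketch, assuming a hypothetical SafetyNets self-assessment and a freeze that starts at noon on Friday and runs through the weekend. The names, fields, and cutoff hour are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SafetyNets:
    """Hypothetical self-assessment of the team's deployment capabilities."""
    automated_rollback: bool
    reliable_tests: bool
    canary_monitoring: bool

    def all_in_place(self) -> bool:
        return (
            self.automated_rollback
            and self.reliable_tests
            and self.canary_monitoring
        )


def deploy_allowed(now: datetime, nets: SafetyNets, freeze_from_hour: int = 12) -> bool:
    """Block deploys during the freeze window, but only while safety nets are missing."""
    if nets.all_in_place():
        # The freeze has done its job and retires itself.
        return True
    weekday = now.weekday()  # Monday is 0, Friday is 4, Sunday is 6
    friday_afternoon = weekday == 4 and now.hour >= freeze_from_hour
    weekend = weekday >= 5
    return not (friday_afternoon or weekend)


# Example: manual rollback and monitoring gaps, Friday at 15:00.
muddy = SafetyNets(automated_rollback=False, reliable_tests=True, canary_monitoring=False)
print(deploy_allowed(datetime(2024, 1, 5, 15), muddy))
```

The point of the design is that the freeze is expressed as a consequence of missing capabilities. Once every flag flips to true, the rule stops applying without anyone having to argue for its removal.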
Example: Story Points and Sprint Ceremonies
Agile ceremonies are overhead. Time spent planning and estimating is time not spent building. In a perfectly aligned team with shared understanding of platform, product, and users, much of this ceremony becomes unnecessary.
I've never worked on that team.
Real teams have gaps: new engineers unfamiliar with the codebase, product managers still learning the customer, undocumented architectural decisions. Story pointing forces communication about these gaps. When you estimate a ticket at 8 points and I think it's a 2, we discover fundamentally different assumptions before they become problems.
Is this overhead? Yes. But wishing your team were mature doesn't make it so. Research compiled by CA Technologies found that teams holding regular sprint retrospectives demonstrate 24% more responsiveness and 42% higher quality than teams that skip them.
The mud here is recognizing that your team needs these forcing functions right now. The path out is building shared understanding over time so ceremonies become lighter. Skipping them prematurely just creates chaos.
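The 8-versus-2 disagreement described above can be made mechanical. The sketch below flags a ticket for discussion when the highest estimate is more than a chosen multiple of the lowest. The function name and the 2x threshold are assumptions for illustration. Pick whatever spread your team considers a real disagreement.

```python
def needs_discussion(estimates: dict[str, int], max_ratio: float = 2.0) -> bool:
    """Return True when estimates diverge enough to suggest different assumptions.

    `estimates` maps an engineer's name to their story point estimate.
    """
    if len(estimates) < 2:
        return False  # nothing to compare against
    low = min(estimates.values())
    high = max(estimates.values())
    if low <= 0:
        raise ValueError("story point estimates must be positive")
    return high / low > max_ratio


# Example: one engineer says 8 points, another says 2.
print(needs_discussion({"you": 8, "me": 2}))
```

A check like this doesn't replace the conversation. It only makes sure the conversation happens on the tickets where it's most likely to uncover a hidden assumption.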
Example: Management Overhead for Junior Teams
Some teams need significantly more management time than others. A team of senior engineers with a decade of context can operate with minimal oversight. A team with skill gaps, varying experience levels, or members new to the domain needs more.
This isn't a judgment. It's just reality.
The idealistic approach is to hire only senior engineers and let them loose. But budgets constrain hiring. Available talent varies. Growing people internally is often more sustainable than constantly recruiting. These are legitimate constraints.
Working in the mud means acknowledging that your team composition requires more hands-on leadership right now. More one-on-ones. More code review feedback. More explicit context-sharing. Different team maturity levels need different leadership approaches, which is why understanding your engineers' working styles becomes critical. This takes a significant portion of a lead's time that could theoretically be spent on architecture or coding.
But if you don't invest that time, the team suffers. Product quality drops. Velocity decreases. Morale tanks. The "efficient" approach of minimizing management overhead actually creates more problems than it solves given the current team composition.
The path out is deliberate investment in leveling people up. Building documentation. Creating feedback loops. Over time, the team develops the capability to operate with less oversight. But you can't skip the investment phase by pretending your team is already there.
The Practical Framework
Working in the mud comes down to three questions:
Where are we actually? Not where we wish we were. Not where we think we should be. Where are we right now? What are our real capabilities and constraints? Answer this by auditing recent failures and near misses. They reveal your actual state more honestly than any planning document.
What's the pragmatic path forward? Given our actual situation, what decisions let us make progress without pretending our constraints don't exist? List your constraints explicitly, then design around them rather than ignoring them.
How do we get out of this mud? What investments will move us toward the clean state over time? What's the timeline and what milestones will we hit along the way? Pick one area of mud to address per quarter, and track progress with concrete metrics.
The first question requires honesty. The second requires pragmatism. The third requires discipline to not accept the mud as permanent.
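The three questions can be written down as a small "mud register" so the compromises stay visible. This is a sketch under assumptions: the MudEntry fields, the relative-gap formula, and the rule of picking the largest gap as the quarter's focus are all illustrative choices, and metrics are assumed to be non-negative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MudEntry:
    area: str             # Where are we actually?
    workaround: str       # What's the pragmatic path forward?
    exit_investment: str  # How do we get out of this mud?
    metric: str           # Concrete metric used to track progress
    current: float
    target: float
    lower_is_better: bool = True

    def gap(self) -> float:
        """Relative distance from the target; 0.0 means the target is met."""
        if self.lower_is_better:
            if self.current <= self.target:
                return 0.0
            return (self.current - self.target) / self.current
        if self.current >= self.target:
            return 0.0
        return (self.target - self.current) / self.target


def pick_quarterly_focus(entries: list[MudEntry]) -> Optional[MudEntry]:
    """Return the entry furthest from its target, or None if every target is met."""
    open_entries = [e for e in entries if e.gap() > 0]
    return max(open_entries, key=MudEntry.gap, default=None)


register = [
    MudEntry("rollback", "Friday deploy freeze", "automate rollback",
             "minutes to roll back", current=45, target=5),
    MudEntry("testing", "manual QA pass before release", "add integration tests",
             "percent of critical paths covered", current=40, target=70,
             lower_is_better=False),
]
focus = pick_quarterly_focus(register)
print(focus.area if focus else "no open mud")
```

Each entry pairs a workaround with the investment that retires it, which is what separates working in the mud from broken windows. A workaround with no exit investment written next to it is a warning sign.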
Varying Levels of Mud
Here's the nuance. You're never entirely in the mud or entirely out of it. Different aspects of your business, team, and product exist at different maturity levels.
Your observability might be world-class while your testing is weak. Your incident response could be excellent while your requirements process is chaos. Your deployment pipeline might be rock solid while your team scaling approach is ad hoc.
This is normal. Resources are finite. You can't optimize everything simultaneously.
Working in the mud means being clear-eyed about where you're strong and where you're not. It means making explicit tradeoffs about where to invest limited improvement effort. It means not letting perfection in one area blind you to mud in another. This is where the four types of engineering leadership matter most, helping you decide which of people management, project coordination, developer experience, and platform architecture needs your focus right now.
Why This Matters
Teams that can't work in the mud don't ship. They spend months designing perfect architectures for problems they don't fully understand yet. They debate deployment strategies while competitors take market share. They write comprehensive documentation for systems that change weekly.
Teams that embrace broken windows ship garbage. They accumulate technical debt until velocity grinds to zero. They burn out engineers with constant firefighting. They surprise customers with preventable outages. This is why investigating upstream root causes of chronic issues matters so much. Many production fires trace back to architecture and requirements decisions made long before they explode.
The middle path is working in the mud. Shipping imperfect solutions while tracking the imperfections. Making compromises while documenting what you're compromising on. Moving forward while maintaining a clear view of what "good" looks like.
This isn't glamorous. There's no conference talk about "how we accepted mediocrity in our CI pipeline while we focused on product market fit." But it's how real engineering teams operate.
The mud is where progress happens.

