Brian Conn

Two-Phase War Games - Scaling Incident Response Training Across Multiple Teams

Traditional war games fall apart when multiple teams try to learn incident response and team dynamics simultaneously. This two-phase approach separates the challenges: homogeneous sessions build incident response skills within existing teams, while heterogeneous sessions focus on cross-team coordination with a shared foundation.

operations, production Brian Conn

The 5 Stages of a Production Incident

Here’s a bit of a paradox: the better you are at solving SaaS production incidents, the harder each incident is to solve.

At first glance, this doesn’t make a lot of sense. Wouldn’t being better make solving production incidents easier? No. The trick is that once you get good at production incidents, you don’t get hit with the easy ones anymore: you solve them for good. That leaves only the new and challenging problems for you to solve. The average incident is more complex, but your reward is that the frequency of incidents goes way down.

I’d take that trade any day.
