The Async Decoupling Pattern for Scalable Batch Processing

Batch processing architecture has a clean pattern that scales elegantly: decouple batch systems asynchronously from everything else. When you get this right, your real-time system stays stable regardless of batch volume, and you never need elaborate job scheduling to avoid infrastructure strain.

Why Async Decoupling Matters

Consider a scenario: you have 100 ML jobs that spin up daily. They all finish around the same time, and each one tries to call back into your real-time system synchronously.

Without proper decoupling, your real-time system gets overwhelmed. The queue backs up. Latency spikes. Maybe something crashes. Those latency spikes matter more than teams realize: Amazon found that every 100ms of latency costs roughly 1% in sales.

One workaround is staggering jobs: run customer A at 7:00 AM, customer B at 7:05, customer C at 7:10. But now you have a scheduling problem layered on top of your architecture. You're treating symptoms, not causes. This mirrors a pattern I see across organizations: operational pain usually traces back to upstream architectural decisions, not the symptoms you're fighting.

The better approach is building proper queuing into your batch architecture from the start. When batch systems are asynchronously decoupled from the rest of your infrastructure, batch workload spikes stay isolated.

What Batch Processing Actually Requires

This pattern assumes your work units are truly independent. If they're not, the first step is refactoring to make them so.

A proper batch system has four components:

  • Fan-out at the top. Something triggers the batch run: a cron job, a schedule, or an event. That trigger calls a single service whose only job is creating work payloads. It runs a few for loops, pulls all customers, pulls all job types, generates a payload for each permutation, and shoves them into a queue (see the sketch after this list).
  • Independent, consistently sized work units. Each payload should be small and independent. Customer A's daily classification has nothing to do with Customer B's weekly segmentation. They can run in parallel without coordination.
  • Elastic consumers. Scale up workers to consume from the queue. They pull work, process it, and when the queue empties, they scale down. The queue absorbs the burst, not your infrastructure. Modern cloud platforms make this straightforward: AWS Lambda consuming from SQS starts at five concurrent invocations and adds up to 300 more per minute, up to a maximum of 1,250 concurrent executions per event source mapping.
  • Async decoupling at the bottom. When jobs finish, they don't call back synchronously into your real-time system. They drop results onto another queue. A separate consumer process pulls from that results queue and feeds them into the rest of your system at whatever pace it can handle. This requires idempotent consumers since queue delivery is at-least-once, not exactly-once.
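
To make this concrete, here's a minimal sketch of the fan-out service in Python, assuming an SQS queue accessed through boto3. The queue URL, dimension lists, and fetch_customers below are illustrative stand-ins, not a prescribed implementation:

import json
import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URL; substitute your own.
WORK_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/batch-work"

JOB_TYPES = ["classification", "segmentation", "use_case_analysis"]
PERIODS = ["daily", "weekly", "monthly"]

def fetch_customers():
    # Stand-in: in practice, pull this from your customer store.
    return ["customer_a", "customer_b"]

def fan_out():
    # One small, independent payload per permutation; no processing here.
    for customer in fetch_customers():
        for job_type in JOB_TYPES:
            for period in PERIODS:
                payload = {"customer": customer, "job_type": job_type, "period": period}
                sqs.send_message(QueueUrl=WORK_QUEUE_URL, MessageBody=json.dumps(payload))

The service does nothing but enumerate permutations and enqueue them; all real work lives in the consumers.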

The real-time system never sees a burst. Results pool up, the consumer chews through them, and everything flows smoothly. If results sit in the queue for an extra 10 minutes, who cares? The batch job took 30 minutes anyway. Another 10 is noise. This pattern delivers real results: one retail platform reduced checkout latency by 40% by offloading inventory updates to asynchronous processing.
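
Here's a rough sketch of what that results consumer might look like, again assuming SQS via boto3. The result_id field and the in-memory dedupe set are assumptions for illustration; a production consumer would keep processed IDs in a durable store so the idempotency guard survives restarts:

import json
import boto3

sqs = boto3.client("sqs")

# Hypothetical results queue URL; substitute your own.
RESULTS_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/batch-results"

seen_result_ids = set()  # in-memory for brevity; use a database table in production

def apply_result(result):
    # Stand-in: feed the result into the rest of your system.
    pass

def drain_results():
    # Pull at our own pace; skip duplicates since delivery is at-least-once.
    while True:
        resp = sqs.receive_message(
            QueueUrl=RESULTS_QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling: cheap waiting while the queue is quiet
        )
        for msg in resp.get("Messages", []):
            result = json.loads(msg["Body"])
            if result["result_id"] not in seen_result_ids:  # idempotency guard
                apply_result(result)
                seen_result_ids.add(result["result_id"])
            sqs.delete_message(
                QueueUrl=RESULTS_QUEUE_URL,
                ReceiptHandle=msg["ReceiptHandle"],
            )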

The For Loops Principle

This is where teams leave performance on the table. The more dimensions of independence your work has, the more parallelism you can achieve.

Say your batch jobs have three dimensions: customer, job type (classification, segmentation, use case analysis), and time period (daily, weekly, monthly). If all three dimensions are independent (and they usually are), you can have three nested for loops:

for customer in customers:
    for job_type in job_types:
        for period in periods:
            create_payload(customer, job_type, period)

Instead of 10 payloads (one per customer), you might have 10 customers × 3 job types × 3 periods = 90 payloads. Smaller work units. More parallelism. Better queue throughput. And horizontal scaling is linear: doubling queue consumers doubles throughput, making capacity planning predictable.

The key constraint is that the work must be independent. Customer A's daily classification cannot depend on Customer B's weekly segmentation. If there's coupling, you can't nest those loops. Your batch system's parallelism ceiling is determined by how many dimensions of independence exist in your workload.

More independence means more for loops. More for loops means more payloads. More payloads means smaller, faster work units that can scale horizontally.
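
If the nesting gets deep, Python's itertools.product flattens the same fan-out into a single loop. A minimal sketch, with stand-in values for the three dimensions:

import itertools

customers = ["customer_a", "customer_b"]  # illustrative stand-ins
job_types = ["classification", "segmentation", "use_case_analysis"]
periods = ["daily", "weekly", "monthly"]

def create_payload(customer, job_type, period):
    # Stand-in: build the work unit and push it to the queue.
    return {"customer": customer, "job_type": job_type, "period": period}

# One payload per permutation: 2 x 3 x 3 = 18 here.
for customer, job_type, period in itertools.product(customers, job_types, periods):
    create_payload(customer, job_type, period)

Adding a fourth independent dimension (say, region) is one more argument to product, not a rewrite.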

The Ideal State

When batch processing is done right, your real-time system runs 24/7 at roughly constant capacity. Sensors send about the same volume of data every minute. Work coming in is consistent. Work going through is consistent.

Batch is different. It's massive for a block of time, then quiet. You fire everything off at 7:00 AM, the queue fills up instantly, you see message lag as consumers catch up, and then it drains. No staggering required. No elaborate scheduling. No overwhelming your synchronous systems.

The queue absorbs the burst. The async consumer on the output side absorbs the results burst. The real-time system never knows batch ran.

This design pattern aligns with broader architectural thinking about how to structure systems for scale. Thinking about your architecture like a building helps surface these trade-offs clearly: you're deciding whether to build weak foundations and expensive application workarounds, or strong infrastructure that other layers can trust.

Implementation Checklist

When designing or refactoring a batch system, recognize this is infrastructure-level thinking, not feature work. The patterns here affect how your entire organization scales. Understanding where a problem sits in your architecture helps gauge what kind of thinking it requires.

  • Map your work dimensions. What are the independent axes? Customers, job types, time periods, regions, tenants? Each independent dimension can become a for loop.
  • Verify independence. Can customer A's job run without waiting for customer B's job? Can the daily run happen independent of the weekly run? If yes, they can be separate payloads.
  • Add input queuing. The cron trigger should do nothing except create payloads and push them to a queue. Keep this service simple. It's a fan-out mechanism, not a processing engine.
  • Scale consumers elastically. Workers should pull from the queue, process one payload, complete, and repeat. Autoscale based on queue depth (see the sketch after this list).
  • Make the output async. Results go to a results queue, not to a synchronous endpoint. A separate consumer drains results into the rest of your system at its own pace.
  • Measure queue lag, not job timing. Success isn't "all jobs completed by 8:00 AM." Success is "queue drained, results propagated, no downstream impact."
  • Monitor queue health. Track queue depth, consumer lag, and message age. Alert on growing backlogs before they become incidents.
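
As one illustration of scaling on queue depth, here's a small Python sketch that derives a worker count from SQS's ApproximateNumberOfMessages attribute. The per-worker target and the cap are made-up tuning knobs; in practice a tool like KEDA or a CloudWatch-backed autoscaling policy performs this calculation for you:

import boto3

sqs = boto3.client("sqs")

# Hypothetical work queue URL; substitute your own.
WORK_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/batch-work"

MESSAGES_PER_WORKER = 50  # illustrative target backlog per worker
MAX_WORKERS = 100         # illustrative cap

def desired_worker_count():
    # Read the current backlog and size the worker pool to match it.
    attrs = sqs.get_queue_attributes(
        QueueUrl=WORK_QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
    # Ceiling division: zero workers on an empty queue, capped under heavy load.
    return min(MAX_WORKERS, -(-backlog // MESSAGES_PER_WORKER))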

The Payoff

A properly decoupled batch system gives you predictable infrastructure costs, eliminates job staggering complexity, enables horizontal scalability without rearchitecture, and isolates the blast radius when something fails.

Your real-time system stays stable. Your batch system scales with your business. And when you add a new customer, you add one more payload to the queue, not a new staggered cron job at 7:47 AM.

Stop staggering jobs. Build queues.

