👋 Hey friends,
Here’s what I’ve got in store for you this week:
Another snippet from The Backend Lowdown
Recent post recap: DB Load / AAS charts
Next post preview: Write contention
Let’s get into it 👇
Every newsletter will include a snippet from my book in progress, The Backend Lowdown, available for $5 right now on Gumroad!
Get The Backend Lowdown →

Observing and Deploying Migrations Safely (cont.)
Watch for Lock Contention
Before running a migration - especially on large or high-traffic tables - it's important to check whether your database is already under lock pressure, also known as lock contention.
Lock contention happens when multiple database operations are trying to acquire conflicting locks on the same resource (like a table or row), and some of them have to wait. This can slow down queries, cause timeouts, or block your migration entirely.
If your migration needs to lock a table - even briefly - and another long-running query is already holding a conflicting lock, your migration will just sit there waiting. Worse, if other queries pile up behind it, you can get a cascading slowdown or even an outage.
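One way to bound that risk (a sketch, assuming Postgres; the table and column names are hypothetical): set a lock_timeout before the DDL, so the migration fails fast instead of queueing behind an existing lock.

```sql
-- Assumes Postgres. If the ALTER can't acquire its lock within
-- 5 seconds, it errors out instead of sitting in the lock queue
-- and blocking every query that piles up behind it.
SET lock_timeout = '5s';
ALTER TABLE orders ADD COLUMN notes text;
```

A failed migration you can retry off-peak is almost always cheaper than a silent pile-up behind a stuck one.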
That's why it's good practice to check for:
Long-running queries - may be holding locks that your migration will conflict with
Autovacuum activity - can hold lightweight locks, especially if it's vacuuming a large or bloated table
Active transactions - especially those that have been open for a long time
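In Postgres, one query against pg_stat_activity can surface all three at once; a sketch (the one-minute threshold is an arbitrary choice, tune it to your workload):

```sql
-- Non-idle sessions whose transaction has been open for more than
-- a minute, oldest first. Autovacuum workers show up here too,
-- with queries like 'autovacuum: VACUUM ...'.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - xact_start AS xact_age,
       left(query, 60)    AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND xact_start < now() - interval '1 minute'
ORDER BY xact_age DESC;
```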
You can check for ungranted (i.e. waiting) locks with:
SELECT * FROM pg_locks WHERE NOT granted;

Or tools like:

pg_stat_activity - to see current queries and lock wait status
Your APM (New Relic, Datadog, etc.) - to catch rising DB latency or blocked transactions
App-level logs (e.g. Rails + Lograge) - to see if requests are slowing down or queueing
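To see not just that locks are waiting but who is holding them up, pg_blocking_pids() (Postgres 9.6+) maps each blocked session to the sessions blocking it; a sketch:

```sql
-- Each waiting session alongside the PIDs currently blocking it,
-- so you can decide whether to wait them out or cancel them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```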
If you see signs of contention, it's best to wait, monitor, or reschedule your migration for a lower-traffic time - like off-peak hours or a dedicated maintenance window.
Monitor for Deployment Pipeline Failures
If your migration runs inside a CI/CD pipeline step, and that migration blocks for too long (e.g. on a lock), your entire deploy can get stuck - potentially leading to:
Autoscaling delays
Worker exhaustion
Connection pool saturation
Always have a way to:
Cancel a stuck deploy
Apply schema changes out-of-band (e.g. via a direct console, heroku run, etc.)
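Canceling a stuck migration from a separate console session usually comes down to one call (a sketch; the PID here is a placeholder you'd find via pg_stat_activity):

```sql
-- Politely cancel the stuck statement. 12345 is a placeholder PID.
-- pg_terminate_backend(12345) is the harsher fallback: it kills
-- the whole connection, not just the current query.
SELECT pg_cancel_backend(12345);
```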
Recent Post Recap
In case you missed it, this week I posted about one of the most useful charts I’ve found for diagnosing database performance issues:
DB Load / Average Active Sessions (AAS)
This chart shows how many sessions are active and what they’re doing (CPU) or waiting on. Most APMs break it down into categories like:
CPU - the session is doing real work
I/O - the session is waiting on storage/network I/O
Locks - the session is blocked waiting on a lock

The primary visual for the video, though it looks a bit different in real APMs
Even though the chart will look a little different depending on your tool (CloudWatch, Datadog DBM, etc.), the idea is the same and it’s been instrumental for me in pinpointing real DB problems fast.
Here's the mental model I use:
AAS ≈ how many sessions are “active” at once (running on CPU or stuck waiting on something)
If AAS is consistently above your available CPU, you're building a line (queueing).
The real value is breaking it down by wait category (locks, I/O, CPU, etc.). That breakdown is your hypothesis generator.
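You can approximate that breakdown yourself with a point-in-time sample of pg_stat_activity (a sketch, assuming Postgres; a real AAS chart averages many such samples over time):

```sql
-- One sample of "active sessions by wait category".
-- An active session with wait_event IS NULL is (approximately)
-- running on CPU; 'Lock' and 'IO' map to the other two buckets.
SELECT count(*) AS active_sessions,
       count(*) FILTER (WHERE wait_event IS NULL)       AS on_cpu,
       count(*) FILTER (WHERE wait_event_type = 'Lock') AS waiting_on_locks,
       count(*) FILTER (WHERE wait_event_type = 'IO')   AS waiting_on_io
FROM pg_stat_activity
WHERE state = 'active';
```

Run it on an interval and you have a poor man's AAS chart, which is handy when you don't have an APM pointed at the database yet.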
Here are 5 ways to interpret the graph:
AAS spike + mostly CPU → compute-bound (check top SQL + query plans)
AAS spike + mostly locks → contention-bound (hot rows/indexes, write patterns)
AAS spike + mostly I/O → storage-bound (scans, cache misses, missing indexes)
AAS flat, latency up → often app-side (connection pool, retries, downstream dependencies)
AAS “sawtooth” after deploys → correlation isn’t proof, but it narrows the search
Next Post Preview: Write Contention
Next I’m going to zoom in on write contention: how to spot it using the same AAS chart, and why it’s so sneaky.
Write contention is when multiple writes compete for the same row(s) (or a small hotspot set), so connections spend more time waiting on each other than doing useful work. In most systems, this shows up as rising latency, even if you didn’t “change much.”
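A classic hotspot is a single counter row that many transactions update (a hypothetical schema): row-level locking means the updates serialize, so added concurrency turns into added waiting.

```sql
-- Hypothetical hot row: every request increments the same counter.
-- Concurrent transactions each need the row lock on id = 1, so
-- they execute one at a time no matter how many connections exist.
UPDATE page_stats SET view_count = view_count + 1 WHERE id = 1;
```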
Stay tuned! I’ll show the exact signal to look for and a couple concrete patterns that cause it.
That’s a wrap for this week. If something here made your day smoother, feel free to reply and tell me about it. And if you think a friend or teammate would enjoy this too, I’d be grateful if you shared it with them.
Until next time,
Steven
