👋 Hey friends,

Here’s what I’ve got in store for you this week:

  • Another snippet from The Backend Lowdown

  • Postgres 18 updates that caught my eye

Let’s get into it 👇

The Backend Lowdown: Chapter 1 Preview

Every newsletter will include a snippet from my book in progress, The Backend Lowdown, available for $5 right now on Gumroad!

Get The Backend Lowdown →

For the past few months, the snippets have all come from Chapter 4. Since Chapter 5 is still a work in progress, I’ve decided to circle back and share a snippet from Chapter 1.

What Makes a Migration Unsafe?

Not all migrations are created equal. Some are quick and harmless. Others can lock your most important tables, block traffic, spike I/O, and even take down production. The difference usually comes down to how much data is affected, when you run the migration, and how much traffic is hitting your database at the time.

A dangerous migration might go unnoticed in staging or a low-traffic app - but the same change in production, under load, can cause outages.

Let's break down the most common causes of unsafe migrations - and why they're risky.

Locking Large Tables

Certain schema changes force the database to acquire an exclusive lock on a table. During that time:

  • Reads and writes from other sessions are blocked,

  • Any queued queries pile up,

  • Application threads start timing out,

  • And eventually, users see errors or your workers hang.

This is how seemingly innocent schema changes can bring down production.

Example:

class AddStatusToUsers < ActiveRecord::Migration[8.0]
  def change
    add_column :users, :status, :string, null: false
  end
end

The rest of the examples will only contain the lines inside the change method for brevity, unless stated otherwise.

This locks the entire users table while it:

  • Adds the column,

  • Applies the NOT NULL constraint,

  • And scans every existing row to validate the NOT NULL constraint.

On a high-traffic table, even a few seconds of blocked writes can trigger a chain reaction. As the migration holds a lock, incoming write queries start to pile up - and each one consumes a database connection while it waits. Before long, your app's connection pool fills up, and any new request that needs a DB connection is forced to wait.

Workers begin to hang as they time out waiting for connections. Requests start to fail. If this happens during a deploy, your health checks may fail too - which can trigger a deploy rollback or leave your app in a partially updated state. All of this can unfold in under a minute if the table is busy enough.
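For contrast, here's a commonly recommended safer sequence on Postgres - sketched with an illustrative constraint name, not an excerpt from the chapter - that splits the change into short, non-blocking steps (following the convention above, only the lines inside the change method are shown):

```ruby
# 1. Add the column as nullable first - in modern Postgres this is a
#    quick metadata-only change, no long lock
add_column :users, :status, :string

# 2. Backfill existing rows in batches (ideally outside the migration,
#    e.g. in a background job)

# 3. Enforce NOT NULL via a NOT VALID check constraint - created
#    instantly, without scanning the table under an exclusive lock
add_check_constraint :users, "status IS NOT NULL",
                     name: "users_status_not_null", validate: false

# 4. Validate separately - this scans every row, but only holds a
#    lightweight lock, so reads and writes keep flowing
validate_check_constraint :users, name: "users_status_not_null"
```

The expensive row scan still happens, but during validate_check_constraint, which doesn't block concurrent traffic the way the one-shot version does.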

Two Postgres 18 Features I’m Actually Excited About

Postgres 18 ships with a ton of stuff, but two changes feel like they’ll actually affect my day-to-day: better EXPLAIN ANALYZE output and native uuidv7().

1) EXPLAIN ANALYZE grew up

In Postgres 17 and below, you had to remember to tack on BUFFERS if you wanted I/O details:

EXPLAIN (ANALYZE, BUFFERS) SELECT ...;

I can’t tell you how many times I’ve forgotten to add that! 😅 In 18, EXPLAIN ANALYZE includes buffer usage by default, no extra options needed.

Buffers are just Postgres’ 8KB pages. The output tells you things like:

Buffers: shared hit=120 read=40

Here’s a rough mental model:

  • shared hit – pages we already had in memory (cache work / CPU)

  • read – pages we had to fetch from disk (I/O)

So now a single EXPLAIN ANALYZE answers “is this query slow because of disk or CPU?” without a second run.
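Concretely, the Postgres 18 workflow is just this (table and query are illustrative):

```sql
-- Postgres 18: no BUFFERS flag needed - buffer counts show up by default
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'someone@example.com';

-- Each plan node now includes a line like:
--   Buffers: shared hit=120 read=40
-- hit  -> pages already in shared_buffers (cache / CPU)
-- read -> pages pulled from disk (I/O)
```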

18 also adds more goodies to EXPLAIN:

  • Index lookup counts per index scan node

  • More detailed memory/disk usage for materialization and window nodes

Put together, this makes your tuning loop simpler:

  1. Run EXPLAIN (ANALYZE) once

  2. If shared read is high and times are skewed to I/O, think indexes, query shape, or storage/tuning

  3. If I/O is low but CPU time is high, think aggregation, sorting, bad join strategy

  4. If an index node shows a ridiculous number of lookups, that’s the join/index to rethink

It’s a small change, but it nudges everyone toward actually looking at I/O instead of just staring at “Actual time”.

2) uuidv7(): UUIDs that don’t trash your B-trees

The other change is native support for UUIDv7, via the new uuidv7() function and friends:

id uuid PRIMARY KEY DEFAULT uuidv7()

You also get:

  • uuidv4() as an alias for gen_random_uuid() (classic random UUIDs)

  • uuid_extract_timestamp(uuid), which now works for both v1 and v7, plus uuid_extract_version(uuid) to see which version you’re dealing with
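Here's what that looks like end to end, assuming a Postgres 18 server (the table and column names are mine):

```sql
-- Hypothetical table using uuidv7() as the primary key default
CREATE TABLE events (
  id         uuid PRIMARY KEY DEFAULT uuidv7(),
  payload    text,
  created_at timestamptz NOT NULL DEFAULT now()
);

-- Pull the embedded timestamp and version back out of an ID
SELECT uuid_extract_timestamp(id) AS id_time,
       uuid_extract_version(id)   AS id_version
FROM events;
```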

Why should you care? Because UUIDv4 is completely random, which is terrible for a B-tree index: every insert lands on a random page, causing lots of page splits and fragmented indexes over time.

UUIDv7 fixes that by making the high bits time-based and the low bits random:

  • Newer IDs are roughly increasing → inserts mostly append at the end of the index (like a sequence)

  • You still keep enough randomness to avoid collisions and make IDs hard to guess
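To build intuition for that layout, here’s a rough client-side sketch of how a v7 UUID is assembled per RFC 9562 - purely an illustration, not how Postgres’ uuidv7() is implemented:

```ruby
require "securerandom"

# Sketch of the UUIDv7 bit layout:
# 48-bit Unix-millisecond timestamp, then version/variant bits,
# with the remaining bits random.
def uuidv7_sketch(now_ms = (Time.now.to_f * 1000).to_i)
  bytes = [now_ms].pack("Q>")[2, 6]                   # high 48 bits: timestamp (ms)
  bytes << SecureRandom.random_bytes(10)              # remaining bits: randomness
  bytes.setbyte(6, (bytes.getbyte(6) & 0x0F) | 0x70)  # set version nibble to 7
  bytes.setbyte(8, (bytes.getbyte(8) & 0x3F) | 0x80)  # set the RFC variant bits
  hex = bytes.unpack1("H*")
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join("-")
end
```

Because the timestamp occupies the most significant bits, IDs generated later sort after IDs generated earlier - which is exactly why inserts append near the end of the index.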

In practice, that means:

  • Better insert performance and fewer B-tree headaches than v4

  • Natural chronological ordering: ORDER BY id DESC basically gives you “newest first”

  • You can even sanity-check creation time from the ID with uuid_extract_timestamp(id)

If you’re on 18 and you like UUIDs for public IDs, uuidv7() as the default PK is a really nice “have your cake and eat it too”: globally unique, API-safe, and finally not hostile to your indexes.

That’s a wrap for this week. If something here made your day smoother, feel free to reply and tell me about it. And if you think a friend or teammate would enjoy this too, I’d be grateful if you shared it with them.

Until next time,
Steven
