stevencodes.swe - October 12, 2025

Personal updates, book snippet

👋 Hey friends,

Here’s what I’ve got in store for you this week:

  • Personal updates

  • A snippet from Chapter 4 of The Backend Lowdown

Let’s get into it 👇

Personal Updates

I’ve been working on an app idea of my own. It’s not something I’m ready to share yet, but I’ll let you know when it is. I’ll likely be making some content around it, since it’s been interesting to think through from a system design / architecture standpoint and has presented some unique challenges. I think it might be fun to build this more in public and see what you all think of it. Stay tuned!

Another bit of news worth mentioning: I’ve been really impressed with OpenAI’s Codex CLI. I’ve been an avid Claude Code user for months, but the recent shift in usage limits left a bad taste, so I’ve been trying out Codex. Its usage limits are far more generous, and the model itself has been impressive. If you haven’t given it a try yet, I think it’s worth checking out!

The Backend Lowdown: Chapter 4 Preview

Every newsletter will include a snippet from my book in progress, The Backend Lowdown, available for $5 right now on Gumroad!

Get The Backend Lowdown →

Stampede Control

When a popular cache key expires, hundreds of concurrent requests can all miss the cache at the same moment and try to recompute the same expensive value, crushing your backend. This is the "thundering herd" (or cache stampede) problem. Two simple techniques prevent it:

  • Race-condition TTL: When a key expires, keep serving the stale value for a few extra seconds while one request refreshes it in the background.

  • Single-flight lock: Use a distributed lock to ensure only one request recomputes a value. Everyone else waits for that result.

When it helps

  • High-traffic endpoints where many users request the same cached data

  • Expensive cache refills that hit multiple services or databases

  • After deployments or bulk cache clears when your entire cache is cold

  • Any time a cache miss is significantly more expensive than a cache read

Example - race-condition TTL (easiest win)

def fetch_product(id, locale:)
  key      = "v3:product:#{id}:#{locale}"
  base_ttl = 5.minutes
  jitter   = rand(0..30).seconds

  Rails.cache.fetch(
    key,
    expires_in: base_ttl + jitter,
    race_condition_ttl: 10.seconds # Serve stale data for 10s while ONE request refreshes
  ) do
    # This expensive block only runs for one request, not the whole herd
    ProductPresenter.new(Product.find(id), locale: locale).as_json
  end
end

Example - single-flight with Redis NX lock

# Ensures only one request recomputes a value, even when cache is completely empty
LOCK_TTL = 15 # seconds - should exceed typical compute time

# Release script: delete the lock only if we still own it. A plain DEL could
# remove a lock that expired mid-compute and was re-acquired by another process.
RELEASE_LOCK = <<~LUA
  if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
  else
    return 0
  end
LUA

def with_single_flight(lock_key, ttl: LOCK_TTL)
  token = SecureRandom.uuid # unique value proves ownership at release time

  # Try to atomically acquire lock: SET key token NX EX ttl
  # NX = only set if not exists, EX = auto-expire after ttl
  if redis.set(lock_key, token, nx: true, ex: ttl)
    begin
      yield  # We got the lock! Do the work
    ensure
      redis.eval(RELEASE_LOCK, keys: [lock_key], argv: [token])  # Release lock when done
    end
  else
    nil  # Someone else is already computing
  end
end

def fetch_with_single_flight(key, ttl:, &compute)
  # Try reading from cache first (happy path)
  cached = Rails.cache.read(key)
  return cached if cached

  # Cache miss: try to become the sole refresher
  lock_key = "lock:#{key}"
  if (result = with_single_flight(lock_key) { compute.call })
    Rails.cache.write(key, result, expires_in: ttl + ttl * rand(0.0..0.1))  # up to 10% TTL jitter
    return result
  end

  # Someone else is computing - wait briefly for their result
  3.times do
    sleep 0.05  # 50ms wait
    cached = Rails.cache.read(key)
    return cached if cached
  end

  # Last resort: compute without lock (prevents total failure)
  # Could also return a degraded response here
  compute.call
end

Using both together (recommended for very hot keys)

  • Enable race_condition_ttl on reads as a default.

  • Add single-flight on known expensive keys (or around refill paths) to guarantee only one recompute even when the entry is fully cold.
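
Here's a minimal sketch of the combination, building on the with_single_flight helper above (ExpensiveReport is a hypothetical stand-in for your refill logic): race_condition_ttl absorbs expirations of a warm entry, while the lock covers the fully cold case.

def fetch_hot_report(id, ttl: 10.minutes)
  key = "v1:report:#{id}"

  Rails.cache.fetch(
    key,
    expires_in: ttl + ttl * rand(0.0..0.1),  # TTL jitter, as in the examples above
    race_condition_ttl: 10.seconds           # covers expiry of an existing (warm) entry
  ) do
    # race_condition_ttl needs a stale entry to serve, so when the key is
    # fully cold this block runs in every process. Guard the recompute
    # itself with the single-flight lock.
    with_single_flight("lock:#{key}") { ExpensiveReport.build(id) } ||
      ExpensiveReport.build(id)  # lost the lock race: compute inline rather than cache nil
  end
end

A process that loses the lock race computes inline here, mirroring the last-resort fallback in fetch_with_single_flight; you could instead wait briefly and re-read, as that helper does.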

Implementation notes

  • Set race_condition_ttl to slightly exceed your typical refill time. If refills usually take 2 seconds, use 5-10 seconds to be safe.

  • Size your lock TTL similarly: it should outlive most refills but not so long that a crashed process blocks everyone. 15-30 seconds is often reasonable.

  • Always combine stampede control with TTL jitter. Otherwise, you're just moving the thundering herd from one moment to another.

  • For ultra-hot keys, consider refresh-ahead: when serving a cached value that's close to expiry (say, 80% through its TTL), queue a background job to refresh it before it expires. There's a sketch of this after the list.

  • Add logging for stampede events: track when you serve stale data and when processes wait for locks. These metrics are canaries for performance problems.
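
As promised above, here's a minimal refresh-ahead sketch. It assumes the same redis client as the earlier examples plus a hypothetical RefreshCacheJob that recomputes the value and rewrites the entry; the write time is stored alongside the value so the read path can tell how far through its TTL an entry is. The log line doubles as the stampede-event instrumentation from the last note.

REFRESH_THRESHOLD = 0.8 # kick off a refresh once 80% of the TTL has elapsed

def fetch_with_refresh_ahead(key, ttl:, &compute)
  entry = Rails.cache.read(key)

  if entry
    age = Time.current - entry[:written_at]
    if age > ttl * REFRESH_THRESHOLD
      # Close to expiry: serve the current value and refresh in the background.
      # The NX flag acts as a single-flight guard so concurrent requests
      # don't enqueue duplicate refresh jobs.
      if redis.set("refresh:#{key}", 1, nx: true, ex: 60)
        RefreshCacheJob.perform_later(key)  # hypothetical job: recomputes and rewrites the entry
        Rails.logger.info("cache.refresh_ahead key=#{key} age=#{age.round(1)}s")
      end
    end
    return entry[:value]
  end

  # Cold miss: compute inline and stamp the write time for the next read
  value = compute.call
  Rails.cache.write(key, { value: value, written_at: Time.current }, expires_in: ttl)
  value
end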

That’s a wrap for this week. If something here made your day smoother, feel free to reply and tell me about it. And if you think a friend or teammate would enjoy this too, I’d be grateful if you shared it with them.

Until next time,
Steven