stevencodes.swe - Nov 9, 2025
More queue design tips, book snippet
Hey friends,
Here's what I've got in store for you this week:
A snippet from Chapter 4 of The Backend Lowdown
A bit more about queue design
Let's get into it.
The Backend Lowdown: Chapter 4 Preview
Every newsletter will include a snippet from my book in progress, The Backend Lowdown, available for $5 right now on Gumroad!
Get The Backend Lowdown
Observability & Kill Switches
Caching without monitoring is like driving blindfolded... you won't know if you're helping or hurting until something breaks. Build a thin layer of instrumentation around your cache to track what's working, catch problems early, and disable problematic caches instantly when things go wrong.
This isn't about complex monitoring, just a few key signals and circuit breakers that answer critical questions: Is the cache actually making things faster? Are users getting stale data? Can we turn it off RIGHT NOW if needed?
When it helps
You've deployed a new cache and need proof it's actually improving performance (not just adding complexity)
Production is on fire: users report stale data or Redis is down and you need to bypass cache immediately
You're fine-tuning cache parameters (TTLs, stampede controls) and want real data instead of guessing
You need to debug why specific users are seeing wrong cached data
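The "turn it off RIGHT NOW" part can be a very small wrapper. Here's a minimal kill-switch sketch; the `CacheSwitch` module, `CACHE_ENABLED` env var, and `MemoryCache` stand-in are illustrative names, not from the book. In production the flag would more likely live in a feature-flag service or a Redis key so it can flip without a deploy:

```ruby
# Kill switch: when the flag is off, every fetch falls through to the block,
# so a stale or broken cache is bypassed instantly.
module CacheSwitch
  def self.enabled?
    ENV.fetch("CACHE_ENABLED", "true") == "true"
  end

  def self.fetch(cache, key, &block)
    return block.call unless enabled? # bypass the cache entirely
    cache.fetch(key, &block)
  end
end

# Tiny in-memory stand-in for Rails.cache, just for the example
class MemoryCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store.key?(key) ? @store[key] : (@store[key] = yield)
  end
end
```

Because the switch sits in front of the cache rather than inside it, flipping it can't be blocked by the very cache backend (e.g. a down Redis) you're trying to route around.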
Add visible traces for debugging
Make cache behavior observable without digging through logs:
# Add response headers (great for APIs)
response.headers['X-Cache'] = 'hit' # or: miss, stale, bypass, error
# Structured logging on hot paths
Rails.logger.info("Request served", cache: "hit", namespace: "product", key: key)
# Sample 0.1% of keys to catch cardinality explosions
track_key_pattern(key) if rand < 0.001
Instrumentation Example
# Hook into Rails cache events to track all cache operations automatically
ActiveSupport::Notifications.subscribe(/cache_(read|write|fetch_hit|generate)/) do |name, start, finish, _id, payload|
# Extract namespace from key for grouping metrics (e.g., "production:v3:product:acme:123" ā "production:v3:product:acme:123")
# This helps identify which cache patterns are hot
ns = payload[:key]&.split(":")&.first(5)&.join(":") || "unknown"
# Calculate operation duration in milliseconds
dur = ((finish - start) * 1000).round
case name
when "cache_fetch_hit.active_support"
# Cache hit via Rails.cache.fetch (found in cache)
Metrics.increment("cache.hit", tags: ["ns:#{ns}"])
when "cache_read.active_support"
# Direct cache read (could be hit or miss, check payload[:hit])
if payload[:hit]
Metrics.increment("cache.hit", tags: ["ns:#{ns}"])
end
when "cache_generate.active_support"
# Cache miss - had to run the expensive block in Rails.cache.fetch
Metrics.increment("cache.miss", tags: ["ns:#{ns}"])
Metrics.histogram("cache.refill.ms", dur, tags: ["ns:#{ns}"])
when "cache_write.active_support"
# Direct cache write (from write-through or manual writes)
Metrics.increment("cache.write", tags: ["ns:#{ns}"])
Metrics.histogram("cache.write.ms", dur, tags: ["ns:#{ns}"])
end
# Optional: Track slow cache operations
if dur > 100 # Over 100ms is concerning
Rails.logger.warn("Slow cache operation",
operation: name,
duration_ms: dur,
namespace: ns
)
end
endIn your controller/view layer for API responses:
# Add cache status to API response headers for debugging
# Clients/developers can see X-Cache: hit|miss|stale in responses
class ApplicationController < ActionController::Base
  after_action :add_cache_status_header

  private

  def add_cache_status_header
    # Only add header if we tracked cache behavior for this request
    if request.env["x_cache_status"]
      response.set_header("X-Cache", request.env["x_cache_status"])
    end
  end
end
# Enhanced cache fetch that tracks hit/miss for the current request
def cache_fetch_with_tracking(key, **opts, &block)
# Try reading first to check if it's a hit
cached_value = Rails.cache.read(key)
if cached_value
request.env["x_cache_status"] = "hit"
return cached_value
end
# Cache miss - compute and store
request.env["x_cache_status"] = "miss"
Rails.cache.fetch(key, **opts, &block)
end
# Usage in your controller:
def show
  @product = cache_fetch_with_tracking(
    "product:#{params[:id]}",
    expires_in: 5.minutes
  ) do
    Product.find(params[:id]).to_presenter
  end
end
# API response will include: X-Cache: hit (or miss)
# Great for debugging, performance testing, and monitoring cache effectiveness
More About Queue Design
This week's video on planning capacity using Little's Law did pretty well! This week I'll be doing another video on queue design. Little's Law helped us plan capacity by showing that the number of jobs in the system equals the arrival rate times the average time each job spends in the system. But in reality, jobs don't always arrive in a constant, steady stream.
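As a quick worked example with made-up numbers:

```ruby
# Little's Law: L = λ × W
# (avg jobs in system = arrival rate × avg time each job spends in system)
# The numbers below are illustrative, not from the video.
arrival_rate   = 50.0 # jobs per second (λ)
time_in_system = 2.0  # seconds per job (W): queue wait + processing
jobs_in_system = arrival_rate * time_in_system
# => 100.0 jobs in flight on average; size worker pools and buffers for that
```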
Kingman's Formula explains latency when the real world gets messy. In short:
wait time = busy factor × variability factor × service time
As utilization (ρ) creeps toward 1 and arrivals/jobs get bursty or uneven, queue delay explodes, even at the same average load.
Plain formula:
Wq ≈ (ρ/(1−ρ)) · ((Ca² + Cs²)/2) · (1/μ)
(ρ: utilization; Ca/Cs: variability of arrivals/service; 1/μ: mean service time)
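Plugging illustrative numbers into Kingman's approximation shows why burstiness matters so much: at identical utilization, raising only the arrival variability more than doubles the wait.

```ruby
# Kingman's approximation for mean wait in a single-worker (G/G/1) queue.
# The utilization, variability, and service-time numbers are made up.
def kingman_wait(rho:, ca2:, cs2:, mean_service_time:)
  (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) * mean_service_time
end

steady = kingman_wait(rho: 0.8, ca2: 1.0, cs2: 1.0, mean_service_time: 0.1)
bursty = kingman_wait(rho: 0.8, ca2: 4.0, cs2: 1.0, mean_service_time: 0.1)
# steady ≈ 0.4s, bursty ≈ 1.0s: same ρ, 2.5× the queueing delay
```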
What to do:
Smooth arrivals (using token bucket / rate limits)
Tame job size (split "elephants", batch tiny work)
Keep headroom (aim ρ ≲ 0.7–0.8; add workers before redline)
Add jitter to retries to avoid self-DDoS
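The jitter point deserves a sketch: with plain exponential backoff, every client that failed at the same moment retries at the same moment, recreating the burst. "Full jitter" spreads retries uniformly over the backoff window (the helper below is a hypothetical illustration, not from the video):

```ruby
# Exponential backoff with full jitter: pick a random delay up to the
# exponential cap so failing clients don't retry in lockstep.
def backoff_with_jitter(attempt, base: 0.5, cap: 30.0)
  window = [cap, base * (2**attempt)].min
  rand * window # uniform in [0, window)
end

# attempt 0 → up to 0.5s, attempt 3 → up to 4s, attempt 10 → capped at 30s
```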
If you havenāt seen the queue design video, you can check it out here:
@stevencodes.swe How to design your queues so they actually hit your SLAs. We go over two guiding principles: queue by SLA and sizing with Little's Law.
That's a wrap for this week. If something here made your day smoother, feel free to reply and tell me about it. And if you think a friend or teammate would enjoy this too, I'd be grateful if you shared it with them.
Until next time,
Steven
