stevencodes.swe - August 10, 2025

Dev tool rec, weekly video highlight

👋 Hey friends,

Here’s what I’ve got for you this week:

  • Snippet from Chapter 3 of The Backend Lowdown

  • Weekly Video Highlight: Ribbon Filters

  • Dev tool rec: SigNoz

Let’s get into it 👇

The Backend Lowdown: Chapter 3 Preview

Every newsletter will include a snippet from my book in progress, The Backend Lowdown, available for $1 right now on Gumroad!

Pagination Performance Cliffs

The most dangerous performance problems are the ones that work perfectly in development. Pagination is the poster child for this category: your local dataset of 100 records pages beautifully, while your production database with 10 million records is slowly dying as users browse deep into your result sets.

What Goes Wrong

OFFSET/LIMIT pagination looks deceptively simple and works exactly as you'd expect until you start dealing with many records:

-- Page 1: Lightning fast
SELECT * FROM products ORDER BY created_at DESC LIMIT 20 OFFSET 0;
-- Execution time: 2ms

-- Page 501: Database destruction
SELECT * FROM products ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
-- Execution time: 800ms

The Hidden Cost of OFFSET

Here's what your database actually does with OFFSET 10000:

  1. Builds the full result set from the beginning

  2. Sorts all those rows according to your ORDER BY

  3. Counts and discards the first 10,000 rows

  4. Finally returns rows 10,001-10,020

You're forcing the database to do 99.8% wasted work to return 0.2% of the results.

Think of it like reading a book by starting from page 1 every single time, counting pages until you reach page 500. It doesn't matter that you have a bookmark; OFFSET pagination throws it away and starts counting from the beginning.
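
Don't just take my word for it - ask your database. Here's one quick way to see the waste with ActiveRecord's explain (the exact plan shape and row counts depend on your schema and data):

# Inspect the plan for a deep page (PostgreSQL shown; plans and numbers vary)
puts Product.order(created_at: :desc).limit(20).offset(10_000).explain
# Look at the rows flowing into the Limit node: on the order of
# offset + limit (~10,020 here), with the first 10,000 discarded.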

Solution 1: Keyset Pagination (The Right Way™)

Instead of counting rows to skip, keyset pagination uses the actual values from your last result as a starting point. It's like using a bookmark instead of counting pages from the beginning.

How it works:

  • Instead of "give me page 5", you say "give me records after ID 12345"

  • The database can jump directly to that ID using an index

  • Performance is O(log N) regardless of how deep you go

# Traditional offset approach - gets slower with each page
def page_one
  Product.order(created_at: :desc).limit(20).offset(0)  # Fast
end

def page_fifty
  Product.order(created_at: :desc).limit(20).offset(980)  # Slow - scans 1,000 rows, discards 980
end

# Keyset approach - equally fast at any depth
def first_page
  Product.order(created_at: :desc).limit(20)
end

def next_page(last_created_at, last_id)
  Product
    .where("(created_at, id) < (?, ?)", last_created_at, last_id)
    .order(created_at: :desc, id: :desc)
    .limit(20)
end

The clever bit: We use (created_at, id) as a composite key. This handles the case where multiple records have the same timestamp - the ID breaks the tie, ensuring stable ordering.
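
One thing the snippet quietly assumes: a composite index that matches the sort order. Without it, the database still has to sort before it can seek. A minimal migration sketch (class name and Rails version are illustrative; note that the row-value comparison in next_page is PostgreSQL syntax):

class AddKeysetIndexToProducts < ActiveRecord::Migration[7.1]
  def change
    # Matches ORDER BY created_at DESC, id DESC so the database can seek
    # straight to the cursor position and read the next 20 rows in order.
    add_index :products, [:created_at, :id], order: { created_at: :desc, id: :desc }
  end
end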

When to use it:

  • Infinite scroll interfaces

  • API pagination where you control the client

  • Any scenario where you don't need to jump to arbitrary pages

The downside:

  • No jumping to page 47 directly

  • URLs carry an opaque cursor instead of a friendly page number

  • Users can't see "page 3 of 97"
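
If you expose keyset pagination through an API, the usual move is to hand clients an opaque cursor rather than raw column values. A hypothetical sketch (these helper names are mine, not from the book):

require "base64"
require "json"
require "time"

# Pack the last row's (created_at, id) into an opaque, URL-safe token.
def encode_cursor(record)
  payload = { created_at: record.created_at.iso8601(6), id: record.id }
  Base64.urlsafe_encode64(payload.to_json)
end

# Unpack a token back into arguments for next_page above.
def decode_cursor(token)
  data = JSON.parse(Base64.urlsafe_decode64(token), symbolize_names: true)
  [Time.iso8601(data[:created_at]), data[:id]]
end

The iso8601(6) matters: drop the microseconds and the decoded cursor can land a few rows off from where the page actually ended.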

Weekly Video Highlight: Ribbon Filters

It’s been a while since I highlighted a video, but I had a lot of fun researching this one: the Ribbon filter. Like a Bloom filter, it offers O(1) lookups and a tunable false-positive rate (e.g., 1%), but it uses ~30% less memory. It gets there by packing items into a banded linear system and solving it at build time. The caveat? Ribbon filters are for static sets (you typically rebuild to add or remove keys), but the engineering behind them is fascinating.
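
If you want to poke at the core trick yourself, here's a toy "plain ribbon" build in Ruby - my own sketch of the idea, not code from the video, and missing the sharding and retry machinery real implementations use:

require "digest"

# Toy plain-ribbon filter: each key becomes one GF(2) equation over a
# narrow band of W consecutive slots; build time solves the banded system.
class RibbonFilter
  W = 16  # band width: how many consecutive slots a key can touch
  R = 8   # bits per slot: false-positive rate is roughly 2**-R (~0.4%)

  def initialize(keys)
    # A modest 80% load keeps this toy build reliable; tuned ribbons run
    # much closer to 100%, which is where the memory win over Bloom lives.
    @m = (keys.size / 0.8).ceil + W
    rows = Array.new(@m)  # rows[i] = [coefficients aligned at slot i, fingerprint]

    # Incremental banded Gaussian elimination: XOR away existing pivots
    # until the equation finds an empty pivot slot or proves redundant.
    keys.each do |key|
      i, c, f = hash_key(key)
      loop do
        if rows[i].nil?
          rows[i] = [c, f]  # slot i is this equation's pivot
          break
        end
        c ^= rows[i][0]
        f ^= rows[i][1]
        if c.zero?
          raise "build failed - retry with a new hash seed" unless f.zero?
          break  # redundant equation (e.g., a duplicate key)
        end
        t = (c & -c).bit_length - 1  # distance to the next set coefficient
        c >>= t
        i += t
      end
    end

    # Back-substitution from right to left produces the slot table.
    @z = Array.new(@m, 0)
    (@m - 1).downto(0) do |i|
      next unless rows[i]
      c, f = rows[i]
      acc = f
      (1...W).each { |j| acc ^= @z[i + j] if c[j] == 1 }
      @z[i] = acc
    end
  end

  # Lookup is O(1): XOR the key's band slots, compare fingerprints.
  def include?(key)
    i, c, f = hash_key(key)
    acc = 0
    (0...W).each { |j| acc ^= @z[i + j] if c[j] == 1 }
    acc == f
  end

  private

  def hash_key(key)
    h1, h2, h3 = Digest::SHA256.digest(key.to_s).unpack("Q<Q<Q<")
    [h1 % (@m - W + 1),          # start slot
     (h2 & ((1 << W) - 1)) | 1,  # band coefficients, pivot bit forced on
     h3 & ((1 << R) - 1)]        # fingerprint
  end
end

filter = RibbonFilter.new((1..1_000).map { |n| "key-#{n}" })
filter.include?("key-42")  # => true, guaranteed for inserted keys
filter.include?("nope")    # => false, up to the ~0.4% false-positive rate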

Intrigued? Give the video a watch here:

@stevencodes.swe


Dev Tool Recommendation: SigNoz

I use Datadog at work, but we recently needed to validate new telemetry before opening a PR, and testing traces/metrics on localhost is usually painful. Enter SigNoz: an open-source, OpenTelemetry-native Datadog alternative with traces, metrics, and logs in one clean UI. Setup was quick, and I could see my local changes immediately in the browser. If you self-host, it’s free (you just cover your own infra); their cloud is paid. If you’re in a similar spot (or just want an OSS observability stack), give SigNoz a look.

That’s a wrap for this week. If something here made your day smoother, feel free to reply and tell me about it. And if you think a friend or teammate would enjoy this too, I’d be grateful if you shared it with them.

Until next time,
Steven