Indexing for Temporal Queries and Stream Reads

Chapter 7

Estimated reading time: 13 minutes

Why temporal and stream-read indexing is different

Temporal queries and stream reads look similar because both are “time-shaped,” but they stress PostgreSQL in different ways. Temporal queries typically ask: “What was true at time T?” or “What changed between T1 and T2?” Stream reads typically ask: “Give me the next N events after my cursor, in order, as fast as possible.” The indexing strategy must therefore optimize two distinct access patterns:

  • Range scans over time: filter by a time window and often by an entity or category, then aggregate or reconstruct state.
  • Ordered incremental scans: fetch the next page of events after a cursor (time-based, sequence-based, or both), usually with a stable order and minimal random I/O.

This chapter focuses on indexes that make those patterns fast and predictable, and on how to verify that PostgreSQL is actually using them.
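As a concrete contrast, the two shapes might look like this against a hypothetical events table (table and column names here are illustrative):

```sql
-- Temporal query: "what changed for this entity between T1 and T2?"
SELECT event_id, occurred_at, payload
FROM events
WHERE entity_id = $1
  AND occurred_at >= $2 AND occurred_at < $3
ORDER BY occurred_at;

-- Stream read: "give me the next 100 events after my cursor"
SELECT event_id, occurred_at, payload
FROM events
WHERE stream_id = $1
  AND (occurred_at, event_id) > ($2, $3)
ORDER BY occurred_at, event_id
LIMIT 100;
```

The first query cares about a window; the second cares about a starting position and order. The rest of the chapter builds indexes for each.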

Core index building blocks for temporal workloads

B-tree for ordered scans and equality + range

B-tree indexes are the default choice for temporal access because they support ordered scans and efficient range conditions. They are ideal when your predicates look like:

  • occurred_at >= $from AND occurred_at < $to
  • entity_id = $id AND occurred_at >= $from
  • stream_id = $stream AND (occurred_at, event_id) > ($cursor_time, $cursor_id)

They also support index-only scans when the query can be satisfied from the index plus the visibility map (more on that below).

BRIN for large, append-heavy time series

BRIN (Block Range INdex) is designed for very large tables where rows are physically correlated with the indexed column (commonly time). BRIN is tiny and cheap to maintain, and it can prune large portions of the table for time-range queries. It is not as precise as B-tree; it narrows down which heap blocks might contain matching rows, then PostgreSQL checks those blocks.

BRIN is often a good fit when:

  • Data is inserted in roughly increasing occurred_at order.
  • Queries are mostly time-window scans over large ranges.
  • You want minimal write overhead.

BRIN is less ideal for tight “next page after cursor” reads where you need exact ordering and small page sizes; B-tree usually wins there.

GiST/SP-GiST for range types (valid time)

If you model validity as a range (for example valid_during tstzrange), GiST indexes can accelerate “contains” and “overlaps” queries. This is common in bitemporal or slowly-changing facts where you ask “which version was valid at time T?”

Typical predicate: valid_during @> $as_of_time
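A minimal sketch, assuming a hypothetical facts table whose validity is modeled as a tstzrange column named valid_during:

```sql
CREATE INDEX CONCURRENTLY facts_valid_during_gist
  ON facts USING gist (valid_during);

-- "Which row was valid at time T?"
SELECT *
FROM facts
WHERE valid_during @> $1::timestamptz;
```

The @> (contains) operator is one of the range operators GiST can accelerate; && (overlaps) works the same way.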

Indexing for “read the stream” queries

Define a stable cursor and order

Stream reads need a stable, total order. In practice you usually order by a time column plus a tie-breaker that is unique and monotonic enough for your stream. A common pattern is:

  • ORDER BY occurred_at, event_id

Where event_id is unique (often a bigserial or generated id). Even if timestamps collide, the pair gives a strict order.

Create a composite B-tree index matching the order

For a stream read that filters by stream key (for example stream_id) and then orders by time and id, create an index that starts with the filter columns and then the order columns:

CREATE INDEX CONCURRENTLY events_stream_cursor_idx
  ON events (stream_id, occurred_at, event_id);

This index supports:

  • Finding the first row for a given stream quickly.
  • Continuing from a cursor using a multi-column comparison.
  • Returning rows already in the desired order (no sort step).

Use keyset pagination (seek method), not OFFSET

OFFSET pagination forces PostgreSQL to scan and discard rows, which becomes slower as you move forward. Keyset pagination uses the last seen key as a cursor:

SELECT event_id, occurred_at, event_type, payload
FROM events
WHERE stream_id = $1
  AND (occurred_at, event_id) > ($2, $3)
ORDER BY occurred_at, event_id
LIMIT 500;

With the composite index above, PostgreSQL can do an index range scan that starts exactly at the cursor position.

Step-by-step: verify the plan uses the index

1) Ensure the index exists and is valid.

\d+ events

2) Run the query with EXPLAIN (ANALYZE, BUFFERS) in a staging environment.

EXPLAIN (ANALYZE, BUFFERS)
SELECT event_id, occurred_at, event_type
FROM events
WHERE stream_id = 42
  AND (occurred_at, event_id) > ('2026-01-01T00:00:00Z', 1000000)
ORDER BY occurred_at, event_id
LIMIT 500;

3) Look for an Index Scan or Index Only Scan on events_stream_cursor_idx and confirm there is no explicit Sort node. Also check buffers: a good plan for “next page” reads should touch few heap pages.

Covering indexes for stream reads (INCLUDE)

If your consumer needs a small set of columns, you can reduce heap access by using an index that “covers” the query. PostgreSQL supports INCLUDE columns that are stored in the index leaf pages but not part of the sort key.

CREATE INDEX CONCURRENTLY events_stream_cursor_cover_idx
  ON events (stream_id, occurred_at, event_id)
  INCLUDE (event_type);

Now a query that selects only event_id, occurred_at, and event_type can often be served via an index-only scan, provided the visibility map is sufficiently set (which depends on vacuuming and how recently pages were modified).

Practical guidance:

  • Keep INCLUDE columns small and frequently read.
  • Avoid including large payload columns; it bloats the index and increases write cost.
  • Measure: index-only scans are not guaranteed; they depend on MVCC visibility.

Partial indexes for “hot” subsets

Stream consumers often read only recent events (for example, last 7 days) or only a subset of event types. A partial index can speed up those reads while keeping index size small.

CREATE INDEX CONCURRENTLY events_recent_stream_idx
  ON events (stream_id, occurred_at, event_id)
  WHERE occurred_at >= '2026-01-01';  -- fixed boundary: now() is not immutable and is rejected in index predicates

Notes:

  • Partial indexes are most effective when the predicate is stable and the query repeats it exactly.
  • now() is not immutable, so PostgreSQL rejects it in an index predicate. Use a fixed boundary that you update periodically (for example, recreate the index monthly), or base the predicate on a persisted “bucket” column (like occurred_date).
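One way to get a stable predicate is such a persisted bucket column. A sketch, with illustrative names, using an immutable UTC-based expression for the generated column:

```sql
ALTER TABLE events
  ADD COLUMN occurred_date date
  GENERATED ALWAYS AS ((timezone('UTC', occurred_at))::date) STORED;

CREATE INDEX CONCURRENTLY events_recent_stream_bucket_idx
  ON events (stream_id, occurred_at, event_id)
  WHERE occurred_date >= DATE '2026-01-01';
```

Queries must repeat the predicate (occurred_date >= ...) for the planner to consider the partial index.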

Indexing for time-window analytics and temporal filters

Time-range scans with B-tree

For queries that filter by time window and then group or aggregate, a simple B-tree on occurred_at can help, but it may not be selective enough if the window is large. Still, it can be valuable when combined with another filter (like event_type or entity_id).

CREATE INDEX CONCURRENTLY events_time_idx ON events (occurred_at);

Better for common patterns:

CREATE INDEX CONCURRENTLY events_type_time_idx ON events (event_type, occurred_at);

This supports queries like:

SELECT date_trunc('hour', occurred_at) AS hour, count(*)
FROM events
WHERE event_type = 'payment_authorized'
  AND occurred_at >= $1 AND occurred_at < $2
GROUP BY 1
ORDER BY 1;

BRIN for wide windows and massive tables

When the table is huge and inserts are append-like, BRIN can be the difference between scanning everything and scanning only the relevant block ranges.

CREATE INDEX CONCURRENTLY events_occurred_at_brin
  ON events USING brin (occurred_at)
  WITH (pages_per_range = 64);

pages_per_range controls granularity. Smaller values make the index more precise but larger; larger values make it smaller but less selective. A practical step-by-step approach:

1) Start with a moderate value like 64 or 128.

2) Run representative time-window queries with EXPLAIN (ANALYZE, BUFFERS).

3) If too many heap blocks are visited, reduce pages_per_range.

4) If the BRIN index is large relative to expectations, increase pages_per_range.

5) Periodically run:

VACUUM (ANALYZE) events;  -- and run brin_summarize_new_values('events_occurred_at_brin') if autosummarize is off

BRIN relies on summaries; if data arrives out of order, BRIN may become less effective because block ranges contain mixed time values.

Composite indexes for “entity over time” lookups

A frequent temporal query is: “show me all changes for entity X in a time window.” Index for that by placing the equality column first:

CREATE INDEX CONCURRENTLY events_entity_time_idx
  ON events (entity_id, occurred_at, event_id);

This supports both:

  • Time-window scans for a single entity.
  • Cursor-based reads per entity (if you treat entity as a stream).

Be careful about creating too many similar composite indexes. Each additional index increases write amplification. Prefer indexes that serve multiple critical queries.

Indexing “as-of” queries with range validity

If you store versions with a validity range, you can accelerate “as-of time” lookups using GiST. Example table:

CREATE TABLE entity_versions (
  entity_id bigint NOT NULL,
  valid_during tstzrange NOT NULL,
  version_id bigint NOT NULL,
  data jsonb NOT NULL,
  PRIMARY KEY (entity_id, version_id)
);

Create a GiST index on the range, typically combined with entity_id either via a multicolumn GiST (which requires the btree_gist extension for scalar columns like entity_id) or via separate indexes, depending on query shape. A common approach is:

CREATE INDEX CONCURRENTLY entity_versions_valid_gist
  ON entity_versions USING gist (valid_during);

CREATE INDEX CONCURRENTLY entity_versions_entity_btree
  ON entity_versions (entity_id);

Query:

SELECT *
FROM entity_versions
WHERE entity_id = $1
  AND valid_during @> $2
ORDER BY lower(valid_during) DESC
LIMIT 1;

Practical note: ordering by range bounds may still require extra work; if you frequently need “latest valid at time T,” consider also storing a generated column for valid_from and indexing (entity_id, valid_from DESC) while still using the range for correctness checks.
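A sketch of that approach (column and index names are illustrative); lower() on a range is immutable, so it works in a generated column:

```sql
ALTER TABLE entity_versions
  ADD COLUMN valid_from timestamptz
  GENERATED ALWAYS AS (lower(valid_during)) STORED;

CREATE INDEX CONCURRENTLY entity_versions_entity_from_idx
  ON entity_versions (entity_id, valid_from DESC);

-- "Latest version valid at time T" driven by the B-tree,
-- with the range check kept for correctness:
SELECT *
FROM entity_versions
WHERE entity_id = $1
  AND valid_from <= $2
  AND valid_during @> $2
ORDER BY valid_from DESC
LIMIT 1;
```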

Handling ties, clock skew, and “same timestamp” bursts

Temporal indexing often assumes time is strictly increasing, but real systems produce bursts with identical timestamps and sometimes out-of-order arrivals. Indexing should still guarantee deterministic reads:

  • Always include a tie-breaker in the ordering and index (for example event_id).
  • When consumers rely on “read after cursor,” use the same tuple ordering in both the query and the index.
  • If you have a separate ingestion time and event time, decide which one drives stream reads. Many systems read by ingestion order for operational consumers and use event time for analytics.

Example of ingestion-ordered stream read:

CREATE INDEX CONCURRENTLY events_stream_ingest_idx
  ON events (stream_id, ingested_at, event_id);

Then read with (ingested_at, event_id) as the cursor key.
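The corresponding page read mirrors the event-time version, only with the ingestion columns as the cursor:

```sql
SELECT event_id, ingested_at, event_type
FROM events
WHERE stream_id = $1
  AND (ingested_at, event_id) > ($2, $3)
ORDER BY ingested_at, event_id
LIMIT 500;
```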

Index maintenance considerations for high write rates

Write amplification and choosing the minimum viable set

Every index must be updated on insert, and potentially on update. For append-heavy event tables, inserts dominate, so the cost is predictable but can be large. A practical approach to index selection:

  • List your top 3–5 critical query patterns (stream reads, entity timeline, time-window aggregation, etc.).
  • Create one index per pattern, but try to reuse one composite index for multiple patterns.
  • Prefer BRIN for broad time-window filtering when B-tree would be huge and not selective enough.
  • Avoid indexing large payload fields; instead index small extracted columns that represent the filter keys.

Fillfactor and page splits

B-tree indexes can suffer from page splits if inserts land in the middle of the index order. If your index order is mostly increasing (time plus increasing id), page splits are reduced. If you index on a column with random distribution (like UUID) as the leading key, you can get heavy churn.

If you must use a random leading key, consider:

  • Using a different leading key that matches the access pattern.
  • Setting a lower fillfactor on the index to leave room for inserts (trade-off: larger index).

CREATE INDEX CONCURRENTLY events_stream_cursor_fill_idx
  ON events (stream_id, occurred_at, event_id)
  WITH (fillfactor = 70);  -- below the B-tree default of 90

VACUUM, visibility map, and index-only scans

Index-only scans are attractive for stream reads, but they depend on the visibility map indicating that heap pages contain only tuples visible to all transactions. If the table is constantly receiving inserts, recent pages may not be marked all-visible yet. Practical steps:

  • Ensure autovacuum is tuned to keep up with insert volume.
  • Expect index-only scans to work better for slightly older data than for the newest pages.
  • Measure with EXPLAIN (ANALYZE, BUFFERS): if you see many heap fetches, the index is not fully “covering” in practice.
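Autovacuum can be tuned per table; a sketch using the insert-driven settings available since PostgreSQL 13 (the values are illustrative starting points, not recommendations):

```sql
ALTER TABLE events SET (
  autovacuum_vacuum_insert_scale_factor = 0.05,  -- vacuum after 5% new rows
  autovacuum_vacuum_insert_threshold = 10000     -- plus a fixed floor
);
```

More frequent vacuums keep the visibility map current, which is what makes index-only scans actually skip the heap.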

Designing indexes around common temporal query templates

Template 1: “Events for a stream between two times”

Query:

SELECT event_id, occurred_at, event_type
FROM events
WHERE stream_id = $1
  AND occurred_at >= $2 AND occurred_at < $3
ORDER BY occurred_at, event_id;

Index:

CREATE INDEX CONCURRENTLY events_stream_time_idx
  ON events (stream_id, occurred_at, event_id);

Template 2: “Next page after cursor”

Query:

SELECT event_id, occurred_at, event_type
FROM events
WHERE stream_id = $1
  AND (occurred_at, event_id) > ($2, $3)
ORDER BY occurred_at, event_id
LIMIT $4;

Index: same as Template 1. This reuse is a key reason to prefer the composite index.

Template 3: “Count events by type in a time window”

Query:

SELECT event_type, count(*)
FROM events
WHERE occurred_at >= $1 AND occurred_at < $2
GROUP BY event_type;

Index options:

  • If the window is narrow and you frequently filter by type too: (event_type, occurred_at).
  • If the window is wide and the table is huge: BRIN on occurred_at plus possibly a separate B-tree on event_type (planner may bitmap-and them).

CREATE INDEX CONCURRENTLY events_type_time_idx ON events (event_type, occurred_at);

CREATE INDEX CONCURRENTLY events_time_brin ON events USING brin (occurred_at);

Template 4: “Latest event per entity as of time T”

When you need the most recent event before a timestamp for an entity, you can use a descending index to reduce scanning:

CREATE INDEX CONCURRENTLY events_entity_time_desc_idx
  ON events (entity_id, occurred_at DESC, event_id DESC);

Query:

SELECT event_id, occurred_at, payload
FROM events
WHERE entity_id = $1
  AND occurred_at <= $2
ORDER BY occurred_at DESC, event_id DESC
LIMIT 1;

This pattern is common in “as-of” reconstruction where you want the last known change before a point in time.

Diagnosing slow temporal queries: a practical checklist

1) Confirm the predicate matches the index

PostgreSQL can only use the leftmost prefix of a B-tree index efficiently. If your index is (stream_id, occurred_at, event_id) but your query filters only by occurred_at, it may not use that index as expected. Either add a separate time index (B-tree or BRIN) or rewrite queries to include the leading key when appropriate.

2) Avoid functions on indexed columns in WHERE

Predicates like date_trunc('day', occurred_at) = $day prevent direct use of a plain index on occurred_at. Prefer range predicates:

WHERE occurred_at >= $day_start AND occurred_at < $day_start + interval '1 day'

If you truly need function-based filtering, consider an expression index:

CREATE INDEX CONCURRENTLY events_occurred_day_idx
  ON events (date_trunc('day', occurred_at, 'UTC'));  -- three-argument form (PostgreSQL 14+)

Note: if occurred_at is timestamptz, the two-argument date_trunc is not immutable (it depends on the session time zone) and cannot be indexed; use the three-argument form with an explicit zone, and make queries use the same expression so they match the index.

Use expression indexes sparingly; they add maintenance cost and can multiply quickly across variants.

3) Watch for sorts and large bitmap heap scans

For stream reads, a Sort node is a red flag: it means the index does not provide the required order. For time-window analytics, a bitmap heap scan can be fine, but if it touches a large fraction of the table, BRIN or different clustering may help.

4) Consider clustering for read-heavy timelines

If you frequently read by (stream_id, occurred_at), clustering the table on that index can improve locality, reducing random I/O:

CLUSTER events USING events_stream_time_idx;

Clustering is a one-time rewrite and does not maintain itself automatically. It is most useful when the table is relatively stable or when you can periodically recluster during maintenance windows.

Putting it together: a minimal, high-impact index set

For many event-driven systems, a pragmatic baseline for temporal and stream-read performance is:

  • Composite B-tree for stream reads: (stream_id, occurred_at, event_id)
  • Composite B-tree for entity timelines (if needed): (entity_id, occurred_at, event_id)
  • BRIN on occurred_at for broad time-window scans on very large tables
  • Optional covering index using INCLUDE for the most common consumer projection

From there, add partial or specialized indexes only when a query is both frequent and slow, and confirm the improvement with EXPLAIN (ANALYZE, BUFFERS) and production-like data volumes.

Now answer the exercise about the content:

When optimizing a stream read that fetches the next page after a cursor for a specific stream_id, which index strategy best supports stable ordering and efficient keyset pagination?

Stream reads need a stable total order and fast reads after a cursor. A composite B-tree beginning with the filter key and then the ordered cursor keys lets PostgreSQL do an ordered index range scan without an extra sort step.

Next chapter

Querying Historical State and Point-in-Time Views

Event-Driven Data Modeling with PostgreSQL: Designing for Streams, Time, and Change