All courses > Technology and Programming > Databases ::

Indexing 101: Common Performance Pitfalls That Defeat Indexes

Capítulo 8

Estimated reading time: 11 minutes

Why “Good Indexes” Still Produce Slow Queries

Indexes speed up queries when the database can use them to quickly narrow down rows (seek) and then read only what’s needed. Many slow queries aren’t caused by missing indexes—they’re caused by query patterns that make existing indexes unusable or less attractive than a scan. This chapter focuses on common anti-patterns that defeat indexes and how to rewrite queries so the optimizer can apply index-friendly filtering.

1) Non-sargable predicates (functions on indexed columns, implicit conversions)

A predicate is sargable when it can use an index to directly locate a range of keys (for example, col >= ... AND col < ...). Predicates become non-sargable when you apply a function to the indexed column or force the engine to transform the column value for every row before comparing.

Anti-pattern A: Wrapping an indexed column in a function

Scenario: You have an index on orders.created_at and want “orders from 2025-01-01”.

Before (non-sargable):

SELECT order_id, customer_id, created_at, total_amount FROM orders WHERE DATE(created_at) = DATE '2025-01-01';

What goes wrong: The database must compute DATE(created_at) for many rows, which typically prevents a direct index range seek on created_at.

Continue in our app.

Listen to the audio with the screen off.
Earn a certificate upon completion.
Over 5000 courses for you to explore!

Or continue reading below...

Download the app

After (sargable range):

SELECT order_id, customer_id, created_at, total_amount FROM orders WHERE created_at >= TIMESTAMP '2025-01-01 00:00:00' AND created_at <  TIMESTAMP '2025-01-02 00:00:00';

Step-by-step rewrite method:

Identify the function applied to the indexed column (DATE(created_at)).
Move the transformation to the constant side by expressing it as a range on the raw column.
Prefer half-open intervals (>= start and < next_start) to avoid time-of-day edge cases.

Anti-pattern B: Implicit conversion on the indexed column

Scenario: users.user_id is an integer and indexed. The application sends '123' as a string.

Before (often non-sargable):

SELECT user_id, email FROM users WHERE user_id = '123';

What goes wrong: Depending on rules, the engine may convert user_id to text for comparison, which can prevent an index seek on the numeric index.

After (typed parameter / typed literal):

SELECT user_id, email FROM users WHERE user_id = 123;

Step-by-step rewrite method:

Check the column type and ensure the predicate value is the same type.
Fix at the source: bind parameters with the correct type in application code.
If you must cast, cast the constant/parameter, not the column (so the column remains indexable).

2) Leading wildcards and patterns that prevent efficient seeks

Indexes on strings are most effective when the search can anchor at the beginning of the value. Patterns like %term or %term% don’t provide a starting point in the index, so the engine often scans many entries.

Anti-pattern: Leading wildcard in LIKE

Scenario: Index on customers.last_name. You want names containing “son”.

Before (scan-prone):

SELECT customer_id, last_name, first_name FROM customers WHERE last_name LIKE '%son%';

What goes wrong: The pattern can match anywhere, so the index can’t jump to a starting key range.

After (anchored prefix search):

SELECT customer_id, last_name, first_name FROM customers WHERE last_name LIKE 'son%';

Step-by-step rewrite method:

Decide whether the business requirement truly needs “contains” or if “starts with” is acceptable.
If “starts with” works, rewrite to 'prefix%' to enable a range seek.
If you truly need “contains”, consider adding a dedicated search structure (for example, a separate normalized token table) rather than expecting a B-tree index to solve substring search.

Anti-pattern: Pattern built dynamically that hides the prefix

Before (prefix not guaranteed):

SELECT product_id, sku FROM products WHERE sku LIKE CONCAT('%', :term, '%');

After (two-mode approach):

-- If term is intended as a prefix: SELECT product_id, sku FROM products WHERE sku LIKE CONCAT(:term, '%');

Practical guidance: If your UI has both “starts with” and “contains”, route them to different query paths. Don’t default everything to “contains” and then wonder why the index isn’t used.

3) Mismatched data types and collation/encoding surprises

Even when values look the same to humans, differences in data type, collation, or encoding can force conversions that block index usage or change comparison semantics. This often appears when joining tables, filtering with parameters, or comparing strings across different collations.

Anti-pattern A: Joining columns with different types

Scenario: orders.customer_id is an integer (indexed). customers.customer_id is stored as text (also indexed). Joining them forces conversion on one side.

Before (conversion on join key):

SELECT o.order_id, c.customer_name FROM orders o JOIN customers c ON o.customer_id = c.customer_id;

What goes wrong: If the engine must convert one join key per row, it may not be able to use an efficient index-based join strategy.

After (align types):

SELECT o.order_id, c.customer_name FROM orders o JOIN customers c ON o.customer_id = CAST(c.customer_id AS INTEGER);

Important note: The best fix is schema alignment (store IDs consistently). If you must cast, cast the side that is smaller or less frequently scanned, and validate that the cast is safe (no non-numeric values).

Anti-pattern B: Collation/encoding mismatch in comparisons

Scenario: users.email is indexed. A query compares it using a different collation or case-folding rule than the index was built for.

Before (forces collation transform):

SELECT user_id FROM users WHERE email COLLATE some_other_collation = :email;

After (match comparison rules to the indexed column):

SELECT user_id FROM users WHERE email = :email;

Step-by-step troubleshooting:

If a string predicate unexpectedly scans, check whether the comparison is case-insensitive/case-sensitive in a way that differs from the index’s rules.
Ensure parameters are sent in the same encoding and normalization form expected by the database.
Prefer storing and indexing a normalized form (for example, a lowercased email column) if your application requires case-insensitive lookups.

4) Selecting too many columns (hurting coverage)

Even when a filter can use an index, selecting many non-indexed columns can make the plan expensive because the engine must fetch the full rows for each match. If many rows match, those extra lookups can dominate runtime and I/O.

Anti-pattern: Using SELECT * for a filtered query

Scenario: Index on orders(customer_id, created_at). You only need a list view with a few fields, but the query selects everything.

Before (forces many row fetches):

SELECT * FROM orders WHERE customer_id = :cid AND created_at >= :start AND created_at < :end ORDER BY created_at DESC;

What goes wrong: The index can find matching keys quickly, but returning all columns often requires fetching the base table row for each match, which can be far more expensive than reading the index alone.

After (project only needed columns):

SELECT order_id, created_at, total_amount FROM orders WHERE customer_id = :cid AND created_at >= :start AND created_at < :end ORDER BY created_at DESC;

Step-by-step improvement process:

List the exact columns needed by the endpoint/report.
Remove SELECT * and return only those columns.
If you still need extra columns, consider whether a different query shape (two-step: fetch IDs then fetch details) reduces total work for your access pattern.

Anti-pattern: Fetching large text/blob columns unnecessarily

Before:

SELECT order_id, created_at, notes_blob FROM orders WHERE customer_id = :cid AND created_at >= :start;

After:

SELECT order_id, created_at FROM orders WHERE customer_id = :cid AND created_at >= :start;

Practical tip: Large columns can turn an otherwise fast index-driven query into a bandwidth and memory problem. Load them only on detail pages, not list pages.

5) OR conditions and complex predicates that change plan choices

OR and complex boolean logic can prevent the optimizer from using a single efficient index access path. Sometimes it will choose a scan because combining multiple index paths is estimated to be more expensive than scanning once.

Anti-pattern A: OR across different columns

Scenario: Index on users(email) and index on users(phone). You want to find a user by either email or phone.

Before (may scan or choose a suboptimal plan):

SELECT user_id, email, phone FROM users WHERE email = :login OR phone = :login;

After (split into two sargable queries):

SELECT user_id, email, phone FROM users WHERE email = :login UNION ALL SELECT user_id, email, phone FROM users WHERE phone = :login AND email <> :login;

Step-by-step rewrite method:

Separate each disjunct into its own query that can use its own index.
Combine results with UNION ALL (and deduplicate if needed).
Add a guard to avoid duplicates when the same value could match both predicates.

Anti-pattern B: OR with a parameter that sometimes disables filtering

Scenario: Optional filter: if :status is null, return all statuses.

Before (common but index-hostile):

SELECT order_id, status, created_at FROM orders WHERE (:status IS NULL OR status = :status) AND created_at >= :start AND created_at < :end;

What goes wrong: The optimizer must consider both cases at once; it may choose a plan that’s “okay” for all statuses but not great for a specific status, or it may avoid using the status index effectively.

After (two query shapes):

-- When :status is provided SELECT order_id, status, created_at FROM orders WHERE status = :status AND created_at >= :start AND created_at < :end; -- When :status is NULL SELECT order_id, status, created_at FROM orders WHERE created_at >= :start AND created_at < :end;

Practical step-by-step:

In application code, branch on whether the optional parameter is present.
Use a simpler query per case so the optimizer can pick a plan tailored to that case.
Keep each predicate sargable and avoid “always true” OR patterns.

Anti-pattern C: Complex expressions in predicates

Before (expression on column):

SELECT invoice_id FROM invoices WHERE (amount - discount) > 100;

After (rewrite to keep column raw when possible):

SELECT invoice_id FROM invoices WHERE amount > 100 + discount;

Note: Not every expression can be algebraically rearranged safely, but when it can, keeping the indexed column unwrapped helps index usage.

Parameter sensitivity (why different values want different plans)

Some queries behave very differently depending on the parameter values. For example, status = 'CANCELLED' might match 0.1% of rows, while status = 'SHIPPED' matches 60%. The “best” plan for a highly selective value is often an index seek with targeted row fetches; the “best” plan for a very common value may be a scan (or a different index) because fetching many rows via the index can be more expensive than reading sequentially.

Conceptual example

Same query shape:

SELECT order_id, created_at, total_amount FROM orders WHERE status = :status AND created_at >= :start AND created_at < :end;

Why it’s tricky:

If :status is rare, the engine benefits from using an index on status (possibly combined with created_at).
If :status is common, the engine may prefer a plan that reads more sequentially and filters, because random lookups become too costly.
If the database reuses a plan chosen for one parameter value, that reused plan can be inefficient for a different value.

Practical ways to reduce surprises (without engine-specific features)

Use different query shapes for different selectivity regimes: for example, route “rare statuses” and “common statuses” to different SQL statements if you know the distribution.
Avoid optional-filter OR patterns: they force a one-size-fits-all plan; branching into separate queries often stabilizes performance.
Keep predicates simple and sargable: it gives the optimizer more viable index-based options for each parameter value.
Test with representative parameter values: don’t benchmark only with a single “happy path” value.

Checklist: Index-friendly query rules

Don’t wrap indexed columns in functions in the WHERE clause; rewrite as ranges on the raw column.
Match parameter types to column types; avoid implicit conversions, especially on join keys and filtered columns.
Prefer anchored patterns ('prefix%') over leading-wildcard patterns ('%term', '%term%') when you want index seeks.
Keep collation/encoding/comparison rules consistent with how the indexed column is stored and queried.
Avoid SELECT *; project only the columns you need, especially on list endpoints.
Be cautious selecting large text/blob columns; fetch them separately when needed.
Rewrite OR across different columns into separate queries combined with UNION ALL when appropriate.
Handle optional filters by generating different SQL statements rather than (:p IS NULL OR col = :p).
When possible, rewrite expressions to keep the indexed column unmodified (move arithmetic to the constant side).
Test performance with multiple parameter values; the best plan for one value may be poor for another.

Now answer the exercise about the content:

Which rewrite is most likely to keep an index on a timestamp column usable when filtering for all rows on a specific date?

You are right! Congratulations, now go to the next page

You missed! Try again.

Applying functions to an indexed column often makes the predicate non-sargable. Using a half-open range keeps the column unmodified, enabling an efficient index range seek and avoiding time-of-day edge cases.