What “Performance” Means in a Django ORM Backend
Most backend slowdowns in Django apps come from database work: too many queries, queries that fetch too much data, missing indexes, or repeated work that could be cached. The Django ORM makes it easy to write correct code quickly, but it can also hide expensive query patterns unless you inspect what’s actually happening.
In this chapter you’ll learn to recognize common bottlenecks, fix them with ORM tools like select_related() and prefetch_related(), add indexes for common filters, and use a repeatable workflow to measure improvements and prevent regressions.
Core Bottleneck #1: N+1 Queries
What N+1 Looks Like
N+1 happens when you run 1 query to fetch a list of objects, then run an additional query per object to fetch related data. It often appears in list endpoints, admin-like pages, or API responses that include related fields.
Example scenario: you list Order objects and show each order’s customer.email. If Order.customer is a foreign key, accessing order.customer without eager loading can trigger one query per order.
# Bad: likely N+1 if you later access order.customer in a loop/serializer/template
orders = Order.objects.filter(status="paid")
for order in orders:
    print(order.customer.email)

Fix N+1 with select_related (ForeignKey / OneToOne)
Use select_related() when the relationship is single-valued (ForeignKey, OneToOne). It performs a SQL JOIN and fetches related rows in the same query.
# Good: 1 query total (typically)
orders = (
    Order.objects
    .filter(status="paid")
    .select_related("customer")
)
for order in orders:
    print(order.customer.email)

Step-by-step workflow:
- Identify where you iterate over a queryset and access a related field.
- Determine the relationship type: ForeignKey/OneToOne → select_related.
- Add .select_related(...) for the needed relation(s).
- Re-check the query count (see “Query Inspection” below).
Fix N+1 with prefetch_related (ManyToMany / reverse FK)
Use prefetch_related() when the relationship is multi-valued (ManyToMany, reverse ForeignKey). It runs a second query to fetch all related rows in bulk, then matches them in Python.
# Example: BlogPost has many Comment (reverse FK)
posts = BlogPost.objects.filter(published=True)
for post in posts:
    # Without prefetch_related, this can be N+1
    print(post.comment_set.count())

# Good: 2 queries total (posts + all comments)
posts = (
    BlogPost.objects
    .filter(published=True)
    .prefetch_related("comment_set")
)
for post in posts:
    print(len(post.comment_set.all()))

Note: calling .count() on a related manager can still hit the DB depending on context. If you need counts for many objects, consider annotating counts (see below) or using prefetched data carefully.
Advanced Prefetch: Prefetch Objects for Filtering/Ordering Related Data
Sometimes you want only a subset of related objects (e.g., only approved comments). Use Prefetch to control the related queryset and optionally store it on a custom attribute.
from django.db.models import Prefetch

posts = BlogPost.objects.prefetch_related(
    Prefetch(
        "comment_set",
        queryset=Comment.objects.filter(approved=True).order_by("-created_at"),
        to_attr="approved_comments",
    )
)
for post in posts:
    # Uses in-memory list; no extra queries
    print(len(post.approved_comments))

Core Bottleneck #2: Unnecessary Database Hits
Repeated Evaluation of the Same Queryset
Django querysets are lazy. Each time you evaluate them (iterate, list(), len(), bool(), slicing in some cases), you may trigger a query. A common issue is evaluating the same queryset multiple times in one request.
# Potentially runs the same query twice
qs = Order.objects.filter(status="paid")
if qs:
    for order in qs:
        ...

Better patterns:
# Option A: evaluate once
orders = list(Order.objects.filter(status="paid"))
if orders:
    for order in orders:
        ...

# Option B: use .exists() for the check, then fetch only if needed
qs = Order.objects.filter(status="paid")
if qs.exists():
    for order in qs:
        ...

Fetching Too Many Columns
If you only need a few fields, avoid loading entire model rows (especially if models have large text fields or many columns).
- only() loads a subset of fields but can cause extra queries if you later access deferred fields.
- values() / values_list() return dictionaries/tuples and can be efficient for read-only endpoints.
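For example, a minimal sketch of the only() pitfall (the bio field here is a hypothetical large column, not from the examples above):

customers = Customer.objects.only("id", "email")
for customer in customers:
    print(customer.email)  # already loaded: no extra query
    print(customer.bio)    # deferred field: triggers an extra query per row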
# Efficient for read-only list responses
rows = (
    Customer.objects
    .filter(active=True)
    .values_list("id", "email")
)

Using count() vs exists() vs len()
- .exists() is usually the cheapest way to check if any row matches.
- .count() runs COUNT(*); good when you truly need the count.
- len(qs) evaluates the queryset and loads rows; expensive for large sets.
# Good
if Order.objects.filter(status="paid").exists():
    ...

# Good when you need the number
paid_count = Order.objects.filter(status="paid").count()

Core Bottleneck #3: Inefficient Filtering and Missing Indexes
Recognize “Slow Filters”
Filters become slow when they scan many rows. Common causes:
- Filtering on non-indexed columns that are used frequently.
- Filtering on computed expressions (e.g., applying functions to columns) that prevent index usage (see the sketch after this list).
- Using icontains on large text fields without specialized indexing.
- Sorting (order_by) large result sets without indexes that support the ordering.
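A common instance of the computed-expression problem is date filtering on a DateTimeField. Whether the first form below can use a plain index depends on your database, but the half-open range is reliably index-friendly:

from datetime import timedelta
from django.utils import timezone

day_start = timezone.now().replace(hour=0, minute=0, second=0, microsecond=0)

# May defeat a plain index on created_at: __date casts the column to a date
todays_orders = Order.objects.filter(created_at__date=day_start.date())

# Index-friendly rewrite: compare the raw column against a half-open range
todays_orders = Order.objects.filter(
    created_at__gte=day_start,
    created_at__lt=day_start + timedelta(days=1),
)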
Add Indexes for Common Query Patterns
Indexes are a database feature, but Django lets you define them in models. Add indexes for fields frequently used in filter(), exclude(), order_by(), and joins.
Examples:
from django.db import models

class Order(models.Model):
    status = models.CharField(max_length=20, db_index=True)
    created_at = models.DateTimeField(db_index=True)
    customer = models.ForeignKey("Customer", on_delete=models.CASCADE)

    class Meta:
        indexes = [
            models.Index(fields=["status", "created_at"]),
        ]

Step-by-step workflow for indexes:
- List your slow endpoints and the exact ORM filters/sorts they use.
- Identify the most common WHERE clauses and ORDER BY fields.
- Add db_index=True for single-field patterns; use Meta.indexes for multi-column patterns.
- Create and apply migrations.
- Verify improvement by timing and by checking the query plan (see below).
Use explain() to Inspect Query Plans
Django supports QuerySet.explain() to ask the database how it will execute a query. This is one of the fastest ways to confirm whether an index is being used.
qs = Order.objects.filter(status="paid").order_by("-created_at")
print(qs.explain())

Look for signs like “Index Scan” (good) versus “Seq Scan” / “Full Table Scan” (often a sign you need an index or a different query shape).
Query Inspection: Seeing What the ORM Actually Does
Print the SQL for a Queryset
When debugging, you can inspect the SQL generated by the ORM:
qs = Order.objects.filter(status="paid").select_related("customer")
print(str(qs.query))

This helps you verify JOINs, WHERE clauses, and ordering.
Django Debug Toolbar (Development)
In development, Django Debug Toolbar is a common tool to inspect:
- Number of SQL queries per request
- Time spent per query
- Duplicate queries
- Stack traces showing where queries originate
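A typical development-only setup, abridged from the package’s documented install steps (check the django-debug-toolbar docs for your version):

# settings.py (development only)
INSTALLED_APPS = [
    # ...
    "debug_toolbar",
]
MIDDLEWARE = [
    "debug_toolbar.middleware.DebugToolbarMiddleware",
    # ...
]
INTERNAL_IPS = ["127.0.0.1"]

# urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    path("__debug__/", include("debug_toolbar.urls")),
]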
Practical usage checklist:
- Load the slow page/endpoint in development.
- Open the SQL panel and sort by time.
- Look for repeated similar queries (classic N+1).
- Apply select_related / prefetch_related and re-check.
- Confirm query count and total SQL time decreased.
Capture Queries in Tests (Regression Control)
To prevent performance regressions, add tests that assert query counts for critical code paths.
from django.test import TestCase

class OrderListPerfTests(TestCase):
    def test_order_list_query_count(self):
        with self.assertNumQueries(2):
            # Example: 1 query for orders + 1 for prefetched items
            response = self.client.get("/orders/")
        self.assertEqual(response.status_code, 200)

Use this sparingly and only for endpoints where query count is stable and meaningful. If query counts vary across databases or feature flags, assert an upper bound by measuring and setting a reasonable threshold.
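One way to assert an upper bound rather than an exact count is Django’s CaptureQueriesContext; the budget of 5 below is an arbitrary placeholder you would calibrate by measuring:

from django.db import connection
from django.test import TestCase
from django.test.utils import CaptureQueriesContext

class OrderListQueryBudgetTests(TestCase):
    def test_order_list_stays_under_query_budget(self):
        with CaptureQueriesContext(connection) as ctx:
            response = self.client.get("/orders/")
        self.assertEqual(response.status_code, 200)
        # Fail only if the endpoint exceeds the measured budget
        self.assertLessEqual(len(ctx.captured_queries), 5)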
ORM Patterns That Commonly Cause Hidden Slowness
Annotate for Aggregates Instead of Per-Row Queries
If you need counts/sums per object, avoid calling related queries inside loops. Use annotate() to compute aggregates in SQL.
from django.db.models import Count

posts = (
    BlogPost.objects
    .filter(published=True)
    .annotate(comment_count=Count("comment"))
)
for post in posts:
    print(post.comment_count)

Be Careful with distinct()
distinct() can be expensive, especially combined with joins and ordering. If you’re using it to “fix duplicates” introduced by joins, consider whether you should restructure the query (e.g., prefetch instead of join, or query the primary table first).
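A sketch of the restructure, reusing the BlogPost/Comment models from earlier. The two shapes are not identical (the join form returns only posts that have approved comments, while the prefetch form keeps all published posts), but when you mainly need the related data, the prefetch shape avoids duplicate rows entirely:

from django.db.models import Prefetch

# Join-based filter: one joined row per matching comment, so posts repeat
posts = BlogPost.objects.filter(comment__approved=True).distinct()

# Often cheaper: query the primary table once, prefetch filtered comments in bulk
posts = BlogPost.objects.filter(published=True).prefetch_related(
    Prefetch("comment_set", queryset=Comment.objects.filter(approved=True))
)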
Large IN Clauses and Bulk Operations
Filtering with large lists (id__in=[...]) can become slow. If you’re processing many rows, prefer bulk operations:
- bulk_create for inserts
- bulk_update for updates
- update() for set-based updates
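For example, a sketch of the first two (customer_ids is a hypothetical list of IDs):

# Insert many rows in one query (or a few, batched)
Order.objects.bulk_create(
    [Order(status="pending", customer_id=cid) for cid in customer_ids],
    batch_size=500,
)

# Update chosen fields on many already-fetched instances in batches
orders = list(Order.objects.filter(status="pending"))
for order in orders:
    order.status = "expired"
Order.objects.bulk_update(orders, ["status"], batch_size=500)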
# Set-based update (single query)
Order.objects.filter(status="pending", created_at__lt=cutoff).update(status="expired")

Caching Basics: When and How to Use It
Caching helps when you repeatedly compute the same result (often for read-heavy endpoints). It does not fix inefficient queries by itself; first remove N+1 and add indexes, then cache stable results.
Per-View Caching
Per-view caching stores the entire response for a period of time. It’s useful for endpoints that are identical for many users (or vary only by a small set of parameters).
from django.views.decorators.cache import cache_page

@cache_page(60)  # seconds
def product_catalog(request):
    ...
...Be careful with personalization and permissions. If the response depends on the logged-in user, you must vary the cache key appropriately (or avoid per-view caching).
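If you do need per-user variation, one option (mirroring the pattern in Django’s docs) is to vary the cache on the session cookie; the dashboard view here is a hypothetical example:

from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie

@cache_page(60)
@vary_on_cookie  # separate cache entries per cookie (e.g., per session)
def dashboard(request):
    ...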
Low-Level Caching (Cache API)
Low-level caching lets you cache specific expensive computations, such as a list of IDs, a computed summary, or a rendered fragment.
from django.core.cache import cache

CACHE_KEY = "top_customers:v1"

def get_top_customers():
    data = cache.get(CACHE_KEY)
    if data is not None:
        return data
    data = list(
        Customer.objects
        .filter(active=True)
        .order_by("-lifetime_value")
        .values_list("id", "email")[:100]
    )
    cache.set(CACHE_KEY, data, timeout=300)
    return data

Practical tips:
- Version your keys (e.g., :v1) so you can invalidate by changing the version.
- Cache data that is expensive to compute and relatively stable.
- Keep cached payloads small; cache IDs or summaries rather than huge objects.
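Invalidation for the example above can be as simple as deleting the key when the underlying data changes, or bumping the version suffix so stale entries age out:

from django.core.cache import cache

# Explicit invalidation when customer data changes
cache.delete("top_customers:v1")

# Or roll the version: new reads/writes use the new key,
# and the old entry simply expires on its own
CACHE_KEY = "top_customers:v2"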
Pagination: When to Consider It (and What It Fixes)
Pagination reduces response size and database work by limiting rows returned. Consider it when:
- List endpoints can return hundreds/thousands of rows.
- Clients only need the first page most of the time.
- Sorting large datasets is slow.
Typical ORM pattern:
PAGE_SIZE = 50

page = int(request.GET.get("page", 1))
offset = (page - 1) * PAGE_SIZE

qs = Order.objects.order_by("-created_at")
orders = qs[offset:offset + PAGE_SIZE]

Note: offset pagination can become slower for very large offsets. If you hit that problem, consider keyset pagination (e.g., “created_at < last_seen”) for better scalability.
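A minimal keyset sketch, assuming the client echoes back the created_at of the last row it received (the last_seen parameter is hypothetical):

from django.utils.dateparse import parse_datetime

PAGE_SIZE = 50

last_seen = parse_datetime(request.GET["last_seen"])
orders = (
    Order.objects
    .filter(created_at__lt=last_seen)
    .order_by("-created_at")[:PAGE_SIZE]
)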
A Repeatable Workflow to Improve Performance (and Keep It Improved)
1) Establish a Baseline
- Pick one slow endpoint or page.
- Measure: total response time, number of SQL queries, and total SQL time.
- Record the baseline in a short note (or in an issue tracker): “Before: 120 queries, 450ms SQL time”.
2) Identify the Biggest Cost First
- If query count is high: look for N+1 and duplicates.
- If query count is low but slow: inspect the slowest query and check indexes and query plans.
- If payload is huge: add pagination and reduce selected columns.
3) Apply Targeted Fixes
- N+1: add select_related / prefetch_related.
- Slow filters/sorts: add indexes; rewrite filters to be index-friendly.
- Repeated work: add low-level caching for expensive computations.
4) Re-measure and Compare
- Re-run the same request with the same data volume.
- Compare query count and timing to the baseline.
- If improvement is unclear, inspect SQL again and verify the ORM change actually affected the query.
5) Add Guardrails Against Regressions
- Add assertNumQueries tests for critical endpoints.
- Keep a small “performance checklist” in code review: “Any loops over querysets with related field access? Any new filters without indexes?”
- Use development query inspection tools regularly when changing list endpoints or serializers.
Quick Reference Table: Common Problems and Fixes
| Symptom | Likely Cause | Primary Fix |
|---|---|---|
| Dozens/hundreds of similar queries | N+1 on relations | select_related (FK/O2O), prefetch_related (M2M/reverse FK) |
| Single query is very slow | Missing index, expensive sort/filter | Add index, check explain(), rewrite query |
| Large response and slow rendering | Too many rows/columns | Pagination, values_list/values, limit fields |
| Same expensive computation repeated | No caching | Per-view or low-level caching with sensible keys/timeouts |
| Performance improves then regresses later | No measurement/guardrails | Baseline metrics + query-count tests + review checklist |