What “deployment readiness” means for an Express app
Deployment readiness is the set of behaviors that make your service predictable under real operational conditions: restarts, rolling deploys, load balancers, slow clients, partial outages, and misbehaving dependencies. In practice, it means your app can start safely, report whether it is ready to receive traffic, handle traffic within bounded resources (time, memory, payload size), and stop safely without corrupting work.
This chapter focuses on runtime and operational concerns that are easy to miss in development: graceful shutdown, process signals, timeouts, body size limits, trust proxy, avoiding in-memory state assumptions, health endpoints, startup checks, and safe handling of fatal errors.
Graceful shutdown: stop accepting work, finish what you can, then exit
In production, your process will be terminated intentionally (deployments, autoscaling) and sometimes unexpectedly. A graceful shutdown ensures you stop taking new requests, allow in-flight requests to complete (within a deadline), close keep-alive connections, and release resources (DB pools, message consumers) before exiting.
Step-by-step: implement a shutdown controller
- Track server and connections so you can close them.
- Stop accepting new connections with server.close().
- Signal “not ready” so load balancers stop routing traffic.
- Wait for in-flight requests up to a timeout; then force close.
- Close external resources (DB pool, Redis, queues) before exit.
import http from 'node:http';
import express from 'express';
const app = express();
// --- Readiness state (used by /readyz) ---
let isReady = false;
// Track in-flight requests
let inFlight = 0;
app.use((req, res, next) => {
  inFlight++;
  // 'close' fires whether the response finished normally or the client aborted
  res.on('close', () => { inFlight--; });
  next();
});
// Basic endpoints (see later sections)
app.get('/livez', (req, res) => res.status(200).send('ok'));
app.get('/readyz', (req, res) => {
  if (!isReady) return res.status(503).send('not ready');
  res.status(200).send('ready');
});
const server = http.createServer(app);
// Track open sockets so we can force-close after a deadline
const sockets = new Set();
server.on('connection', (socket) => {
  sockets.add(socket);
  socket.on('close', () => sockets.delete(socket));
});
// Example: start after startup checks
async function start() {
  // await runStartupChecks();
  isReady = true;
  server.listen(process.env.PORT || 3000);
}
async function shutdown(signal, exitCode = 0) {
  console.log(`${signal} received, shutting down`);
  // Stop being "ready" immediately so traffic drains
  isReady = false;
  // Stop accepting new connections; the callback fires once every connection has closed
  server.close(() => {
    // All connections closed
  });
  // Close idle keep-alive sockets right away (Node 18.2+); active ones finish their request first
  if (typeof server.closeIdleConnections === 'function') {
    server.closeIdleConnections();
  }
  const deadlineMs = 15_000;
  const startedAt = Date.now();
  // Wait for in-flight requests to finish (bounded)
  while (inFlight > 0 && Date.now() - startedAt < deadlineMs) {
    await new Promise((r) => setTimeout(r, 100));
  }
  // Force-close anything still open
  for (const socket of sockets) {
    socket.destroy();
  }
  // Close external resources here (db, redis, consumers)
  // await db.close();
  process.exit(exitCode);
}
process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
start();
Operational note: choose a shutdown deadline that matches your platform’s termination grace period (e.g., Kubernetes terminationGracePeriodSeconds) and your longest acceptable request duration.
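For example, the hard-coded deadline above could be made configurable and sized against the platform’s grace period; SHUTDOWN_DEADLINE_MS is a hypothetical variable name, not a platform convention:
// Example sizing (assumption): 30s grace period -> flip readiness immediately,
// drain in-flight requests for ~20s, leave the rest for closing pools/consumers.
const deadlineMs = Number(process.env.SHUTDOWN_DEADLINE_MS ?? 20_000); // hypothetical env var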
Stop accepting new work at the application level
Even after server.close(), requests can still arrive briefly on connections that are already open (keep-alive), and background work may still be running. A simple “draining” middleware can reject new requests once shutdown begins.
let isDraining = false;
function drainingGuard(req, res, next) {
  if (!isDraining) return next();
  res.set('Connection', 'close');
  return res.status(503).json({ error: 'server is restarting' });
}
app.use(drainingGuard);
async function shutdown() {
  isDraining = true;
  // ...then close server, wait, etc.
}
Handling process signals correctly
Most orchestrators send SIGTERM to request a clean stop. Developers often only handle SIGINT (Ctrl+C), which is not enough in production. Handle at least SIGTERM and SIGINT. Avoid doing heavy work directly inside the signal handler; call an async shutdown routine and guard against multiple invocations.
let shuttingDown = false;
async function shutdownOnce(signal) {
  if (shuttingDown) return;
  shuttingDown = true;
  await shutdown(signal);
}
process.on('SIGTERM', () => shutdownOnce('SIGTERM'));
process.on('SIGINT', () => shutdownOnce('SIGINT'));
If you run Node behind a process manager, ensure it forwards signals to the Node process and does not kill it abruptly.
Timeouts: bound resource usage and prevent stuck requests
Timeouts protect your service from slow clients, slow upstream dependencies, and accidental “infinite” requests. You typically need multiple layers:
- Server timeouts (Node HTTP): headers, keep-alive, request time.
- Application timeouts: per-request deadline enforced in middleware.
- Dependency timeouts: database/HTTP client timeouts (outside Express, but critical).
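The third layer lives in your dependency clients rather than in Express. As one illustration, a minimal sketch using Node’s built-in fetch with AbortSignal.timeout (Node 18+ assumed); the upstream URL and the 2-second budget are placeholders:
// Bound an upstream HTTP call to 2 seconds; the call rejects if the timeout elapses
async function fetchUpstreamProfile(userId) {
  const res = await fetch(`https://upstream.internal/profiles/${userId}`, {
    signal: AbortSignal.timeout(2_000),
  });
  if (!res.ok) throw new Error(`upstream responded ${res.status}`);
  return res.json();
}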
Node HTTP server timeouts (practical defaults)
These settings help with slowloris-style behavior and hung connections. Values vary by workload; the key is to set them explicitly rather than relying on defaults.
const server = http.createServer(app);
// How long to wait for the complete request headers
server.headersTimeout = 10_000;
// Should be >= headersTimeout; limits the total time to receive the entire request
server.requestTimeout = 30_000;
// Keep-alive idle timeout for persistent connections
server.keepAliveTimeout = 5_000;
Application-level request deadline middleware
This pattern enforces a maximum time per request and ensures you don’t keep working after the client is gone. It’s especially useful for routes that call multiple dependencies.
function requestDeadline(ms) {
  return (req, res, next) => {
    const timer = setTimeout(() => {
      if (!res.headersSent) {
        // Ensure the connection is not kept alive after a timeout response
        res.set('Connection', 'close');
        res.status(503).json({ error: 'request timeout' });
      } else {
        // Headers already sent: end the response so it does not hang
        res.end();
      }
    }, ms);
    res.on('finish', () => clearTimeout(timer));
    res.on('close', () => clearTimeout(timer));
    next();
  };
}
app.use(requestDeadline(20_000));
Important: a timeout response does not automatically cancel in-flight work (e.g., DB query). Prefer dependency clients that support cancellation (AbortController) and pass a request-scoped abort signal when possible.
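One way to wire that up is a small middleware that owns a per-request AbortController, aborts it when the deadline passes or the client disconnects, and exposes the signal for downstream calls. This is a sketch rather than a standard Express API; res.locals.abortSignal and the upstream URL are names chosen here (Node 18+ assumed):
function requestAbort(ms) {
  return (req, res, next) => {
    const controller = new AbortController();
    // Abort when the deadline passes...
    const timer = setTimeout(() => controller.abort(new Error('deadline exceeded')), ms);
    // ...or when the client goes away / the response ends
    res.on('close', () => {
      clearTimeout(timer);
      controller.abort(new Error('response closed'));
    });
    res.locals.abortSignal = controller.signal;
    next();
  };
}
app.use(requestAbort(20_000));
app.get('/api/profile/:id', async (req, res, next) => {
  try {
    // Hypothetical upstream call; fetch rejects when the signal aborts
    const upstream = await fetch(`https://upstream.internal/profiles/${req.params.id}`, {
      signal: res.locals.abortSignal,
    });
    res.json(await upstream.json());
  } catch (err) {
    // The deadline middleware may already have responded; only forward if it has not
    if (!res.headersSent) next(err);
  }
});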
Body size limits: protect memory and parsing costs
Large request bodies can exhaust memory, increase GC pressure, and slow down parsing. Set explicit limits for JSON and URL-encoded payloads. Use different limits per route when needed (e.g., file uploads handled by streaming middleware).
Step-by-step: set global limits and override for specific routes
// Global defaults
app.use(express.json({ limit: '1mb' }));
app.use(express.urlencoded({ extended: true, limit: '100kb' }));
// Example: a route that legitimately needs more (still bounded)
app.post('/api/bulk-import', express.json({ limit: '5mb' }), (req, res) => {
  res.status(202).send('accepted');
});
Also consider limiting the number of parameters for URL-encoded bodies and query strings at the edge (reverse proxy) to reduce parser work.
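The urlencoded parser itself also accepts a parameterLimit option (passed through to the underlying body parser), which caps how many fields a single request may carry. A variant of the global registration above, with an illustrative value:
// Bound both payload size and the number of form fields per request
app.use(express.urlencoded({
  extended: true,
  limit: '100kb',
  parameterLimit: 200, // illustrative; the default is 1000
}));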
trust proxy: correct client IPs, protocol, and secure cookies behind a proxy
In production, Express is often behind a reverse proxy or load balancer that terminates TLS and forwards requests. If you don’t configure trust proxy, Express may treat the proxy as the client, leading to incorrect req.ip, wrong protocol detection (req.secure), and issues with secure cookies and redirects.
Practical configuration patterns
- Single proxy (common): app.set('trust proxy', 1)
- Specific subnets: app.set('trust proxy', '10.0.0.0/8')
- Platform-specific: some PaaS recommend true, but only do this if you fully trust the network path.
// Typical: one proxy hop (e.g., Nginx/ALB in front)
app.set('trust proxy', 1);
app.get('/debug/ip', (req, res) => {
  res.json({ ip: req.ip, ips: req.ips, secure: req.secure, protocol: req.protocol });
});
Security note: trusting proxies incorrectly can allow clients to spoof X-Forwarded-For and appear as arbitrary IPs. Keep the trust scope as narrow as possible.
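When you know the proxy’s addresses, narrow the trust scope further; trust proxy also accepts named ranges, individual addresses, and arrays (the subnet below is an example):
// Trust only loopback plus the load balancer subnet (example value)
app.set('trust proxy', ['loopback', '10.0.0.0/8']);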
Avoid stateful in-memory assumptions
Production deployments often run multiple instances and can restart at any time. Anything stored only in memory is:
- Not shared across instances (breaks consistency).
- Lost on restart (breaks reliability).
- Hard to scale (sticky sessions become a crutch).
Common pitfalls and safer alternatives
| In-memory approach | Failure mode | Better approach |
|---|---|---|
| Storing sessions in a JS object | Users randomly “log out” on restart; inconsistent across instances | External session store (e.g., Redis) or stateless tokens |
| In-memory rate limit counters | Limits reset on restart; uneven enforcement | Shared store-based counters or edge rate limiting |
| In-memory job queue | Jobs lost on crash; duplicates on retry | Durable queue/broker with acknowledgements |
| In-memory cache as source of truth | Stale/incorrect data after redeploy | Cache as optimization only; backed by DB |
If you must keep in-memory caches, treat them as ephemeral and safe to drop. Ensure correctness does not depend on them.
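A sketch of “cache as optimization only”: a small TTL map in front of a database read, where losing the map on restart only costs a few extra queries (loadUserFromDb stands in for your real data access):
const userCache = new Map(); // ephemeral; safe to lose on restart or redeploy
const TTL_MS = 30_000;
async function getUser(id, loadUserFromDb) {
  const hit = userCache.get(id);
  if (hit && Date.now() - hit.at < TTL_MS) return hit.value;
  const value = await loadUserFromDb(id); // the database stays the source of truth
  userCache.set(id, { value, at: Date.now() });
  return value;
}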
Readiness and liveness endpoints
Health endpoints let orchestrators and load balancers make routing decisions:
- Liveness (/livez): “Is the process running and event loop responsive?” It should be cheap and not depend on external services.
- Readiness (/readyz): “Can this instance serve traffic correctly?” It can depend on critical dependencies (DB reachable, migrations applied), but must be fast and bounded.
Implement endpoints with clear semantics
let isReady = false;
app.get('/livez', (req, res) => {
  // Keep it simple: if we can respond, we are alive
  res.status(200).send('ok');
});
app.get('/readyz', async (req, res) => {
  if (!isReady) return res.status(503).send('starting');
  // Optional: lightweight dependency check with strict timeout
  try {
    await withTimeout(checkDbPing(), 500);
    res.status(200).send('ready');
  } catch {
    res.status(503).send('degraded');
  }
});
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  // Clear the timer either way so it does not linger after the race settles
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
async function checkDbPing() {
  // Example placeholder; implement using your DB client
  return true;
}
Tip: keep readiness checks lightweight. If every readiness probe performs a heavy query, you can overload your own dependencies during scaling events.
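One way to keep probes cheap is to reuse a recent check result instead of pinging the database on every probe. A sketch with a 5-second reuse window, built on the withTimeout and checkDbPing helpers above; /readyz could then await cachedDbCheck() instead of calling them directly:
let lastDbCheck = { ok: false, at: 0 };
async function cachedDbCheck() {
  // Reuse a result that is less than 5 seconds old
  if (Date.now() - lastDbCheck.at < 5_000) {
    if (!lastDbCheck.ok) throw new Error('db check failed recently');
    return;
  }
  try {
    await withTimeout(checkDbPing(), 500);
    lastDbCheck = { ok: true, at: Date.now() };
  } catch (err) {
    lastDbCheck = { ok: false, at: Date.now() };
    throw err;
  }
}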
Startup checks: fail fast before accepting traffic
A reliable service should detect misconfiguration and missing dependencies at startup, then refuse to become “ready” until checks pass. This prevents serving partial functionality and producing confusing errors.
Step-by-step: gate readiness on startup checks
- Validate required configuration (presence, format).
- Initialize dependencies (DB connection pool, cache client).
- Run critical checks (DB reachable, required tables/migrations present if applicable).
- Only then set readiness to true and start listening (or start listening but keep /readyz false until checks pass).
async function runStartupChecks() {
  // 1) Validate config (example)
  if (!process.env.DATABASE_URL) throw new Error('DATABASE_URL missing');
  // 2) Initialize dependencies
  // await db.connect();
  // 3) Verify critical dependency is reachable
  // await db.ping();
}
async function start() {
  await runStartupChecks();
  isReady = true;
  server.listen(process.env.PORT || 3000);
}
If startup checks fail, exit with a non-zero code so the orchestrator can restart or alert appropriately.
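A minimal way to enforce the non-zero exit from the entry point:
start().catch((err) => {
  console.error('startup failed', err);
  process.exit(1); // non-zero so the orchestrator can restart or alert
});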
Uncaught exceptions and unhandled rejections: crash safely, don’t limp
Some errors indicate your process is in an unknown state (e.g., programmer error, corrupted invariants). Continuing to serve traffic after an uncaught exception can cause data corruption, inconsistent responses, or security issues. The safer strategy is:
- Log the error with high severity.
- Stop accepting traffic (mark not ready, close server).
- Exit with non-zero code after a short grace period.
Practical pattern: fatal error handler that triggers shutdown
function installFatalHandlers() {
  process.on('uncaughtException', (err) => {
    // Log synchronously if possible; avoid complex async work here
    console.error('uncaughtException', err);
    triggerFatalShutdown('uncaughtException');
  });
  process.on('unhandledRejection', (reason) => {
    console.error('unhandledRejection', reason);
    triggerFatalShutdown('unhandledRejection');
  });
}
let fatalTriggered = false;
function triggerFatalShutdown(source) {
  if (fatalTriggered) return;
  fatalTriggered = true;
  // Mark not ready so traffic drains
  isReady = false;
  isDraining = true;
  // Try graceful shutdown, but ensure we exit, and exit non-zero after a fatal error
  const hardExitTimer = setTimeout(() => process.exit(1), 10_000);
  hardExitTimer.unref();
  shutdown(source, 1).catch(() => {
    process.exit(1);
  });
}
installFatalHandlers();
Why exit? After an uncaught exception, you cannot reliably know what invariants are broken. Let your orchestrator replace the instance.
Operational knobs that affect reliability
Disable identifying headers and ensure consistent proxy behavior
For operational consistency, explicitly set behaviors that differ between environments.
// Avoid leaking framework info
app.disable('x-powered-by');
// Ensure correct scheme/IP behind proxy
app.set('trust proxy', 1);
Keep request parsing and CPU-heavy work off the hot path
Large JSON parsing and CPU-heavy transformations can block the event loop. Prefer streaming for large payloads and move CPU-heavy work to background workers when possible. At minimum, enforce body limits and timeouts so one request cannot monopolize the process.
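As one option, a CPU-heavy transform can run in a worker thread so the handler only awaits a promise. This is a sketch assuming a separate ./transform-worker.js file that reads workerData, performs the work, and posts the result via parentPort.postMessage:
import { Worker } from 'node:worker_threads';
// Run one job per worker; a worker pool is preferable under sustained load
function runTransformInWorker(payload) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(new URL('./transform-worker.js', import.meta.url), {
      workerData: payload,
    });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
app.post('/api/transform', async (req, res, next) => {
  try {
    res.json(await runTransformInWorker(req.body));
  } catch (err) {
    next(err);
  }
});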
Practical pre-deploy checklist
| Area | Check | What to verify |
|---|---|---|
| Graceful shutdown | Signal handling | SIGTERM/SIGINT trigger shutdown; shutdown is idempotent; readiness flips to false immediately |
| Graceful shutdown | Connection draining | server.close() used; keep-alive sockets end; forced close after deadline; in-flight requests tracked |
| Health endpoints | Liveness | /livez returns 200 quickly without dependency calls |
| Health endpoints | Readiness | /readyz returns 503 until startup checks pass; returns 503 during shutdown/drain |
| Startup | Startup checks | Required config validated; dependency init/ping performed; failures exit non-zero |
| Timeouts | HTTP server timeouts | headersTimeout, requestTimeout, keepAliveTimeout set explicitly |
| Timeouts | Request deadline | App-level request timeout middleware present; long operations respect cancellation where possible |
| Payload safety | Body limits | express.json/urlencoded limits set; special routes override with bounded limits; large uploads use streaming |
| Proxy awareness | trust proxy | Configured to match actual proxy hops/subnets; req.ip and req.secure behave correctly |
| State | No in-memory source of truth | Sessions, rate limits, queues, and critical caches are not process-local; restarts don’t break correctness |
| Error survivability | Fatal error strategy | uncaughtException/unhandledRejection handlers log and trigger safe shutdown; process exits non-zero |
| Middleware | Operational middleware order | Draining guard early; body parsers configured; health routes accessible; timeouts applied consistently |
| Security | Production toggles | x-powered-by disabled; proxy configuration correct for secure cookies/redirects |
| Logging | Shutdown/fatal logs | Shutdown start/end and fatal triggers are logged with enough context to debug restarts |
| Configuration | Environment parity | Port binding, proxy settings, timeouts, and limits are set via config and match production expectations |