Why Traffic Management Matters for Cloud-Native Web Serving
Traffic management strategies control how requests are distributed across multiple versions of the same application. Instead of switching all users from version A to version B at once, you can progressively shift traffic, isolate risk, validate behavior, and roll back quickly. In Kubernetes-based web serving, these strategies typically operate at the HTTP routing layer (for example, by host, path, headers, cookies, or weights) and are most powerful when combined with a service mesh that can split traffic at the request level and apply policies consistently across services.
In this chapter, you will learn three common strategies—Blue/Green, Canary, and A/B routing—how they differ, when to use each, and how to implement them with practical, step-by-step examples. The focus is on traffic shaping and release control, not on basic Kubernetes networking concepts.
Core Building Blocks: Versions, Subsets, and Routing Signals
Before applying any strategy, you need a consistent way to identify “which version should receive this request.” In practice, you will use labels on Pods (or Deployments) such as version: v1 and version: v2. A service mesh can then define subsets (logical groups of endpoints) based on those labels and route traffic to those subsets by weight or by request attributes.
Routing signals are the inputs used to decide where a request goes. Common signals include: request headers (for example x-user-group: beta), cookies (for sticky experiments), query parameters (useful for manual testing), source identity (service account), and sometimes client IP (less reliable in modern proxy chains). A key design choice is whether routing is deterministic for a user (sticky) or probabilistic (weighted). Deterministic routing is typical for A/B tests; probabilistic routing is typical for canaries.
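To make these signals concrete, here is a fragment of an Istio-style VirtualService http block showing header- and query-parameter-based matches. This is a sketch: the x-user-group header comes from the example above, while the preview query parameter and the v2 subset are illustrative names; complete manifests appear later in the chapter.

http:
- match:
  - headers:
      x-user-group:             # header signal (set upstream or at the edge)
        exact: "beta"
  - queryParams:
      preview:                  # query-parameter signal, handy for manual testing
        exact: "true"
  route:
  - destination:
      host: web.default.svc.cluster.local
      subset: v2

Note that the two entries under match are alternatives: a request is routed to v2 if either the header or the query parameter matches.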
Example labeling pattern
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-v1
spec:
  selector:
    matchLabels:
      app: web
      version: v1
  template:
    metadata:
      labels:
        app: web
        version: v1
    spec:
      containers:
      - name: web
        image: example/web:1.0.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-v2
spec:
  selector:
    matchLabels:
      app: web
      version: v2
  template:
    metadata:
      labels:
        app: web
        version: v2
    spec:
      containers:
      - name: web
        image: example/web:2.0.0

This pattern keeps both versions running at the same time. The traffic strategy determines how requests are split between them.
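For mesh-based splitting, both Deployments are typically fronted by a single Service that selects only app: web, so v1 and v2 pods are endpoints of the same host and the mesh decides the split. A minimal sketch, consistent with the host web.default.svc.cluster.local used in the mesh examples below:

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                    # no version label: the Service spans v1 and v2
  ports:
  - port: 80
    targetPort: 8080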
Blue/Green Deployments
Blue/Green is a release strategy where you maintain two complete environments (or two complete versions) side-by-side: “Blue” is the currently serving version, and “Green” is the new version. You validate Green while Blue continues to serve production traffic. When ready, you switch traffic from Blue to Green in a single, controlled change. Rollback is fast because you can switch back to Blue.
Blue/Green is ideal when you want a clean cutover, minimal complexity in traffic rules, and a straightforward rollback. It is less ideal when you want to gradually expose users to the new version or run long-lived experiments.
Blue/Green step-by-step: switch with a stable Service selector
One simple approach is to keep a stable Service name (for example web) and change its selector from Blue to Green. This is a “big switch” at the Service level. It is easy to understand and works without advanced routing, but it is not request-by-request splitting; it is effectively a cutover.
- Step 1: Run Blue and Green concurrently. Deploy both versions with labels like track: blue and track: green.
- Step 2: Point the stable Service at Blue. The Service selector matches only Blue.
- Step 3: Validate Green out-of-band. Use a separate Service (or port-forward) to test Green without affecting users (see the sketch after the manifests below).
- Step 4: Switch the stable Service selector to Green. This moves all traffic to Green.
- Step 5: Keep Blue for quick rollback. If issues appear, switch the selector back.
Manifests for Service selector switching
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    track: blue
  ports:
  - port: 80
    targetPort: 8080

To cut over, you patch the selector:

kubectl patch service web -p '{"spec":{"selector":{"app":"web","track":"green"}}}'

This approach is operationally simple, but note the trade-offs: existing keep-alive connections may continue to hit old endpoints until they reconnect, and you cannot do partial traffic splits. Also, if Blue and Green require different database schema versions, you must plan the data migration carefully (for example, backward-compatible schema changes) because the switch is instantaneous.
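For Step 3, a second Service that selects only Green lets you validate it before the cutover. A minimal sketch; the name web-green-preview is illustrative:

apiVersion: v1
kind: Service
metadata:
  name: web-green-preview
spec:
  selector:
    app: web
    track: green                # reaches only Green pods
  ports:
  - port: 80
    targetPort: 8080

You can then test Green with kubectl port-forward service/web-green-preview 8080:80 without exposing it to users.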
Blue/Green with a service mesh: switch at the routing layer
With a service mesh, you can still do a “100% to Green” switch, but you do it in a routing policy rather than by changing Service selectors. This keeps Kubernetes Services stable and moves release control into declarative traffic rules. It also enables safer pre-validation patterns, such as sending only internal test traffic to Green while keeping production traffic on Blue.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: web
spec:
  host: web.default.svc.cluster.local
  subsets:
  - name: blue
    labels:
      track: blue
  - name: green
    labels:
      track: green
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web
spec:
  hosts:
  - web.example.com
  http:
  - route:
    - destination:
        host: web.default.svc.cluster.local
        subset: blue
      weight: 100

To switch, you edit the route to send 100% to green, for example by adding a green destination and moving the weight. Rollback is the reverse edit.
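A sketch of just the route block after the switch; the DestinationRule is unchanged:

http:
- route:
  - destination:
      host: web.default.svc.cluster.local
      subset: blue
    weight: 0
  - destination:
      host: web.default.svc.cluster.local
      subset: green
    weight: 100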
Canary Releases
A canary release gradually shifts a small percentage of traffic to the new version while the majority continues to use the stable version. You observe metrics and logs, validate error rates and latency, and then increase the canary share step-by-step. If problems appear, you reduce canary traffic to zero and investigate. Canary releases reduce risk by limiting blast radius and are a strong default for production changes.
Canary is most effective when you have clear health signals and automated analysis: request success rate, p95 latency, saturation, and business KPIs. Without measurement, canary becomes guesswork. Another important point is that canary traffic should be representative; if only a narrow subset of users hits the canary, you may miss issues that appear under real load or specific user behaviors.
Canary step-by-step: weighted routing with a service mesh
- Step 1: Deploy v2 alongside v1. Ensure both versions are healthy and labeled (for example version: v1 and version: v2).
- Step 2: Define subsets for v1 and v2. Subsets map labels to logical destinations.
- Step 3: Start with a small weight to v2. Common starting points are 1%, 5%, or 10% depending on risk tolerance.
- Step 4: Observe and compare. Track error rates, latency, and resource usage for both subsets.
- Step 5: Increase weight in increments. For example 5% → 25% → 50% → 100% over time.
- Step 6: Roll back quickly if needed. Set v2 weight to 0% while keeping the pods for debugging.
Istio example: 90/10 canary split
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: web
spec:
  host: web.default.svc.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web
spec:
  hosts:
  - web.example.com
  http:
  - route:
    - destination:
        host: web.default.svc.cluster.local
        subset: v1
      weight: 90
    - destination:
        host: web.default.svc.cluster.local
        subset: v2
      weight: 10

To progress the canary, you update the weights. This change is typically safe to apply frequently and can be automated by a delivery controller or CI pipeline.
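As one illustration of automating a step, a pipeline could apply a JSON merge patch. This is a sketch with illustrative weights; note that a merge patch replaces the whole http list, so the patch must carry the complete route:

kubectl patch virtualservice web --type=merge -p '
{"spec":{"http":[{"route":[
  {"destination":{"host":"web.default.svc.cluster.local","subset":"v1"},"weight":75},
  {"destination":{"host":"web.default.svc.cluster.local","subset":"v2"},"weight":25}
]}]}}'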
Practical canary checks you should run
When you shift traffic, validate both technical and product behavior. Technical checks include HTTP 5xx rate, timeouts, and p95/p99 latency. Product checks might include login success rate, checkout completion, or API contract correctness. If you have distributed tracing, compare traces for v1 and v2 to detect unexpected downstream calls or longer critical paths.
Also consider session behavior. If your application uses server-side sessions stored in memory, users may bounce between versions and lose session state during a canary. Prefer shared session stores or stateless auth tokens. If you must keep in-memory sessions, you will need sticky routing (cookie-based) during the canary, which changes the statistical properties of your rollout.
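If you must keep in-memory sessions, one workable pattern is deterministic cookie routing, the same mechanism used for A/B routing below: an edge component sets a canary cookie for a fraction of new sessions, and the mesh honors it on every request. A sketch of the http block, assuming an illustrative cookie named canary:

http:
- match:
  - headers:
      cookie:
        regex: "^(.*?;)?(canary=true)(;.*)?$"    # sticky: same user, same version
  route:
  - destination:
      host: web.default.svc.cluster.local
      subset: v2
- route:
  - destination:
      host: web.default.svc.cluster.local
      subset: v1

With this pattern, the canary percentage is controlled by how often the cookie is issued, not by route weights, which is exactly the change in statistical properties noted above.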
A/B Routing (Experimentation)
A/B routing is used to compare two (or more) variants under real user traffic to measure impact on user behavior or business metrics. Unlike canary, the goal is not primarily risk reduction; it is controlled experimentation. A/B routing typically requires deterministic assignment so that a given user consistently sees the same variant across requests and sessions. This avoids mixing experiences and contaminating metrics.
In Kubernetes and service mesh terms, A/B routing often uses request attributes such as cookies or headers to decide which subset receives the request. You might assign users to variant B by setting a cookie at the edge, or by having an upstream service include a header based on user ID hashing. The routing layer then uses that signal to send traffic to the correct version.
A/B step-by-step: cookie-based routing
- Step 1: Decide the experiment key. For example, cookie experiment=variant-b.
- Step 2: Ensure both variants are deployed. Label them as subsets (for example variant: a and variant: b).
- Step 3: Configure routing rules. If the cookie matches, route to variant B; otherwise route to A.
- Step 4: Assign users to variants. This can be done by the application, an edge proxy, or a dedicated experimentation service.
- Step 5: Measure outcomes. Collect metrics tagged by variant and evaluate statistical significance.
- Step 6: End the experiment. Route all traffic to the winning variant and remove the experiment logic.
Istio example: route by cookie to variant B
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: web
spec:
  host: web.default.svc.cluster.local
  subsets:
  - name: variant-a
    labels:
      variant: a
  - name: variant-b
    labels:
      variant: b
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web
spec:
  hosts:
  - web.example.com
  http:
  - match:
    - headers:
        cookie:
          regex: "^(.*?;)?(experiment=variant-b)(;.*)?$"
    route:
    - destination:
        host: web.default.svc.cluster.local
        subset: variant-b
  - route:
    - destination:
        host: web.default.svc.cluster.local
        subset: variant-a

This rule is deterministic: any request with the cookie goes to variant B. Everyone else goes to A. You can combine this with a user assignment mechanism that sets the cookie for a percentage of users (for example 50/50) while keeping routing deterministic per user.
A/B routing by header for internal testing
Headers are convenient for internal QA and for automated tests because you can set them explicitly. For example, route requests with x-preview: true to the new variant while normal users continue to see the stable variant. This is not a full A/B experiment, but it uses the same routing technique.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web
spec:
  hosts:
  - web.example.com
  http:
  - match:
    - headers:
        x-preview:
          exact: "true"
    route:
    - destination:
        host: web.default.svc.cluster.local
        subset: variant-b
  - route:
    - destination:
        host: web.default.svc.cluster.local
        subset: variant-a

Choosing Between Blue/Green, Canary, and A/B
These strategies solve different problems. Blue/Green is a controlled cutover with fast rollback, best when you want a clean switch and have high confidence after validation. Canary is progressive delivery for risk reduction, best when you want to detect issues early under real traffic and gradually increase exposure. A/B routing is experimentation, best when you want to compare variants and measure user impact with deterministic assignment.
In real systems, you often combine them. For example, you can run a canary rollout (5% v2) while also running an A/B experiment within v2 (variant B vs variant A) for a specific feature flag. The key is to keep routing rules understandable and to avoid overlapping conditions that make traffic behavior unpredictable.
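A simplified sketch of such a combination in a single rule set, where the experiment cookie pins its users to v2 and everyone else gets the weighted split (cookie name and weights are illustrative):

http:
- match:                        # experiment participants: deterministic, always v2
  - headers:
      cookie:
        regex: "^(.*?;)?(experiment=variant-b)(;.*)?$"
  route:
  - destination:
      host: web.default.svc.cluster.local
      subset: v2
- route:                        # everyone else: probabilistic 95/5 canary
  - destination:
      host: web.default.svc.cluster.local
      subset: v1
    weight: 95
  - destination:
      host: web.default.svc.cluster.local
      subset: v2
    weight: 5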
Operational Pitfalls and How to Avoid Them
Overlapping matches and rule ordering
Routing configurations are evaluated in order. If you have multiple match blocks, the first match wins. A common mistake is to define a broad match (for example, all traffic to v1) before a narrow match (for example, cookie routes to v2). Always place the most specific matches first, then fall back to default routes.
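For instance, this ordering is broken: the broad default is listed first, so the cookie rule beneath it can never match. The fix is simply to swap the two blocks.

http:
- route:                        # broad default listed first: matches every request
  - destination:
      host: web.default.svc.cluster.local
      subset: v1
- match:                        # never reached: the default above already matched
  - headers:
      cookie:
        regex: "^(.*?;)?(experiment=variant-b)(;.*)?$"
  route:
  - destination:
      host: web.default.svc.cluster.local
      subset: v2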
Version skew and backward compatibility
During canary and A/B, multiple versions run simultaneously. That means your APIs, database schema, and message formats must tolerate version skew. Prefer backward-compatible changes: add fields instead of removing, support both old and new formats for a period, and avoid destructive schema migrations until all traffic is on the new version.
Sticky behavior and caching layers
CDNs, browser caches, and intermediate proxies can affect experiments. If variant selection depends on cookies, ensure caching does not serve variant A content to variant B users (and vice versa). For dynamic pages, set appropriate cache headers. For APIs, avoid caching responses that differ by experiment cookie unless caches vary on that cookie.
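With Istio, one option is to set cache-related response headers on the experiment route itself. A sketch, assuming variant selection via the experiment cookie; Vary: Cookie is coarse (any cookie change fragments the cache) but prevents cross-variant cache hits:

http:
- match:
  - headers:
      cookie:
        regex: "^(.*?;)?(experiment=variant-b)(;.*)?$"
  route:
  - destination:
      host: web.default.svc.cluster.local
      subset: variant-b
  headers:
    response:
      set:
        Vary: Cookie            # downstream caches must key on the cookie header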
Observability by subset
Traffic management without visibility is risky. Ensure you can break down metrics by version or subset (v1 vs v2, variant-a vs variant-b). At minimum, you want request rate, error rate, and latency per subset. If you can tag logs and traces with version labels, debugging becomes much faster because you can correlate errors with the exact rollout stage.
Rollback mechanics
Define rollback as a first-class operation. For Blue/Green, rollback is switching traffic back to Blue. For canary, rollback is setting canary weight to 0%. For A/B, rollback might mean routing everyone to the control variant and disabling assignment. Practice these operations in a staging environment so that the team can execute them quickly under pressure.
Hands-On Workflow: A Safe Progressive Delivery Runbook
This runbook illustrates a practical sequence you can adapt to your environment when releasing a new web version with a canary strategy and optional A/B routing for a feature.
Step 1: Deploy the new version with distinct labels
kubectl apply -f web-v2-deployment.yaml
kubectl get pods -l app=web -L version

Step 2: Create or update subsets and start at 99/1
kubectl apply -f destinationrule-web.yaml
kubectl apply -f virtualservice-web-99-1.yaml

Step 3: Verify routing behavior with repeated requests
Because weighted routing is probabilistic, you should send many requests and check the distribution. You can do this from a test pod inside the cluster or from a controlled client. Ensure your responses include a version marker (for example an HTTP header like x-app-version) so you can confirm which subset served the request.
# Example: run 200 requests and count versions (pseudo-shell)
for i in $(seq 1 200); do curl -sI https://web.example.com | grep -i x-app-version; done | sort | uniq -c

Step 4: Observe metrics for a fixed window
Hold the canary at a low percentage long enough to gather meaningful signals. The right window depends on traffic volume; high-traffic services can detect issues in minutes, while low-traffic services may need hours. Compare v2 against v1 rather than looking at absolute values only.
Step 5: Increase weights gradually
# Update VirtualService weights to 95/5, then 75/25, then 50/50, then 0/100
kubectl apply -f virtualservice-web-95-5.yaml

Step 6 (optional): Add deterministic A/B routing inside the canary
If you want to test a UI change or a behavior change, you can route a subset of users to a variant using a cookie. Keep the experiment scope narrow and ensure your metrics can be segmented by variant. Avoid running too many experiments at once during a rollout, because it complicates attribution when something goes wrong.
kubectl apply -f virtualservice-web-ab-cookie.yaml

Step 7: Promote to 100% and clean up
After v2 is stable at 100%, you can scale down or remove v1. For A/B, end the experiment, route all users to the chosen variant, and remove the cookie match rules. Keeping old routing rules around increases cognitive load and the chance of unexpected behavior later.
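For example, once v2 has been stable for an agreed soak period, you might scale v1 down before deleting it, keeping the manifest around for one more release cycle (illustrative commands, using the Deployment names from earlier):

kubectl scale deployment web-v1 --replicas=0
# later, after the soak period:
kubectl delete deployment web-v1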