
Cloud-Native Web Serving with Kubernetes Ingress and Service Mesh


Kubernetes Services for Stable Networking and Load Distribution

Chapter 4


Why Kubernetes Services Matter for Stable Networking

Kubernetes Pods are designed to be ephemeral: they can be rescheduled, replaced during rollouts, or scaled up and down at any time. Each Pod gets its own IP address, but that IP is not stable over the lifetime of an application. If clients had to track Pod IPs directly, every reschedule or scaling event would break connectivity. A Kubernetes Service solves this by providing a stable virtual IP and DNS name that represents a dynamic set of Pods. Clients talk to the Service, and the Service routes traffic to healthy backend Pods that match its selector.

In practice, a Service is both a naming abstraction and a load distribution mechanism. It creates a consistent endpoint (for example, http://web.default.svc.cluster.local) while allowing the backing Pods to change. Under the hood, Kubernetes programs the node networking (commonly via iptables or IPVS) so that connections to the Service’s virtual IP are forwarded to one of the selected Pod IPs. This is the foundation for stable in-cluster communication and for exposing workloads outside the cluster through higher-level components.

Core Concepts: Labels, Selectors, and Endpoints

A Service typically selects Pods using label selectors. You label Pods (or, more commonly, a Deployment template) with key-value pairs such as app=web. The Service selector matches those labels and defines which Pods are eligible backends. This decouples “who provides the service” from “who consumes it,” enabling safe rollouts and scaling without changing client configuration.

The actual list of backend IPs is represented by Endpoints (or EndpointSlices in modern clusters). You rarely create these objects manually; Kubernetes controllers maintain them based on the Service selector and Pod readiness. When a Pod becomes Ready, it is added as a backend; when it becomes NotReady or is deleted, it is removed. This readiness gating is essential: it prevents traffic from being sent to Pods that are starting up, failing health checks, or draining during termination.

Practical mental model

  • Pods are the “workers” with changing IPs.
  • Labels identify which Pods belong to a logical application.
  • A Service is the stable “front door” with a stable name and virtual IP.
  • Endpoints/EndpointSlices are the live backend list derived from Ready Pods.

Service Types and When to Use Them

Kubernetes offers several Service types. Choosing the right one depends on whether you need in-cluster access only, node-level access, or cloud-provider load balancers. Even if you later use an Ingress or a service mesh, understanding these types is critical because those higher-level tools often build on Services.


ClusterIP: Default for In-Cluster Access

ClusterIP is the default type. It creates a virtual IP reachable only within the cluster network. Use it for internal APIs, backends, databases (when appropriate), and any service-to-service communication where external clients should not connect directly.

ClusterIP is also the most common “anchor” for other exposure mechanisms. For example, an Ingress controller typically routes to a ClusterIP Service, and many mesh gateways also target Services rather than Pods directly.
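For reference, this is roughly how that anchoring looks from the Ingress side (Ingress itself is covered in the next chapter). The rule below is a minimal sketch; the only point to note is that its backend is a Service name and port, not individual Pods.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web     # the ClusterIP Service created in the walkthrough below
            port:
              number: 80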

NodePort: Expose a Port on Every Node

NodePort allocates a port from a configured range (often 30000–32767) and opens it on every node. Traffic to nodeIP:nodePort is forwarded to the Service backends. NodePort is useful for simple environments, debugging, or when you have an external load balancer that can target node ports. It is less ideal for direct public exposure because it couples clients to node addresses and a high port number.
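As a rough sketch (the Service name and the nodePort value here are illustrative), a NodePort Service for the web Pods used later in this chapter could look like this; omitting nodePort lets Kubernetes pick a free port from the range.

apiVersion: v1
kind: Service
metadata:
  name: web-nodeport      # illustrative name
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - name: http
    port: 80              # Service port used inside the cluster
    targetPort: 80        # container port on the backing Pods
    nodePort: 30080       # illustrative; omit to have Kubernetes choose one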

LoadBalancer: Integrate with a Cloud Load Balancer

LoadBalancer asks the cloud provider (or an in-cluster load balancer implementation) to provision an external load balancer and route traffic to the Service. This is a common way to expose a service publicly without manually managing node ports. In many managed Kubernetes offerings, this is the standard approach for external TCP/UDP services.

ExternalName: DNS Alias to an External Service

ExternalName maps a Service name to an external DNS name. It does not create a virtual IP or proxy traffic; it simply returns a CNAME record. Use it when you want in-cluster clients to use a consistent Kubernetes DNS name for something that lives outside the cluster, such as a managed database endpoint.

Headless Services: Direct Pod Discovery

A headless Service is created by setting clusterIP: None. Instead of a single virtual IP, DNS returns the individual Pod IPs (often as multiple A records). This is useful for stateful systems that need stable identity or client-side load balancing, such as StatefulSets, databases, or systems that use leader election and need to address specific peers.

Step-by-Step: Create a ClusterIP Service for a Deployment

This walkthrough focuses on the mechanics of stable networking and load distribution. The goal is to deploy multiple replicas and access them through a single stable Service name.

Step 1: Create a Deployment with labels

Ensure your Pod template has labels that will be used by the Service selector. The labels must match exactly.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27
        ports:
        - containerPort: 80

Step 2: Create a ClusterIP Service that selects those Pods

The Service selector matches app: web. The Service port is what clients use; the targetPort is the container port. If you omit targetPort, Kubernetes defaults it to the same number as the Service port.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - name: http
    port: 80
    targetPort: 80

Step 3: Verify Service discovery and endpoints

After applying the manifests, verify that the Service exists and that it has backends. The most important check is that endpoints are populated; if not, the selector likely does not match Pod labels, or Pods are not Ready.

kubectl get svc web
kubectl describe svc web
kubectl get endpoints web
kubectl get endpointslices -l kubernetes.io/service-name=web

To test DNS and connectivity from inside the cluster, run a temporary Pod and curl the Service name. You should see responses even as Pods are rescheduled, because the Service name stays stable.

kubectl run -it --rm tmp --image=curlimages/curl:8.10.1 --restart=Never -- sh
curl -sS http://web

How Load Distribution Works (and What It Is Not)

Kubernetes Service load distribution is connection-based rather than request-based for most common protocols. When a client opens a TCP connection to the Service IP, the node’s networking rules choose a backend Pod and forward the connection. Subsequent packets in that connection continue to go to the same Pod. If you use HTTP keep-alive, many requests may share one TCP connection and therefore hit the same backend for a while. This is normal and often desirable, but it means you should not expect perfect round-robin per HTTP request.
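One rough way to observe this, assuming the web Deployment and Service from the walkthrough above, is to send several independent requests (each curl invocation opens its own TCP connection) and then check which replicas logged them; the exact spread will vary.

kubectl run -it --rm tmp --image=curlimages/curl:8.10.1 --restart=Never -- \
  sh -c 'for i in 1 2 3 4 5; do curl -sS -o /dev/null http://web; done'

# The nginx image logs each request to stdout, so the Pod prefixes show
# which backends received the connections:
kubectl logs -l app=web --prefix --tail=5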

Also note that Services do not perform application-layer health checks. They rely on Pod readiness. If your readiness probe is too permissive, traffic may reach Pods that are not actually ready to serve real requests. If your readiness probe is too strict, you may reduce available capacity unnecessarily. The Service is only as good as the readiness signals it receives.

Practical tip: verify readiness gating

If you want traffic to stop reaching a Pod before it terminates, configure a readiness probe and a termination grace period. When a Pod begins termination, Kubernetes will typically remove it from endpoints quickly, but existing connections may continue until they close. For long-lived connections, consider application-level draining behavior.
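A minimal sketch of this, based on the web Deployment's Pod template from the walkthrough (the probe path and timings are illustrative values, not recommendations):

# Fragment of the Deployment's Pod template spec:
spec:
  terminationGracePeriodSeconds: 30   # time allowed for in-flight work to finish
  containers:
  - name: web
    image: nginx:1.27
    ports:
    - containerPort: 80
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 5
      failureThreshold: 3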

Ports, targetPort, and Named Ports

Services map a stable port to a backend port. The Service port is what clients use. The targetPort is what the Pod listens on. You can set targetPort to a number or to a named port defined in the container spec. Named ports reduce duplication and make refactoring safer.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myorg/api:1.0.0
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
  - name: http
    port: 80
    targetPort: http

With this pattern, the Service always targets the container’s http port, even if you later change the numeric container port from 8080 to something else.

Session Affinity and Client IP Stickiness

By default, a Service distributes connections across backends. Sometimes you need a client to consistently reach the same backend, for example when an application stores in-memory session state (not ideal, but common in legacy systems). Kubernetes supports basic session affinity based on client IP.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  ports:
  - port: 80
    targetPort: 80

This setting is coarse: it uses the source IP as the key. In environments with NAT or proxies, many users may share the same apparent source IP, leading to uneven load. Prefer stateless services or external/session stores when possible, but it is useful to know how to enable stickiness when you must.

Traffic Policies: Where Backends Are Chosen

For Services that are reachable from outside the cluster (NodePort and LoadBalancer), Kubernetes offers traffic policies that influence whether traffic is routed only to local node backends or to any backend in the cluster.

externalTrafficPolicy: Cluster vs Local

externalTrafficPolicy: Cluster (default) allows a node to accept external traffic and forward it to any backend Pod in the cluster, even on other nodes. This can improve load spreading but may hide the original client IP (depending on environment) and can add an extra hop.

externalTrafficPolicy: Local routes external traffic only to backends on the same node that received the traffic. This can preserve client source IP in many setups and avoid cross-node hops, but it requires that enough nodes have local backends; otherwise some nodes may have no eligible Pods and will drop traffic.

apiVersion: v1
kind: Service
metadata:
  name: web-public
spec:
  type: LoadBalancer
  selector:
    app: web
  externalTrafficPolicy: Local
  ports:
  - port: 80
    targetPort: 80

internalTrafficPolicy: Prefer local backends for in-cluster calls

internalTrafficPolicy can influence in-cluster routing similarly. When set to Local, traffic from a node is routed only to backends on that same node (and is dropped if no local backend exists). This can reduce cross-node traffic for chatty services, but it can also reduce effective load balancing if replicas are unevenly distributed.
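A minimal sketch of the setting, reusing the app: web selector from earlier (the Service name is illustrative):

apiVersion: v1
kind: Service
metadata:
  name: web-internal             # illustrative name
spec:
  selector:
    app: web
  internalTrafficPolicy: Local   # route only to backends on the caller's node
  ports:
  - port: 80
    targetPort: 80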

Dual-Stack and IP Families (IPv4/IPv6)

Many clusters are IPv4-only, but Kubernetes Services can be configured for IPv6-only or dual-stack, depending on cluster networking. Service fields like ipFamilies and ipFamilyPolicy control whether a Service gets an IPv4 address, an IPv6 address, or both. This matters when you run workloads that must be reachable over IPv6 internally or externally, or when you are migrating networks.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
  - IPv4
  - IPv6
  ports:
  - port: 80
    targetPort: 80

Whether this works depends on the cluster CNI and control plane configuration. If dual-stack is not enabled, Kubernetes will fall back according to the policy or reject the configuration.

Headless Services for Stateful Workloads and Direct Pod Addressing

Headless Services are a deliberate choice when you do not want a single virtual IP and kube-proxy load distribution. Instead, you want DNS to return the set of Pod IPs so the client can choose, or so each Pod can be addressed individually. This is common with StatefulSets where each Pod has a stable identity like db-0, db-1, and clients may need to connect to a specific member.

apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None
  selector:
    app: db
  ports:
  - name: tcp
    port: 5432
    targetPort: 5432

With a StatefulSet named db, DNS records can include per-Pod names such as db-0.db.default.svc.cluster.local. This enables peer discovery patterns without relying on a separate service registry.
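As an illustration, assuming a StatefulSet named db in the default namespace that uses this headless Service as its serviceName, you can resolve both the group and individual members from a debug Pod:

kubectl run -it --rm netdebug --image=busybox:1.36 --restart=Never -- sh
# Returns the set of Pod IPs rather than a single virtual IP:
nslookup db
# Returns the IP of one specific member:
nslookup db-0.db.default.svc.cluster.local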

ExternalName and Bridging to Managed Services

Sometimes the best “service” is not running in Kubernetes at all. You might use a managed database, an external SaaS API, or a legacy system in another network. ExternalName lets you give that dependency a Kubernetes-native name so application configuration stays consistent across environments.

apiVersion: v1
kind: Service
metadata:
  name: payments-api
spec:
  type: ExternalName
  externalName: api.payments.example.com

In-cluster clients can connect to payments-api and DNS will resolve it to the external hostname. Because it is DNS-only, any ports declared on the Service have no effect and kube-proxy does not load-balance the traffic; it is purely a naming convenience.

Troubleshooting Service Connectivity: A Practical Checklist

When a Service does not route traffic as expected, the fastest path is to validate each layer: selector, endpoints, readiness, ports, and DNS. Most Service issues are simple mismatches or missing readiness.

Step 1: Confirm labels and selector match

kubectl get pods -l app=web --show-labels
kubectl get svc web -o yaml

If the Service selector does not match any Pods, endpoints will be empty and traffic will fail.

Step 2: Check endpoints and readiness

kubectl get endpoints web -o yaml
kubectl describe pod -l app=web

Look for Ready conditions and readiness probe failures. A Pod can be Running but not Ready, and it will not receive Service traffic.

Step 3: Validate port mapping

kubectl describe svc web

Ensure port and targetPort align with container ports. Named ports must match exactly, including case.

Step 4: Test DNS and connectivity from a debug Pod

kubectl run -it --rm netdebug --image=busybox:1.36 --restart=Never -- sh
nslookup web
wget -qO- http://web:80

If DNS fails, investigate CoreDNS and namespace spelling. If DNS works but connection fails, focus on endpoints, readiness, and network policies.
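If resolution is failing, a quick sanity check on the cluster DNS (assuming a standard CoreDNS deployment in kube-system) looks like this; also confirm the fully qualified name when calling across namespaces.

# CoreDNS Pods typically carry the k8s-app=kube-dns label:
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=20

# From the debug Pod, try the fully qualified Service name:
nslookup web.default.svc.cluster.local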

Step 5: Consider NetworkPolicy and firewall rules

If your cluster uses NetworkPolicy, a Service does not bypass it. Policies are enforced at the Pod level, so traffic may be blocked even though the Service and endpoints look correct. In that case, verify that the source namespace/labels are allowed to reach the destination Pods on the required ports.
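As a sketch, a policy that allows callers labeled role: frontend (an illustrative label) to reach the web Pods on port 80 in the same namespace might look like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-web
spec:
  podSelector:
    matchLabels:
      app: web            # the policy applies to the destination Pods, not the Service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend  # illustrative caller label
    ports:
    - protocol: TCP
      port: 80            # the Pod's container port, not the Service port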

Design Patterns for Services in Real Applications

Services become easier to manage when you apply consistent patterns. A common approach is to create one ClusterIP Service per workload that needs stable in-cluster access, and then layer external exposure on top only where needed. This keeps internal topology stable and reduces the blast radius of changes.

Pattern: Separate internal and external Services

You can create an internal ClusterIP Service for in-cluster callers and a separate LoadBalancer or NodePort Service for external clients, both selecting the same Pods. This allows different ports, annotations, and traffic policies without affecting internal consumers.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web-public
spec:
  type: LoadBalancer
  selector:
    app: web
  externalTrafficPolicy: Cluster
  ports:
  - name: http
    port: 80
    targetPort: 80

Pattern: Use headless for peer discovery, ClusterIP for clients

For stateful systems, you may use a headless Service to let members discover each other, and a separate ClusterIP Service to provide a single stable endpoint for clients that do not care which replica they hit (for example, read-only traffic or a proxy layer).
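The headless db Service shown earlier handles peer discovery; a companion ClusterIP Service for clients (the name db-read is illustrative) is simply a second Service selecting the same Pods:

apiVersion: v1
kind: Service
metadata:
  name: db-read           # illustrative client-facing name
spec:
  selector:
    app: db               # same Pods as the headless db Service
  ports:
  - name: tcp
    port: 5432
    targetPort: 5432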

Pattern: Keep selectors tight and intentional

A Service selector that is too broad can accidentally include Pods from another version or another component, especially during migrations. Use labels like app and component, and consider adding tier or role labels. The goal is to make it hard for unrelated Pods to match by accident.
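For example, a selector that combines several labels (the values below are illustrative) makes accidental matches much less likely, because a Pod must carry all of them to be selected:

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    component: frontend   # illustrative extra labels; the Pod template must carry all of them
    tier: public
  ports:
  - port: 80
    targetPort: 80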

Now answer the exercise about the content:

Why does a Kubernetes Service provide stable networking even when Pods are rescheduled or scaled?

Answer: A Service gives clients a stable virtual IP and DNS name, while Kubernetes updates the backend Endpoints based on label selectors and Pod readiness. This keeps connectivity stable even as Pod IPs change.

Next chapter

Ingress Controllers for HTTP Routing, Hostnames, and Path Rules
