DNS and Service Discovery
Understand how Kubernetes DNS works and how services discover each other.
In the previous tutorial, we explored service meshes — a whole infrastructure layer for managing service communication. But let's zoom back to something fundamental: when a pod needs to talk to another service, how does it actually find it?
You've been using service names like backend throughout this series. But have you wondered how curl http://backend actually works inside a cluster? There's no magic — it's DNS. Kubernetes has its own built-in DNS system, and understanding it is the key to debugging those "connection refused" mysteries that will inevitably haunt you.
How Kubernetes DNS Works
Every Kubernetes cluster runs a DNS server called CoreDNS that watches for new Services and creates DNS records for them. It's like an automatic phone book that updates itself.
Pod A wants to reach "backend"
│
▼
┌───────────────┐
│ CoreDNS │ ← Resolves "backend" to 10.96.45.123
└───────────────┘
│
▼
Pod A connects to 10.96.45.123 (Service IP)
│
▼
Service routes to backend pods
DNS Record Types
Kubernetes creates several types of DNS records. Here's the lineup:
A/AAAA Records (Services)
The bread and butter — maps service names to ClusterIPs:
<service-name>.<namespace>.svc.cluster.local → Service ClusterIP
Example:
backend.default.svc.cluster.local → 10.96.45.123
SRV Records (Ports)
_<port-name>._<protocol>.<service>.<namespace>.svc.cluster.local
Contains port number and target hostname.
Pod DNS Records
<pod-ip-dashed>.<namespace>.pod.cluster.local
Example (pod IP 10.244.1.5):
10-244-1-5.default.pod.cluster.local → 10.244.1.5
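To make the naming scheme concrete, here is a small Python sketch that builds these record names from their parts. These helpers only construct the strings described above — they don't query DNS — and they assume the default cluster domain cluster.local, which clusters can override.

```python
# Sketch of the Kubernetes DNS naming scheme described above.
# Assumes the default cluster domain; clusters can be configured differently.

CLUSTER_DOMAIN = "cluster.local"

def service_fqdn(service: str, namespace: str = "default") -> str:
    """A/AAAA record name for a Service."""
    return f"{service}.{namespace}.svc.{CLUSTER_DOMAIN}"

def srv_record_name(port_name: str, protocol: str, service: str,
                    namespace: str = "default") -> str:
    """SRV record name for a named Service port."""
    return f"_{port_name}._{protocol}.{service_fqdn(service, namespace)}"

def pod_fqdn(pod_ip: str, namespace: str = "default") -> str:
    """Pod record name: dots in the pod IP become dashes."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{CLUSTER_DOMAIN}"

print(service_fqdn("backend"))        # backend.default.svc.cluster.local
print(srv_record_name("http", "tcp", "backend"))
print(pod_fqdn("10.244.1.5"))         # 10-244-1-5.default.pod.cluster.local
```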
Test DNS Resolution
Let's actually see this in action. Spin up a debug pod:
kubectl run debug --rm -it --image=busybox -- /bin/sh
Inside the pod:
# Full DNS name
nslookup kubernetes.default.svc.cluster.local
# Short name (same namespace)
nslookup kubernetes
# Check DNS server
cat /etc/resolv.conf
Output of /etc/resolv.conf — this is where the magic lives:
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
That search line is why you can just type backend instead of the full backend.default.svc.cluster.local. Kubernetes tries all those suffixes for you. How cool is that?
DNS Name Resolution Rules
Here's the cheat sheet for when to use what.
Same Namespace
Just use the service name. Keep it simple:
curl http://backend
curl http://backend:8080
Different Namespace
Include the namespace:
curl http://backend.production
curl http://backend.production.svc.cluster.local
Fully Qualified Domain Name (FQDN)
curl http://backend.production.svc.cluster.local
The full address. Always works, regardless of which namespace you're in. Like using someone's full mailing address instead of just their first name.
The Search Path
"But wait, how does Kubernetes know what I mean when I just say 'backend'?"
Great question! Remember that search line in /etc/resolv.conf?
search default.svc.cluster.local svc.cluster.local cluster.local
When you resolve backend, Kubernetes tries all these in order:
1. backend.default.svc.cluster.local ← usually finds it here
2. backend.svc.cluster.local
3. backend.cluster.local
4. backend (falls through to external DNS)
It's like calling someone — first it tries the most likely option and works its way out.
ndots Setting (The Performance Trap)
options ndots:5
If a name contains fewer than 5 dots, the resolver walks the entire search path before trying the name as-is against external DNS. A name like backend (0 dots) uses the search path, as intended. But a name like api.external.com (2 dots) also goes through the search path first — three wasted lookups before the one that actually resolves.
To skip the search path and go directly to external DNS, append a trailing dot:
curl http://api.external.com.
That trailing dot is a pro tip that will make your external DNS lookups way faster.
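The search-path and ndots behavior above can be sketched as a short Python simulation. This is a rough model of what the resolver inside a pod does, hard-coding the search list and ndots:5 from the /etc/resolv.conf shown earlier — not a real DNS client.

```python
# Rough simulation of how a pod's resolver expands a name, given the
# search path and ndots values from /etc/resolv.conf shown above.

SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5

def resolution_candidates(name: str) -> list[str]:
    if name.endswith("."):
        # Trailing dot: absolute name, search path skipped entirely.
        return [name.rstrip(".")]
    if name.count(".") >= NDOTS:
        # "Enough" dots: try the name as-is first, then the search path.
        return [name] + [f"{name}.{s}" for s in SEARCH]
    # Fewer dots than ndots: walk the search path before trying it as-is.
    return [f"{name}.{s}" for s in SEARCH] + [name]

print(resolution_candidates("backend"))
# ['backend.default.svc.cluster.local', 'backend.svc.cluster.local',
#  'backend.cluster.local', 'backend']
print(resolution_candidates("api.external.com."))
# ['api.external.com']  <- the trailing dot skips the search path
```

Note how api.external.com without the trailing dot would generate three doomed cluster-internal lookups before the real one — that is exactly the performance trap.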
Service Discovery Patterns
There are two ways services can discover each other. One is old-school, one is the right way.
Environment Variables (The Old Way)
Kubernetes injects service information as environment variables:
kubectl exec <pod-name> -- env | grep -i backend
BACKEND_SERVICE_HOST=10.96.45.123
BACKEND_SERVICE_PORT=80
BACKEND_PORT=tcp://10.96.45.123:80
BACKEND_PORT_80_TCP=tcp://10.96.45.123:80
BACKEND_PORT_80_TCP_PROTO=tcp
BACKEND_PORT_80_TCP_PORT=80
BACKEND_PORT_80_TCP_ADDR=10.96.45.123
Big limitation: Environment variables are set when the pod starts. If the service is created after the pod, the variables won't exist. Oops.
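A minimal sketch of consuming these injected variables, assuming a Service named backend. The None path is exactly the startup-ordering limitation described above: if the Service was created after the pod, the variables simply aren't there.

```python
# Old-style discovery: read the env vars Kubernetes injects for a Service
# named "backend". These are frozen at pod start time.
import os
from typing import Optional

def backend_addr() -> Optional[str]:
    """Return host:port from injected env vars, or None if the Service
    didn't exist when this pod started."""
    host = os.environ.get("BACKEND_SERVICE_HOST")
    port = os.environ.get("BACKEND_SERVICE_PORT")
    if host is None or port is None:
        return None
    return f"{host}:{port}"
```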
DNS (The Right Way)
DNS is dynamic — it always returns the current service IP:
import requests
response = requests.get("http://backend:8080/api")
DNS is the preferred method for service discovery. Always use DNS. Always.
Headless Services
"What if I need to know the IP addresses of individual pods, not just the service?"
That's what headless services are for. Normal services return the ClusterIP. Headless services return the pod IPs directly.
Create a headless service (clusterIP: None is the magic):
apiVersion: v1
kind: Service
metadata:
name: backend-headless
spec:
clusterIP: None
selector:
app: backend
ports:
- port: 80
DNS lookup returns all pod IPs:
nslookup backend-headless
Name: backend-headless.default.svc.cluster.local
Address: 10.244.1.5
Address: 10.244.2.8
Address: 10.244.3.12
Use cases:
- StatefulSets (each pod needs a unique identity — which we'll cover next!)
- Client-side load balancing (you pick which pod to talk to)
- Database clusters (need to know exactly which replica to connect to)
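The client-side load balancing use case can be sketched in a few lines of Python. Inside a cluster you would resolve the headless name (e.g. backend-headless) with socket.getaddrinfo to get all pod IPs; the round-robin part below is kept separate so it runs anywhere. This is an illustrative sketch, not a production load balancer.

```python
# Sketch of client-side load balancing over a headless service.
import itertools
import socket

def resolve_pod_ips(hostname: str) -> list[str]:
    """Resolve all A records for a hostname (a headless service name
    returns one record per ready pod)."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})

class RoundRobin:
    """Cycle through pod IPs so successive requests hit different pods."""
    def __init__(self, ips: list[str]):
        self._cycle = itertools.cycle(ips)

    def next_ip(self) -> str:
        return next(self._cycle)

# Using the pod IPs from the nslookup output above:
rr = RoundRobin(["10.244.1.5", "10.244.2.8", "10.244.3.12"])
print(rr.next_ip())  # 10.244.1.5
print(rr.next_ip())  # 10.244.2.8
```

In-cluster you would build it as `RoundRobin(resolve_pod_ips("backend-headless"))` — keeping in mind that the pod set changes, so real clients re-resolve periodically.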
StatefulSet DNS
StatefulSets with headless services get predictable DNS names. This is huge for databases:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: db
spec:
serviceName: db-headless # Links to headless service
replicas: 3
selector:
matchLabels:
app: db
template:
metadata:
labels:
app: db
spec:
containers:
- name: db
image: postgres
ports:
- containerPort: 5432
Each pod gets a stable DNS name:
db-0.db-headless.default.svc.cluster.local
db-1.db-headless.default.svc.cluster.local
db-2.db-headless.default.svc.cluster.local
Connect to a specific replica:
psql -h db-0.db-headless
You can talk to any specific pod by name. No guessing, no randomness. Beautiful.
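Because the names follow the fixed pattern `<pod>.<headless-service>.<namespace>.svc.<cluster-domain>`, you can generate the full connection list for a replica set in code. A small sketch, using the db / db-headless names from the manifest above:

```python
# Sketch: the stable DNS names a StatefulSet's pods get, following the
# <pod>.<headless-service>.<namespace>.svc.<cluster-domain> pattern.

def statefulset_pod_fqdns(name: str, service: str, replicas: int,
                          namespace: str = "default",
                          cluster_domain: str = "cluster.local") -> list[str]:
    return [f"{name}-{i}.{service}.{namespace}.svc.{cluster_domain}"
            for i in range(replicas)]

for fqdn in statefulset_pod_fqdns("db", "db-headless", 3):
    print(fqdn)
# db-0.db-headless.default.svc.cluster.local
# db-1.db-headless.default.svc.cluster.local
# db-2.db-headless.default.svc.cluster.local
```

This is handy for building things like a PostgreSQL replica connection string, where each standby must be addressed by its stable name.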
ExternalName Services
"Can I use Kubernetes DNS for external services too?"
Yep! Map a service name to an external DNS name:
apiVersion: v1
kind: Service
metadata:
name: external-api
spec:
type: ExternalName
externalName: api.external-provider.com
Pods can use external-api as the hostname:
curl http://external-api/endpoint
Kubernetes DNS returns a CNAME record pointing to api.external-provider.com. Your app doesn't need to know it's talking to something outside the cluster. Sneaky and clean.
CoreDNS Configuration
Okay, for the curious (or the debugging-at-3am crowd), here's what's under the hood.
CoreDNS runs in the kube-system namespace:
kubectl get pods -n kube-system -l k8s-app=kube-dns
View the configuration:
kubectl get configmap coredns -n kube-system -o yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
Custom DNS Entries
Add custom DNS records by editing the ConfigMap:
data:
Corefile: |
.:53 {
# ... existing config ...
}
custom.local:53 {
errors
cache 30
forward . 10.0.0.53
}
Stub Domains
Forward specific domains to custom DNS servers:
data:
Corefile: |
.:53 {
# ... existing config ...
forward . /etc/resolv.conf {
except mycompany.local
}
}
mycompany.local:53 {
forward . 10.0.0.53
}
DNS Troubleshooting
DNS issues are responsible for approximately 147% of Kubernetes debugging sessions (okay, maybe not, but it sure feels like it).
Check if DNS is Working
kubectl run debug --rm -it --image=busybox -- nslookup kubernetes
Expected output:
Server: 10.96.0.10
Address: 10.96.0.10:53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
DNS Not Resolving
1. Check CoreDNS is running:
kubectl get pods -n kube-system -l k8s-app=kube-dns
2. Check CoreDNS logs:
kubectl logs -n kube-system -l k8s-app=kube-dns
3. Verify the service exists:
kubectl get svc <service-name>
4. Check endpoints:
kubectl get endpoints <service-name>
DNS is Slow
"Everything works, but DNS lookups are taking forever."
That ndots:5 setting we mentioned earlier is probably the culprit. For external domains, use FQDN with trailing dot:
# Slow (tries search path first)
curl http://api.external.com
# Fast (skips search path)
curl http://api.external.com.
Or set custom DNS config on the pod:
spec:
dnsConfig:
options:
- name: ndots
value: "2"
Pod DNS Policy
Control how pods resolve DNS:
spec:
dnsPolicy: ClusterFirst # Default: use cluster DNS
| Policy | Behavior |
|---|---|
| ClusterFirst | Use cluster DNS; non-cluster names fall back to the node's upstream DNS |
| Default | Use the node's DNS settings |
| ClusterFirstWithHostNet | For pods with hostNetwork: true |
| None | Use custom dnsConfig only |
Debugging DNS with dig
For more detailed DNS debugging, use a pod with dig:
kubectl run debug --rm -it --image=tutum/dnsutils -- /bin/bash
# Query A record
dig backend.default.svc.cluster.local
# Query SRV record
dig SRV _http._tcp.backend.default.svc.cluster.local
# Trace resolution
dig +trace backend.default.svc.cluster.local
DNS Caching
CoreDNS caches responses (default 30 seconds). This means:
- DNS changes take up to 30 seconds to propagate
- Failed lookups are also cached (which can be annoying)
View cache TTL in Corefile:
cache 30
⚠️ Applications may also cache DNS on their own. Java, for example, caches DNS indefinitely by default. Yes, forever. If your Java app can't find a service that clearly exists, check its DNS cache settings. We've all been there.
Practical Example: Multi-Tier App
Let's see how DNS ties a real application together:
# Frontend deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
namespace: myapp
spec:
replicas: 2
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: frontend
image: myapp-frontend:v1
env:
- name: API_URL
value: "http://backend:8080" # DNS-based discovery
---
# Backend deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend
namespace: myapp
spec:
replicas: 3
selector:
matchLabels:
app: backend
template:
metadata:
labels:
app: backend
spec:
containers:
- name: backend
image: myapp-backend:v1
env:
- name: DATABASE_HOST
value: "postgres.database" # Service in different namespace
---
apiVersion: v1
kind: Service
metadata:
name: backend
namespace: myapp
spec:
selector:
app: backend
ports:
- port: 8080
Frontend finds backend via http://backend:8080 (same namespace, just the name). Backend finds database via postgres.database (service in database namespace, need to include the namespace). All DNS. No hardcoded IPs anywhere.
What's Next?
Awesome work! You now understand the phone book of Kubernetes — how services find each other, how DNS resolution works, and how to debug it when things go sideways.
But what about applications that need stable identities and persistent storage — like databases? Just using a Deployment won't cut it. In the next tutorial, we'll dive into StatefulSets — the right way to run stateful applications in Kubernetes. Let's go!