StatefulSets
Deploy stateful applications like databases with stable identities and ordered deployment.
In the previous tutorial, we learned all about DNS and service discovery — how services find each other inside the cluster. Now let's tackle a big challenge: running stateful applications.
Deployments work great for stateless apps where any pod can handle any request. But databases? They need to remember things. They need stable identities, persistent storage, and ordered startup. You can't just kill a database pod and hope a random replacement picks up where it left off. That's chaos.
That's what StatefulSets are for. Think of the difference like this: Deployments treat pods like identical workers in a call center — any one can take any call. StatefulSets treat pods like named employees with assigned desks and personal filing cabinets.
StatefulSet vs Deployment
Let's make this crystal clear:
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod names | Random (nginx-abc123) | Ordered (nginx-0, nginx-1) |
| Pod identity | Interchangeable | Stable, unique |
| Storage | Shared or ephemeral | Per-pod persistent |
| Scaling | All at once | One at a time, ordered |
| Startup order | Parallel | Sequential (0, then 1, then 2) |
| Deletion order | Random | Reverse sequential (2, then 1, then 0) |
Use StatefulSets for:
- Databases (PostgreSQL, MySQL, MongoDB)
- Message queues (Kafka, RabbitMQ)
- Distributed systems (Elasticsearch, Zookeeper)
Basically, anything that says "I need to remember who I am."
StatefulSet Requirements
"Can I just create a StatefulSet like a Deployment?"
Almost — but StatefulSets need a headless service for network identity. Remember headless services from the DNS tutorial?
```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None  # Headless
  selector:
    app: postgres
  ports:
    - port: 5432
```
Create a StatefulSet
Here's a PostgreSQL StatefulSet — notice the differences from a Deployment:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless  # Links to headless service
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```
Key differences from Deployment:
- `serviceName` links to the headless service (required!)
- `volumeClaimTemplates` creates a unique PVC per pod (not shared)
This is the magic — each pod gets its own dedicated storage. No sharing, no conflicts.
Stable Pod Identity
Apply and watch the pods come up:
```bash
kubectl apply -f postgres-statefulset.yaml
kubectl get pods -w
```

```
NAME         READY   STATUS    RESTARTS   AGE
postgres-0   1/1     Running   0          30s
postgres-1   1/1     Running   0          25s
postgres-2   1/1     Running   0          20s
```
Pods are created sequentially: 0, then 1, then 2. Names are predictable. Not postgres-abc123 but postgres-0. Always. Even if you delete it and it comes back — still postgres-0. How cool is that?
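You can see this for yourself: delete a pod and the controller brings back a replacement with the exact same name. A quick sketch (the ages and IPs in your output will differ):

```shell
# Delete pod 1; the StatefulSet controller immediately recreates it
kubectl delete pod postgres-1

# The replacement is still named postgres-1, not a random suffix
kubectl get pods
```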
Stable Network Identity
Each pod gets its own DNS name that persists across restarts:
```
<pod-name>.<service-name>.<namespace>.svc.cluster.local
```
For our example:
```
postgres-0.postgres-headless.default.svc.cluster.local
postgres-1.postgres-headless.default.svc.cluster.local
postgres-2.postgres-headless.default.svc.cluster.local
```
Test DNS resolution:
```bash
kubectl run debug --rm -it --image=busybox -- nslookup postgres-0.postgres-headless
```
Stable Storage
"What happens to the data when a pod dies?"
Each pod gets its own PVC:
```bash
kubectl get pvc
```

```
NAME              STATUS   VOLUME    CAPACITY   ACCESS MODES   AGE
data-postgres-0   Bound    pvc-xxx   10Gi       RWO            5m
data-postgres-1   Bound    pvc-yyy   10Gi       RWO            4m
data-postgres-2   Bound    pvc-zzz   10Gi       RWO            3m
```
If a pod is deleted and recreated, it reattaches to the same PVC. Data persists. The pod might die, but its filing cabinet stays right where it was.
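A quick way to convince yourself: drop a marker file into the volume, delete the pod, and check that the recreated pod still sees it. This is a sketch; the marker filename is made up, and the mount path assumes the manifest above:

```shell
# Write a marker into postgres-0's persistent volume
kubectl exec postgres-0 -- touch /var/lib/postgresql/data/i-was-here

# Delete the pod; the controller recreates it and reattaches data-postgres-0
kubectl delete pod postgres-0
kubectl wait --for=condition=Ready pod/postgres-0 --timeout=120s

# The marker is still there: same PVC, same data
kubectl exec postgres-0 -- ls /var/lib/postgresql/data/i-was-here
```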
Pod Management Policy
OrderedReady (Default)
Pods start/stop one at a time, each waiting for the previous to be Ready. Like a roll call:
```yaml
spec:
  podManagementPolicy: OrderedReady
```
- Scale up: 0 → 1 → 2 (waits for Ready between each)
- Scale down: 2 → 1 → 0 (reverse order, like a stack)
Parallel
Start/stop all pods simultaneously (like a Deployment):
```yaml
spec:
  podManagementPolicy: Parallel
```
Use when pods don't depend on each other's startup order.
Update Strategies
RollingUpdate (Default)
Update pods one at a time, in reverse order:
```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0  # Update all pods
```
- Updates: 2 → 1 → 0
- Each pod must be Ready before updating the next
Partition Updates (Canary)
"Can I update just one pod to test before rolling out to all?"
Yes! Only update pods with index >= partition:
```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2  # Only update pods with ordinal >= 2
```
Set `partition: 2` → only postgres-2 updates.
Set `partition: 1` → postgres-2 and postgres-1 update.
Set `partition: 0` → all pods update.
Useful for canary deployments of database updates. Test on one replica before risking your primary. Smart.
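In practice you usually drive the canary with kubectl patch rather than editing the manifest at each step. A sketch (the image tag is illustrative):

```shell
# Guard everything first: with partition=3 and 3 replicas, no pod updates yet
kubectl patch statefulset postgres -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":3}}}}'

# Change the image; nothing rolls because of the partition
kubectl set image statefulset/postgres postgres=postgres:15.4

# Canary: lower the partition to 2 so only postgres-2 updates
kubectl patch statefulset postgres -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":2}}}}'

# Happy with it? Lower the partition to 0 and let the rest roll
kubectl patch statefulset postgres -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
```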
OnDelete
Only updates when a pod is manually deleted — maximum control:
```yaml
spec:
  updateStrategy:
    type: OnDelete
```
Scaling StatefulSets
Scale up:
```bash
kubectl scale statefulset postgres --replicas=5
```
Pods are added sequentially: postgres-3, then postgres-4.
Scale down:
```bash
kubectl scale statefulset postgres --replicas=2
```
Pods are removed in reverse: postgres-4, then postgres-3, then postgres-2.
Important: Scaling down does NOT delete PVCs. Data is preserved. Kubernetes is cautious about deleting your data. Good.
Real-World Example: PostgreSQL with Replication
Okay, let's build something real. A primary-replica PostgreSQL setup where postgres-0 is the primary and the rest are read replicas:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
data:
  primary.conf: |
    wal_level = replica
    max_wal_senders = 3
    synchronous_commit = on
  replica.conf: |
    primary_conninfo = 'host=postgres-0.postgres-headless port=5432 user=replicator password=replpass'
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command:
            - bash
            - -c
            - |
              if [[ $POD_NAME == "postgres-0" ]]; then
                # Primary
                docker-entrypoint.sh postgres -c config_file=/etc/postgres/primary.conf
              else
                # Replica: clone the primary's data, then start as a standby
                pg_basebackup -h postgres-0.postgres-headless -U replicator -D /var/lib/postgresql/data -Fp -Xs -R
                docker-entrypoint.sh postgres -c config_file=/etc/postgres/replica.conf
              fi
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
            - name: config
              mountPath: /etc/postgres
      volumes:
        - name: config
          configMap:
            name: postgres-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
---
# Service for primary only (writes)
apiVersion: v1
kind: Service
metadata:
  name: postgres-primary
spec:
  selector:
    app: postgres
    statefulset.kubernetes.io/pod-name: postgres-0
  ports:
    - port: 5432
---
# Service for replicas (reads)
apiVersion: v1
kind: Service
metadata:
  name: postgres-replicas
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
```
Applications connect to:
- `postgres-primary` for writes (always routes to postgres-0)
- `postgres-replicas` for reads (load balanced across every pod matching `app: postgres`, including the primary, unless you add a role label)
That's a real database cluster running in Kubernetes with read/write splitting. Pretty slick, right?
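To confirm replication is actually flowing, ask the primary who is streaming from it, and ask a replica whether it is in standby mode (this assumes the manifests above are applied and healthy):

```shell
# On the primary: list connected streaming replicas
kubectl exec postgres-0 -- psql -U postgres -c 'SELECT client_addr, state FROM pg_stat_replication;'

# On a replica: true means it is a read-only standby
kubectl exec postgres-1 -- psql -U postgres -c 'SELECT pg_is_in_recovery();'
```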
Deleting StatefulSets
Here's something important to know:
Delete the StatefulSet:
```bash
kubectl delete statefulset postgres
```
PVCs are NOT deleted — data is preserved. Kubernetes doesn't throw away your data just because you deleted the StatefulSet. You have to explicitly delete PVCs:
```bash
kubectl delete pvc -l app=postgres
```
This is a safety feature. Appreciate it.
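If you do want Kubernetes to clean up PVCs automatically, newer versions (beta since 1.27, stable in 1.32) support a retention policy on the StatefulSet itself. A fragment; both fields accept Retain or Delete:

```yaml
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete   # remove PVCs when the StatefulSet is deleted
    whenScaled: Retain    # keep PVCs when scaling down (the default)
```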
Delete with Cascade
Foreground cascade deletes the pods first and waits until they are fully terminated before removing the StatefulSet object itself; PVCs still survive:
```bash
kubectl delete statefulset postgres --cascade=foreground
```
Troubleshooting
StatefulSets can be finicky. Here's your debugging playbook.
Pod Stuck in Pending
Usually a storage issue. Check PVC:
```bash
kubectl get pvc
kubectl describe pvc data-postgres-0
```
Common issues:
- No available PersistentVolume (did you set up a StorageClass?)
- StorageClass doesn't exist (typo, maybe?)
- Insufficient storage on the node
Pod Won't Start
```bash
kubectl describe pod postgres-0
kubectl logs postgres-0
```
Check:
- Init containers completed
- Volume mounted correctly
- Application configuration
Pods Not Scaling
```bash
kubectl describe statefulset postgres
```
With OrderedReady, each pod must be Ready before the next starts. If postgres-1 isn't healthy, postgres-2 will never start. Check the stuck pod first.
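Two commands that quickly show where things are stuck:

```shell
# Reports which pod the controller is currently waiting on
kubectl rollout status statefulset/postgres

# Compare desired vs. ready replicas at a glance
kubectl get statefulset postgres
```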
When NOT to Use StatefulSets
"Should I use a StatefulSet for everything?"
Nope! StatefulSets add complexity. Use them only when you actually need stable identity and per-pod storage.
- Stateless applications? Use Deployment
- All pods can share the same storage? Use Deployment + a shared PVC (needs a ReadWriteMany volume)
- Ordering doesn't matter? Use Deployment
Don't reach for StatefulSets just because it sounds fancier. Use the simplest tool that solves your problem.
Clean Up
```bash
kubectl delete statefulset postgres
kubectl delete svc postgres-headless postgres-primary postgres-replicas
kubectl delete pvc -l app=postgres
kubectl delete configmap postgres-config
kubectl delete secret postgres-secret
```
What's Next?
Awesome work! You now know how to run stateful applications in Kubernetes with stable identities, predictable DNS names, and per-pod persistent storage. Databases, message queues, distributed systems — you've got them covered.
But what about running a pod on every node in your cluster? Like log collectors, monitoring agents, or security tools? That's where DaemonSets come in. Let's go!