Introduction
In Kubernetes, not all workloads behave the same way.
Some applications need:
- Persistent storage
- Stable identity
- Ordered startup
Others need to run on every node in the cluster.
This is why Kubernetes provides different controllers like StatefulSet and DaemonSet.
Choosing the wrong one can lead to:
- Data loss
- Missing logs
- Unstable applications
Understanding these two properly helps avoid many production mistakes.
1. What is a StatefulSet?
A StatefulSet is used for applications that require:
- Stable identity
- Persistent storage
- Ordered deployment
Unlike Deployments, Pods in a StatefulSet are not interchangeable.
Key characteristics
- Each Pod has a fixed name: mysql-0, mysql-1, mysql-2
- Each Pod has its own storage
- Pods are created in order (0, 1, 2) and deleted in reverse order
- Pod identity remains the same after a restart or reschedule
Real use cases
StatefulSet is commonly used for:
- Databases (MySQL, PostgreSQL, MongoDB)
- Messaging systems (Kafka)
- Distributed systems (Elasticsearch, ZooKeeper)
2. How StatefulSet works internally
StatefulSet enforces strict rules around identity, networking, and storage.
Stable network identity
Each Pod gets a predictable DNS name:
<pod-name>.<service-name>.<namespace>.svc.cluster.local
Example:
mysql-0.mysql.default.svc.cluster.local
This allows applications to communicate with a specific Pod directly.
Headless Service requirement
StatefulSet requires a Headless Service:
clusterIP: None
Why this matters:
- Normal services load-balance traffic
- Stateful apps need direct access to specific Pods
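As a minimal sketch, a headless Service for the mysql example above could look like the manifest below. The label app: mysql and port 3306 are assumptions for illustration; the Service name must match the serviceName set in the StatefulSet.

```yaml
# Headless Service: no virtual IP, so DNS returns the individual Pod addresses
apiVersion: v1
kind: Service
metadata:
  name: mysql            # must match spec.serviceName in the StatefulSet
  namespace: default
spec:
  clusterIP: None        # this is what makes the Service headless
  selector:
    app: mysql           # assumed Pod label
  ports:
    - name: mysql
      port: 3306         # assumed port (MySQL default)
```

With a Service like this in place, a name such as mysql-0.mysql.default.svc.cluster.local resolves to that one Pod instead of a load-balanced virtual IP.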
Storage behavior (very important)
Each Pod gets its own PersistentVolumeClaim, and through it its own PersistentVolume, using:
volumeClaimTemplates
Important points:
- Storage is NOT shared
- Each Pod has its own disk
- Data persists even if the Pod restarts or is rescheduled to another node
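The sketch below shows where volumeClaimTemplates sits in a StatefulSet. The image tag, the Secret named mysql-secret, the StorageClass named standard, and the 10Gi size are all assumptions for illustration; adjust them to your cluster. Note that this starts three independent MySQL instances and does not configure replication between them.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql              # headless Service that provides the per-Pod DNS names
  replicas: 3                     # creates mysql-0, mysql-1, mysql-2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0        # assumed image tag
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret    # hypothetical Secret
                  key: root-password
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data          # refers to the claim template below
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard      # assumed; cluster-specific
        resources:
          requests:
            storage: 10Gi               # assumed size
```

Each replica gets a claim named after the template, the StatefulSet, and the ordinal: data-mysql-0, data-mysql-1, data-mysql-2. When mysql-1 is recreated, it reattaches to data-mysql-1.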
Ordered deployment
Pods are created sequentially:
- mysql-0 → mysql-1 → mysql-2
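With the default OrderedReady policy, each Pod must be Running and Ready before the next one is created, so if an earlier Pod fails, later Pods do not start.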
This is critical for:
- Kafka clusters
- Distributed systems
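The ordering described above is controlled by one field in the StatefulSet spec. The fragment below is a sketch of that field only, not a complete manifest.

```yaml
# Fragment of a StatefulSet spec (not a complete manifest)
spec:
  podManagementPolicy: OrderedReady   # default: create 0, 1, 2 in order; each must be Ready first
  # podManagementPolicy: Parallel     # alternative: create and delete Pods without waiting
```

Parallel can speed up scaling for applications that do not depend on startup order, but it gives up the guarantee that mysql-0 is Ready before mysql-1 starts.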
✅ Real-world insight (StatefulSet)
Most problems come from:
- PVC stuck in Pending
- StorageClass issues
- Volume mount failures
Not from StatefulSet itself.
3. What is a DaemonSet?
A DaemonSet ensures:
One copy of a Pod runs on every node, or on every node that matches its selection rules
This makes it completely different from StatefulSet.
Key characteristics
- One Pod per node
- Automatically runs on new nodes
- Pods are cleaned up automatically when a node is removed
Real use cases
DaemonSet is mainly used for:
- Logging agents (Fluentd, Fluent Bit)
- Monitoring agents (Node Exporter)
- Security tools
- Networking components (CNI plugins)
4. How DaemonSet works internally
DaemonSet is driven by node availability.
Node-based scheduling
A DaemonSet has no replicas field.
Instead:
- Node exists → Pod is created
- Node removed → Pod is removed
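For comparison, a minimal DaemonSet manifest might look like the sketch below. The monitoring namespace, the labels, and the image tag are assumptions for illustration. The point is what is missing: there is no replicas field.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring           # assumed namespace
spec:
  # no replicas field: the Pod count follows the number of eligible nodes
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.8.1   # assumed tag; pin to the version you run
          ports:
            - containerPort: 9100
```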
Automatic scaling
You don’t scale a DaemonSet manually:
- 3 nodes → 3 Pods
- 5 nodes → 5 Pods
Scaling happens automatically with cluster size.
Node selection control
DaemonSets can target specific nodes:
nodeSelector:
  disktype: ssd
This allows:
- Running workloads only on selected nodes
- Filtering environment-specific workloads
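The nodeSelector belongs in the Pod template, not at the top level of the DaemonSet. The fragment below is a sketch of that placement and reuses the disktype: ssd label from the example above.

```yaml
# Fragment of a DaemonSet spec (not a complete manifest)
spec:
  template:
    spec:
      nodeSelector:
        disktype: ssd      # only nodes carrying this label get a Pod
```

The label has to exist on the node first, for example via kubectl label nodes <node-name> disktype=ssd. Nodes without it are skipped.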
Interaction with node drain
During node draining:
kubectl drain <node-name> --ignore-daemonsets
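DaemonSet Pods are NOT evicted. Without --ignore-daemonsets, the drain refuses to proceed when DaemonSet-managed Pods are present.
This is expected because:
- They are tied to node-level operations
- The DaemonSet controller ignores the unschedulable mark and would recreate them immediately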
✅ Real-world insight (DaemonSet)
Common issues:
- Logs missing → DaemonSet not running
- Metrics missing → monitoring agent issue
- New node added but no Pod on it → check taints, tolerations, and nodeSelector
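One frequent reason a DaemonSet Pod never appears on a node is a taint that the Pod does not tolerate. As a sketch, the fragment below lets the Pod run on control-plane nodes; the taint key is the standard control-plane one, but which taints your nodes actually carry is an assumption to verify.

```yaml
# Fragment of a DaemonSet spec (not a complete manifest)
spec:
  template:
    spec:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule     # tolerate the taint so the Pod can be scheduled there
```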
5. Key difference between StatefulSet and DaemonSet
StatefulSet
- Used for stateful applications
- Pods have identity and storage
- Number of replicas is defined manually
- Order of deployment matters
DaemonSet
- Used for node-level workloads
- One Pod runs per node
- Auto-scales with cluster nodes
- No stable per-Pod identity
Simple way to remember
- StatefulSet → “I care about data and identity”
- DaemonSet → “I want this running everywhere”
6. When should you use them?
Use StatefulSet when:
- Application needs persistent storage
- Pods must maintain identity
- Startup order matters
Examples:
- Database clusters
- Kafka brokers
Use DaemonSet when:
- You need something on every node
- It is an infrastructure-level component
- It should auto-scale with nodes
Examples:
- Log collectors
- Monitoring agents
7. Common mistakes
Using StatefulSet for stateless apps
Adds unnecessary complexity and slows down rollouts, because Pods are updated one at a time by default.
Using Deployment for databases
Leads to:
- Data loss
- Unstable behavior
Not running node-level agents as a DaemonSet
Leads to:
- Missing logs
- Incomplete monitoring
8. Real debugging connection
When something fails:
- Logs missing → check DaemonSet
- Database issues → check StatefulSet
- Pod recreated with new identity → wrong controller used
Many of these issues come from:
👉 Choosing the wrong controller
🚀 InfraDecode Takeaway
StatefulSet is about stable identity and persistent data, while DaemonSet is about node-level coverage.
If your workload depends on storage and order, use StatefulSet.
If your workload must run on every node, use DaemonSet.
Choosing the right controller prevents many real-world failures.
