Understanding DaemonSet and StatefulSet in Kubernetes (With Real Use Cases)


Introduction

In Kubernetes, not all workloads behave the same way.

Some applications need:

  • Persistent storage
  • Stable identity
  • Ordered startup

Others need to run on every node in the cluster.

This is why Kubernetes provides different controllers like StatefulSet and DaemonSet.

Choosing the wrong one can lead to:

  • Data loss
  • Missing logs
  • Unstable applications

Understanding how each one behaves helps avoid many common production mistakes.


1. What is a StatefulSet?

A StatefulSet is used for applications that require:

  • Stable identity
  • Persistent storage
  • Ordered deployment

Unlike the Pods of a Deployment, Pods in a StatefulSet are not interchangeable.


Key characteristics

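  • Each Pod has a fixed name built from the StatefulSet name and an ordinal index:

mysql-0
mysql-1
mysql-2

  • Each Pod has its own storage
  • Pods are created in order (0, 1, 2) and removed in reverse order (2, 1, 0)
  • Pod identity remains the same after a restart or reschedule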

Real use cases

StatefulSet is commonly used for:

  • Databases (MySQL, PostgreSQL, MongoDB)
  • Messaging systems (Kafka)
  • Distributed systems (Elasticsearch, ZooKeeper)

2. How StatefulSet works internally

StatefulSet enforces strict rules around identity, networking, and storage.


Stable network identity

Each Pod gets a predictable DNS name:

<pod-name>.<service-name>.<namespace>.svc.cluster.local

Example:

mysql-0.mysql.default.svc.cluster.local

This allows applications to communicate with a specific Pod directly.


Headless Service requirement

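A StatefulSet relies on a Headless Service, referenced through its serviceName field, to give each Pod its DNS name. A Service is headless when it sets: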

clusterIP: None

Why this matters:

  • Normal services load-balance traffic
  • Stateful apps need direct access to specific Pods
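As a minimal sketch, a Headless Service for the mysql example could look like the manifest below. The label app: mysql and port 3306 are assumptions made for illustration; the Service name has to match the serviceName set on the StatefulSet.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql            # must match spec.serviceName of the StatefulSet
  namespace: default
spec:
  clusterIP: None        # this is what makes the Service headless
  selector:
    app: mysql           # assumed Pod label, adjust to your own labels
  ports:
    - name: mysql
      port: 3306         # assumed port (the MySQL default)
```

With this in place, DNS returns the individual Pod addresses instead of one load-balanced virtual IP.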

Storage behavior (very important)

Each Pod gets its own PersistentVolumeClaim (and the PersistentVolume bound to it), generated from:

volumeClaimTemplates

Important points:

  • Storage is NOT shared
  • Each Pod has its own disk
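  • Data persists even if the Pod restarts or is rescheduled
  • By default, the claims are kept when the StatefulSet is scaled down or deleted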
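The storage behavior above can be sketched with a small StatefulSet manifest. This is an illustration under assumptions, not a production setup: the image tag, the Secret named mysql-secret, the 10Gi size, and the presence of a default StorageClass are all hypothetical choices.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql               # name of the Headless Service
  replicas: 3                      # creates mysql-0, mysql-1, mysql-2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0         # illustrative tag
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret     # hypothetical Secret
                  key: root-password     # hypothetical key
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data           # refers to the claim template below
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        # storageClassName is omitted, so a default StorageClass is assumed
        resources:
          requests:
            storage: 10Gi          # assumed size
```

Each replica gets a claim named after the template, the StatefulSet, and the ordinal: data-mysql-0, data-mysql-1, data-mysql-2.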

Ordered deployment

Pods are created sequentially:

  • mysql-0 → mysql-1 → mysql-2

With the default ordering policy, mysql-1 is not created until mysql-0 is Running and Ready, so an earlier Pod that fails blocks every Pod after it.

This is critical for:

  • Kafka clusters
  • Distributed systems
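The ordering behavior is controlled by the podManagementPolicy field of the StatefulSet spec. The fragment below is a sketch that shows the default value explicitly; it is not a complete manifest.

```yaml
# Fragment of a StatefulSet spec (not a complete manifest)
spec:
  # OrderedReady is the default: Pods start one at a time in ordinal order,
  # and each must be Running and Ready before the next one is created.
  podManagementPolicy: OrderedReady
  # The alternative, Parallel, starts and terminates all Pods at once.
  # It only makes sense when the application does not depend on ordering.
```

This policy affects scaling operations. Rolling updates are governed separately by updateStrategy.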

✅ Real-world insight (StatefulSet)

Most problems come from:

  • PVC stuck in Pending
  • StorageClass issues
  • Volume mount failures

Not from StatefulSet itself.


3. What is a DaemonSet?

A DaemonSet ensures:

One copy of a Pod runs on every node (or on every node that matches its scheduling rules)

This makes it completely different from StatefulSet.


Key characteristics

  • One Pod per node
  • Automatically runs on new nodes
  • Automatically removed when node is removed

Real use cases

DaemonSet is mainly used for:

  • Logging agents (Fluentd, Fluent Bit)
  • Monitoring agents (Node Exporter)
  • Security tools
  • Networking components (CNI plugins)

4. How DaemonSet works internally

DaemonSet is driven by node availability.


Node-based scheduling

A DaemonSet has no replicas field.

Instead:

  • Node exists → Pod is created
  • Node removed → Pod is removed
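A minimal DaemonSet sketch makes the absence of a replica count visible. The monitoring namespace, the labels, and the image tag are assumptions chosen for illustration.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring            # assumed namespace, must already exist
spec:
  # No replicas field: the Pod count follows the number of eligible nodes.
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.8.1   # illustrative tag
          ports:
            - containerPort: 9100
```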

Automatic scaling

You don’t scale DaemonSet manually:

  • 3 nodes → 3 Pods
  • 5 nodes → 5 Pods

Scaling happens automatically with cluster size.


Node selection control

DaemonSets can target specific nodes:

nodeSelector:
  disktype: ssd

This allows:

  • Running workloads only on selected nodes
  • Filtering environment-specific workloads
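For context, nodeSelector belongs in the Pod template of the DaemonSet, not at the top level of its spec. The fragment below sketches the placement and assumes the target nodes already carry the disktype=ssd label.

```yaml
# Fragment of a DaemonSet spec (not a complete manifest)
spec:
  template:
    spec:
      nodeSelector:
        disktype: ssd    # only nodes labeled disktype=ssd get a Pod
```

Nodes that gain or lose the label later are picked up or dropped by the DaemonSet controller automatically.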

Interaction with node drain

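During node draining:

kubectl drain <node-name> --ignore-daemonsets

DaemonSet Pods are NOT removed. Without the --ignore-daemonsets flag, the drain stops with an error as soon as it finds DaemonSet-managed Pods on the node.

This is expected because:

  • They are tied to node-level operations
  • The DaemonSet controller would recreate them on the same node immediately, since it ignores the unschedulable marking that drain applies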

✅ Real-world insight (DaemonSet)

Common issues:

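  • Logs missing → logging DaemonSet Pod not running on that node
  • Metrics missing → monitoring agent Pod crashed or was never scheduled
  • New node added but no Pod scheduled → often a node taint without a matching toleration, or a nodeSelector mismatch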
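One frequent reason a DaemonSet Pod never appears on a node is a taint that the Pod does not tolerate. As a hedged example, the fragment below tolerates the standard control-plane taint; other taints in your cluster will need their own entries.

```yaml
# Fragment of a DaemonSet spec (not a complete manifest)
spec:
  template:
    spec:
      tolerations:
        # Lets the Pod run on control-plane nodes, which are usually tainted
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
```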

5. Key difference between StatefulSet and DaemonSet


StatefulSet

  • Used for stateful applications
  • Pods have identity and storage
  • Number of replicas is defined manually
  • Order of deployment matters

DaemonSet

  • Used for node-level workloads
  • One Pod runs per node
  • Auto-scales with cluster nodes
  • No stable per-Pod identity (Pod names get a random suffix)

Simple way to remember

  • StatefulSet → “I care about data and identity”
  • DaemonSet → “I want this running everywhere”

6. When should you use them?


Use StatefulSet when:

  • Application needs persistent storage
  • Pods must maintain identity
  • Startup order matters

Examples:

  • Database clusters
  • Kafka brokers

Use DaemonSet when:

  • You need something on every node
  • It is an infrastructure-level component
  • It should auto-scale with nodes

Examples:

  • Log collectors
  • Monitoring agents

7. Common mistakes


Using StatefulSet for stateless apps

Adds unnecessary complexity and slows down deployment.


Using Deployment for databases

Deployment Pods get random names and, at best, all share a single PersistentVolumeClaim. This leads to:

  • Data loss
  • Unstable behavior

Running node agents without a DaemonSet

Leads to:

  • Missing logs
  • Incomplete monitoring

8. Real debugging connection

When something fails:

  • Logs missing → check DaemonSet
  • Database issues → check StatefulSet
  • Pod recreated with a new name or empty storage → likely run by a Deployment instead of a StatefulSet

Many of these issues come from:

👉 Choosing the wrong controller


🚀 InfraDecode Takeaway

StatefulSet is about stable identity and persistent data, while DaemonSet is about node-level coverage.
If your workload depends on storage and order, use StatefulSet.
If your workload must run on every node, use DaemonSet.
Choosing the right controller prevents many real-world failures.

