How to Safely Drain a Kubernetes Node (With Real Commands)


Introduction

In Kubernetes, nodes don’t stay healthy forever.

Sometimes you need to:

  • Perform maintenance
  • Upgrade the node
  • Replace hardware
  • Fix issues

But you cannot just stop the node, because it may be running critical workloads.

That’s where node draining comes in.

Draining evicts the Pods running on a node so their controllers can recreate them on other nodes before you take it down.


1. What does “draining a node” actually mean?

Draining a node means:

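✅ Evict the Pods from that node (DaemonSet Pods stay)
✅ Let their controllers (Deployment, StatefulSet, and so on) recreate them on other nodes
✅ Keep applications running while that happens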

After draining:

  • No application pods remain on that node
  • The node is safe to shut down or modify

2. Before you drain — always check this

Before running any command, verify the node status:

kubectl get nodes

Look for:

  • STATUS → Ready ✅
  • Node name you want to drain
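If you want to script this check, here is a minimal sketch. The function name node_is_ready is made up for this example; it reads the node's Ready condition through kubectl's jsonpath output.

```shell
# node_is_ready NODE
# Succeeds only if the node's "Ready" condition reports "True".
# (node_is_ready is a hypothetical helper name, not a kubectl feature.)
node_is_ready() {
  ready=$(kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}') || return 1
  [ "$ready" = "True" ]
}
```

Usage: node_is_ready worker-node-1 && echo "ok to proceed"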

3. Step 1: Mark node as unschedulable (cordon)

First, stop new pods from getting scheduled on the node.

kubectl cordon <node-name>

Example:

kubectl cordon worker-node-1

✅ Result:

  • Existing pods run normally
  • New pods will NOT be scheduled
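To confirm the cordon took effect, kubectl get nodes should show the node as Ready,SchedulingDisabled. A scripted version might look like the sketch below; is_cordoned is a hypothetical helper name, and it relies on spec.unschedulable being absent (empty output) on a node that is not cordoned.

```shell
# is_cordoned NODE
# Succeeds if the node has spec.unschedulable set to true.
# (is_cordoned is a hypothetical helper name.)
is_cordoned() {
  flag=$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}') || return 1
  [ "$flag" = "true" ]
}
```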

4. Step 2: Drain the node (main command)

Now drain the node (drain also cordons the node itself, so the explicit cordon above is optional but harmless):

kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

Example:

kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data

✅ What this command does

  • Evicts all normal Pods, respecting any PodDisruptionBudgets
  • Skips DaemonSet Pods (they stay on the node)
  • Evicts Pods that use emptyDir volumes, deleting the data in those volumes
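Before evicting anything for real, you can ask kubectl what it would do. A small sketch, assuming a kubectl version that accepts --dry-run=client on drain; preview_drain is a hypothetical wrapper name.

```shell
# preview_drain NODE
# Prints what a drain would evict without evicting anything.
# (preview_drain is a hypothetical wrapper name.)
preview_drain() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --dry-run=client
}
```

Usage: preview_drain worker-node-1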

5. Important flags (must understand)


🔹 --ignore-daemonsets

DaemonSets run:

  • logging agents
  • monitoring
  • system components

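👉 The DaemonSet controller would recreate these Pods on the same node right away, so evicting them is pointless
👉 Without this flag, drain refuses to proceed if any are present; with it, they are skipped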


🔹 --delete-emptydir-data

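Some pods use temporary storage (emptyDir). That data lives on the node and is deleted when the Pod leaves it.

👉 This flag confirms you accept losing that data
👉 Without it, drain refuses to evict those Pods and stops with an error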
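One more flag worth knowing is --timeout, which makes drain give up instead of waiting forever (the default of 0s means no limit). A sketch with an assumed default of 300s; both the wrapper name drain_with_timeout and the 300s value are choices made for this example.

```shell
# drain_with_timeout NODE [TIMEOUT]
# Drains the node but gives up after TIMEOUT (assumed default: 300s).
# (drain_with_timeout and the 300s default are illustrative choices.)
drain_with_timeout() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data \
    --timeout="${2:-300s}"
}
```

Usage: drain_with_timeout worker-node-1 10m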


6. What happens during draining (real flow)

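When you run drain:

  1. The node is cordoned (if it is not already)
  2. An eviction is requested for each Pod, subject to any PodDisruptionBudget
  3. Each evicted Pod receives SIGTERM and is removed from Service endpoints
  4. Its controller creates a replacement Pod on another node
  5. The old Pod exits once it shuts down or its grace period runs out

👉 From the user side → no downtime if set up correctly (multiple replicas, readiness probes, and a PodDisruptionBudget)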
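A PodDisruptionBudget is what stops drain from evicting too many replicas of one app at once. A minimal sketch, assuming a hypothetical app labelled app=web that runs at least three replicas; the name web-pdb, the label, and minAvailable: 2 are placeholders, not values from a real cluster.

```shell
# apply_example_pdb
# Applies a PodDisruptionBudget that keeps at least 2 "app=web" Pods running.
# (web-pdb, app=web and minAvailable: 2 are hypothetical example values.)
apply_example_pdb() {
  kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
EOF
}
```

With this in place, drain waits rather than taking the app below two running Pods.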


7. Verify node is drained

After drain:

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>

Check:

  • No application pods running on that node (DaemonSet Pods will still be listed) ✅
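To filter out the DaemonSet Pods automatically, something like the sketch below can help. pods_left_on_node is a hypothetical helper name; it prints each remaining Pod with the kind of its owner and drops the DaemonSet-owned ones. Static Pods, if your node runs any, will show up with the owner kind Node.

```shell
# pods_left_on_node NODE
# Lists "namespace/name ownerKind" for every non-DaemonSet Pod on the node.
# Empty output means the drain is complete.
# (pods_left_on_node is a hypothetical helper name.)
pods_left_on_node() {
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$1" \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.metadata.ownerReferences[*].kind}{"\n"}{end}' \
    | awk '$2 != "DaemonSet"'
}
```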

8. Common errors during drain


❌ Error: “Cannot evict pod as it would violate the pod's disruption budget”

This means:

👉 A PodDisruptionBudget (PDB) is blocking eviction, and drain keeps retrying until the eviction is allowed


🔹 Fix:

Check PDB:

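kubectl get pdb --all-namespaces

Either:

  • Wait until the PDB allows a disruption (ALLOWED DISRUPTIONS above 0), for example once replacement pods become Ready
  • Relax the PDB temporarily during the maintenance window, then restore it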
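To see at a glance which PDBs are currently blocking, you could list only those with zero allowed disruptions. A sketch; blocking_pdbs is a hypothetical helper name, and it reads the status.disruptionsAllowed field of each PDB.

```shell
# blocking_pdbs
# Prints "namespace/name 0" for every PDB that allows no disruptions right now.
# (blocking_pdbs is a hypothetical helper name.)
blocking_pdbs() {
  kubectl get pdb --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.status.disruptionsAllowed}{"\n"}{end}' \
    | awk '$2 == "0"'
}
```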


❌ Error: “cannot delete DaemonSet-managed Pods”

This appears when you run drain without:

--ignore-daemonsets

✅ Add the flag and rerun; the skipped DaemonSet Pods are then reported only as a warning



❌ Error: Pod stuck terminating

Fix:

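kubectl delete pod <pod-name> -n <namespace> --force --grace-period=0

⚠️ Use only if safe: force deletion removes the Pod from the API without waiting for confirmation that its containers have stopped, which is risky for StatefulSet Pods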
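Before forcing anything, it is worth looking at why the Pod is stuck; a leftover finalizer is one common cause. A small sketch that prints the deletion timestamp and finalizers; why_stuck is a hypothetical helper name, and the namespace defaults to default when omitted.

```shell
# why_stuck POD [NAMESPACE]
# Shows when deletion was requested and which finalizers still hold the Pod.
# (why_stuck is a hypothetical helper name.)
why_stuck() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{"deletionTimestamp: "}{.metadata.deletionTimestamp}{"\n"}{"finalizers: "}{.metadata.finalizers}{"\n"}'
}
```

Usage: why_stuck <pod-name> <namespace>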


9. After maintenance (very important)

Once the work is done, bring the node back:

kubectl uncordon <node-name>

Example:

kubectl uncordon worker-node-1

✅ Now the scheduler can place new pods on the node again (existing pods are not moved back automatically)


10. Full real workflow (quick reference)

kubectl get nodes
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>
kubectl uncordon <node-name>
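If you run this often, the pre-maintenance half can be wrapped in one function that stops at the first failure. This is a sketch, not a hardened script; drain_node is a hypothetical name, and uncordon is left as a separate manual step because it belongs after the maintenance work.

```shell
# drain_node NODE
# Cordons, drains, then lists whatever is still on the node.
# Stops as soon as cordon or drain fails.
# (drain_node is a hypothetical helper name.)
drain_node() {
  kubectl cordon "$1" || return 1
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data || return 1
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$1"
}
```

Usage: drain_node worker-node-1, do the maintenance, then kubectl uncordon worker-node-1.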

11. When should you drain a node?

Drain a node for:

✅ Node maintenance
✅ Kernel updates
✅ Hardware issues
✅ Scaling down cluster
✅ Cloud instance replacement


Final takeaway

🚀 InfraDecode Takeaway

Draining a node is not about shutting it down — it’s about moving workloads safely.
Always think: cordon → drain → verify → uncordon.
Most failures happen due to PDB or storage, not the command itself.
Do it carefully, and Kubernetes will handle the rest.

