Kafka rebalance
Kafka Partition Leadership Rebalance
What is Leader Skew?
Leader skew happens when partition leadership in a Kafka cluster is unevenly distributed. One broker may become the leader for most partitions, causing it to handle much more traffic than others.
Leader vs Follower Roles
- Leader: Handles all produce requests and, by default, all consume requests for its partition. This role is CPU, network, and I/O intensive.
- Follower: Replicates data from the leader, mainly performing network and disk I/O.
A broker that leads far more partitions than its peers becomes a performance hotspot.
Example: Balanced vs Skewed Leadership
| Broker ID | Leader For Partitions | Follower For Partitions | Workload |
|---|---|---|---|
| Broker 1 | 1, 2 | 3, 4, 5, 6 | Medium |
| Broker 2 | 3, 4 | 1, 2, 5, 6 | Medium |
| Broker 3 | 5, 6 | 1, 2, 3, 4 | Medium |
Ideal: Leadership is evenly spread.
| Broker ID | Leader For Partitions | Follower For Partitions | Workload |
|---|---|---|---|
| Broker 1 | 1, 2, 3, 4, 5 | 6 | Very High |
| Broker 2 | 6 | 1, 2, 3, 4, 5 | Low |
| Broker 3 | - | 1, 2, 3, 4, 5, 6 | Very Low |
Skewed: Broker 1 is overloaded, while Broker 3 leads no partitions and only replicates data.
Why is Leader Skew a Problem?
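To put numbers on the two tables above, the sketch below computes each broker's share of partition leaders. The `leader_share` function name and the `broker:count` input format are hypothetical, chosen only for this illustration.

```bash
# leader_share: hypothetical helper; takes "broker:leader_count" pairs and
# prints each broker's share of all partition leaders.
leader_share() {
  printf '%s\n' "$@" | awk -F: '
    { name[NR] = $1; count[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "broker %s: %d/%d leaders (%.0f%%)\n", name[i], count[i], total, 100 * count[i] / total
    }'
}

# Balanced table: two leaders per broker.
leader_share 1:2 2:2 3:2
# Skewed table: Broker 1 leads five of the six partitions.
leader_share 1:5 2:1 3:0
```

In the balanced case every broker holds roughly a third of the leaders. In the skewed case Broker 1 holds five of six.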
Risks of Leader Skew
- Performance Bottlenecks: Overloaded brokers slow down producers and consumers.
- Cluster Instability: If the overloaded broker fails, leadership for all of its partitions must be re-elected at once, which can briefly make many partitions unavailable.
- Resource Waste: Some brokers are underutilized while others are overwhelmed.
Causes of Leader Skew
- Broker failures and recovery
- Topic creation without careful planning
- Adding/removing brokers
- Manual changes to leadership or ISRs
Detecting Leader Skew in Strimzi
Use the Kafka CLI inside a broker pod to inspect leader distribution:
kubectl exec -n kafka cluster-dual-role-0 -c kafka -- \
bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe
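The describe output prints one line per partition with a `Leader:` field, so the number of partitions led by each broker can be tallied with a short filter. The following is a minimal sketch: the `count_leaders` function name is hypothetical, and it assumes the usual `Leader: <id>` field layout printed by `kafka-topics.sh`.

```bash
# count_leaders: hypothetical helper; reads `kafka-topics.sh --describe`
# output on stdin and prints "<broker id> <number of partitions led>".
count_leaders() {
  awk '
    {
      # Find the "Leader:" label on the line and count the value after it.
      for (i = 1; i < NF; i++)
        if ($i == "Leader:") leaders[$(i + 1)]++
    }
    END { for (b in leaders) print b, leaders[b] }
  ' | sort -n
}

# Usage against the cluster (pod and namespace as in the command above):
# kubectl exec -n kafka cluster-dual-role-0 -c kafka -- \
#   bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe | count_leaders
```

A broker whose count is far above the others is a likely hotspot. A broker that is missing from the output leads no partitions.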
Fixing Leader Skew: Strimzi Cruise Control
Automated Balancing with Cruise Control
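- Cruise Control monitors broker load and generates optimization proposals that rebalance partition replicas and leadership. In Strimzi, a rebalance is requested with a KafkaRebalance resource and applied once its proposal is approved.
- Setup: Enable Cruise Control by adding a cruiseControl section to the spec of your Kafka custom resource.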
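Once Cruise Control is enabled, a rebalance is typically requested by creating a KafkaRebalance resource and approving the proposal it produces. The sketch below shows one way to do that from the shell. The `rebalance_manifest` helper and the names `leader-rebalance` and `my-cluster` are hypothetical, and the `kafka.strimzi.io/v1beta2` API version is an assumption to check against your installed Strimzi release.

```bash
# rebalance_manifest: hypothetical helper; prints a KafkaRebalance manifest
# for the given resource name ($1) and Strimzi cluster name ($2).
rebalance_manifest() {
  cat <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: $1
  namespace: kafka
  labels:
    strimzi.io/cluster: $2   # must match the name of the Kafka resource
spec: {}                     # empty spec: use the default optimization goals
EOF
}

# Usage (resource and cluster names are placeholders):
# rebalance_manifest leader-rebalance my-cluster | kubectl apply -f -
# kubectl get kafkarebalance leader-rebalance -n kafka    # wait for ProposalReady
# kubectl annotate kafkarebalance leader-rebalance -n kafka strimzi.io/rebalance=approve
```

Printing the manifest from a function keeps it reviewable before it is piped to `kubectl apply`.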
Manual leader election (if needed):
kubectl exec -n kafka cluster-dual-role-0 -c kafka -- \
bin/kafka-leader-election.sh --bootstrap-server localhost:9092 \
--election-type preferred --all-topic-partitions
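A preferred election moves leadership back to each partition's preferred replica, which is the first broker in its `Replicas:` list. As a rough check of whether the election would change anything, the sketch below lists partitions whose current leader differs from that first replica. The `non_preferred_leaders` function name is hypothetical, and the field layout of the `kafka-topics.sh --describe` output is assumed.

```bash
# non_preferred_leaders: hypothetical helper; reads `kafka-topics.sh --describe`
# output on stdin and prints partitions not led by their preferred replica.
non_preferred_leaders() {
  awk '
    {
      topic = part = leader = replicas = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      if (leader == "" || replicas == "") next   # skip topic summary lines
      split(replicas, r, ",")                    # r[1] is the preferred replica
      if (leader != r[1]) print topic "-" part, "leader=" leader, "preferred=" r[1]
    }'
}

# Usage (pod and namespace as in the commands above):
# kubectl exec -n kafka cluster-dual-role-0 -c kafka -- \
#   bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe | non_preferred_leaders
```

If this prints nothing but leadership is still uneven, the replica assignment itself is skewed. A preferred election cannot correct that, and a replica rebalance is needed instead.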
Summary
Leader skew creates performance hotspots and puts cluster stability at risk.
Best practice: Enable Strimzi Cruise Control to automate rebalancing in your EKS Kafka cluster.