# Strimzi Kafka Cluster Configuration
## KafkaNodePool Configuration

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  replicas: 3
  roles:
    - controller
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 250Gi
        class: gp3
        deleteClaim: true
        kraftMetadata: shared
  template:
    pod:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kafka
                    operator: In
                    values:
                      - "true"
      tolerations:
        - key: "kafka"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              strimzi.io/cluster: cluster
```
## KafkaNodePool Configuration Explained

This section describes the key elements of the KafkaNodePool manifest for Strimzi Kafka, optimized for a dedicated three-node worker pool.

### Metadata & Labels

- `name: dual-role` identifies this node pool.
- `namespace: kafka` deploys the resources in the `kafka` namespace.
- `labels: strimzi.io/cluster: cluster` associates this node pool with the main Kafka cluster.
### Spec

- `replicas: 3` deploys three Kafka pods, matching your three dedicated worker nodes.
- `roles: controller, broker` makes each pod act as both controller and broker, supporting KRaft mode.
- `storage`:
    - `type: jbod` allows multiple storage volumes.
    - `volumes`: each pod gets a 250Gi `persistent-claim` volume (`id: 0`) using the `gp3` storage class. `deleteClaim: true` removes the PersistentVolumeClaim when the cluster is deleted, and `kraftMetadata: shared` stores the KRaft metadata log on the same volume as the message logs.
### Pod Scheduling

- `nodeAffinity` ensures pods are scheduled only on nodes labeled `kafka=true`.
- `tolerations` allows pods to run on nodes tainted with `kafka=true:NoSchedule`, ensuring only dedicated Kafka nodes are used.
- `topologySpreadConstraints`:
    - `maxSkew: 1` keeps pods evenly distributed across nodes (no node has more than one pod difference).
    - `topologyKey: kubernetes.io/hostname` spreads pods by node hostname.
    - `whenUnsatisfiable: DoNotSchedule` prevents scheduling if even distribution is not possible.
    - `labelSelector` applies the constraint only to pods with `strimzi.io/cluster: cluster`.
**Why This Matters**

This configuration provides high availability and fault tolerance by isolating each Kafka pod on its own dedicated worker node, with persistent storage and strict scheduling rules.
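For the affinity and tolerations above to take effect, the worker nodes must carry the matching label and taint. A minimal sketch of what such a node would look like (the node name is a placeholder; in practice the label and taint are usually applied via `kubectl label`/`kubectl taint` or your node-group tooling):

```yaml
# Illustrative Node fragment showing the label and taint the
# KafkaNodePool's nodeAffinity and tolerations expect.
apiVersion: v1
kind: Node
metadata:
  name: kafka-worker-1        # placeholder node name
  labels:
    kafka: "true"             # matched by nodeAffinity (kafka In ["true"])
spec:
  taints:
    - key: kafka
      value: "true"
      effect: NoSchedule      # matched by the pool's tolerations
```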
## Kafka Configuration

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cluster
  namespace: kafka
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
  kafka:
    version: 3.9.0
    metadataVersion: "3.9"
    jvmOptions:
      -Xms: 4G   # Initial heap size
      -Xmx: 4G   # Maximum heap size
      -XX:
        UseG1GC: "true"                        # Use the G1 garbage collector
        G1HeapRegionSize: 16M                  # Region size for G1 GC
        #UnlockExperimentalVMOptions: "true"   # Unlock experimental options
        #G1NewSizePercent: 20                  # New generation size (percent)
        #G1MaxNewSizePercent: 40               # Max new generation size (percent)
        MaxGCPauseMillis: "20"                 # Target max GC pause time
        InitiatingHeapOccupancyPercent: "35"   # Start GC when heap occupancy is 35%
        MinMetaspaceFreeRatio: "50"            # Keep at least 50% of metaspace free
        MaxMetaspaceFreeRatio: "80"            # Allow up to 80% of metaspace free
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      log.retention.check.interval.ms: 300000   # 5 minutes (more frequent with short retention)
      # For 1-hour topics:
      log.segment.bytes: 1073741824             # 1GiB (large segments for fewer rotations)
      log.segment.ms: 300000                    # 5 min (rotate frequently)
      log.retention.bytes: -1                   # Disable size-based retention
      #log.retention.ms: 3600000                # 1 hr (default, override per-topic)
      log.cleanup.policy: delete                # Not compacted
      log.cleaner.threads: 2                    # Minimal cleanup overhead
      compression.type: lz4                     # Efficient compression
      # CPU/threading (critical for 2 vCPUs)
      num.network.threads: 3                    # Default (do not exceed vCPUs)
      num.io.threads: 4                         # Slightly higher than cores (for disk I/O)
      background.threads: 2                     # Background tasks (e.g., log cleaning)
      socket.send.buffer.bytes: 1024000         # Optimize network buffers
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
    template:
      clusterCaCert:
        metadata:
          annotations:
            reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
            reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
            reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "sms,sia,sim,mno,audit,timer,scheduled-jobs"
            reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "sms,sia,sim,mno,audit,timer,scheduled-jobs"
  entityOperator:
    topicOperator: {}
    userOperator: {}
```
## Strimzi Kafka Cluster Resource Explained

This section describes the main Kafka custom resource manifest for deploying a Kafka cluster with Strimzi on Kubernetes.

### Metadata

- `name: cluster` is the name of the Kafka cluster.
- `namespace: kafka` deploys the cluster in the `kafka` namespace.
- `annotations`:
    - `strimzi.io/node-pools: enabled` enables node pool support.
    - `strimzi.io/kraft: enabled` enables KRaft mode (no ZooKeeper).
### Spec

- `kafkaExporter` exports metrics for all topics and consumer groups for monitoring.
- `kafka`:
    - `version: 3.9.0` specifies the Kafka version.
    - `jvmOptions` configures JVM heap size and garbage collection for predictable performance.
    - `listeners` defines internal listeners for both plain (non-TLS) and TLS traffic.
    - `config`:
        - Replication factors for offsets and transaction logs are set to 3 for high availability; `min.insync.replicas: 2` preserves data safety during broker failures.
        - Log retention and segment settings optimize storage and performance for short-lived data.
        - Compression is set to `lz4` for efficient storage.
        - Thread and buffer settings are tuned for typical 2 vCPU nodes.
    - `metricsConfig` uses the JMX Prometheus Exporter, loading its rules from the `kafka-metrics` ConfigMap.
    - `template` customizes annotations on the cluster CA certificate secret for cross-namespace secret reflection.
- `entityOperator` enables both the Topic and User Operators for automated topic and user management.
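Because the User Operator is enabled, users can be managed declaratively as well. A minimal sketch of a KafkaUser the operator would reconcile (the user name is hypothetical, not from the original manifests):

```yaml
# Hedged example: a TLS-authenticated user managed by the User Operator.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: example-user          # hypothetical user name
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster   # binds the user to this cluster
spec:
  authentication:
    type: tls                 # operator issues a client certificate secret
```

The operator creates a secret named after the user containing the client certificate, which clients present on the TLS listener (port 9093).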
**Why This Matters**

This configuration provides a production-ready, highly available Kafka cluster with built-in monitoring, optimized JVM and log settings, and automated topic/user management. It is designed for AWS EKS or similar Kubernetes environments.
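The `metricsConfig` section above references a ConfigMap named `kafka-metrics` that must exist in the same namespace. A minimal sketch of what it could contain (the exporter rules shown are illustrative, not the full rule set typically shipped with Strimzi's examples):

```yaml
# Hedged sketch of the ConfigMap referenced by metricsConfig.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-metrics
  namespace: kafka
data:
  kafka-metrics-config.yml: |
    lowercaseOutputName: true
    rules:
      # Example rule: expose broker metrics such as UnderReplicatedPartitions.
      - pattern: "kafka.server<type=(.+), name=(.+)><>Value"
        name: "kafka_server_$1_$2"
```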
## KafkaTopic Configuration

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: db.customers.sia.api
  labels:
    strimzi.io/cluster: cluster
spec:
  topicName: db_customers-sia.api
  partitions: 6
  replicas: 3
  config:
    retention.ms: 3600000
```
## KafkaTopic Resource Explained

The following KafkaTopic manifest defines a Kafka topic managed by Strimzi in your Kubernetes cluster.

### Metadata

- `name: db.customers.sia.api` is the logical name of the topic resource in Kubernetes.
- `labels: strimzi.io/cluster: cluster` associates this topic with the main Kafka cluster managed by Strimzi.
### Spec

- `topicName: db_customers-sia.api` is the actual Kafka topic name created in the cluster.
- `partitions: 6` splits the topic into 6 partitions, enabling parallelism and higher throughput.
- `replicas: 3` gives each partition 3 replicas, ensuring data redundancy and high availability.
- `config`:
    - `retention.ms: 3600000` retains messages for 1 hour (3,600,000 milliseconds).
**Best Practices**

- Using multiple partitions improves scalability and consumer performance.
- Setting `replicas` to match the number of Kafka brokers ensures fault tolerance.
- Adjust `retention.ms` based on your application's data retention requirements.
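Since the cluster config treats 1 hour as the default and expects per-topic overrides (see the `log.retention.ms` comment in the Kafka manifest), longer-lived topics can simply set a different `retention.ms`. A hedged example with a hypothetical topic name:

```yaml
# Illustrative topic with a 24-hour retention override
# (topic name is a placeholder, not from the original manifests).
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: audit-events          # hypothetical topic name
  labels:
    strimzi.io/cluster: cluster
spec:
  partitions: 6
  replicas: 3
  config:
    retention.ms: 86400000    # 24 h, overriding the 1-hour cluster default
```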