Strimzi Kafka Cluster Configuration

KafkaNodePool Configuration

kafka-dual-role.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  replicas: 3
  roles:
  - controller
  - broker
  storage:
    type: jbod
    volumes:
    - id: 0
      type: persistent-claim
      size: 250Gi
      class: gp3
      deleteClaim: true
      kraftMetadata: shared
  template:
    pod:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kafka
                    operator: In
                    values:
                      - "true"
      tolerations:
        - key: "kafka"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              strimzi.io/cluster: cluster

🗂️ KafkaNodePool Configuration Explained

This section describes the key elements of the KafkaNodePool manifest for Strimzi Kafka, optimized for a dedicated 3-node worker pool.


๐Ÿท๏ธ Metadata & Labelsยถ

  • name: dual-role – Identifies this node pool.
  • namespace: kafka – Deploys resources in the kafka namespace.
  • labels:
    • strimzi.io/cluster: cluster – Associates this node pool with the main Kafka cluster.

🔢 Spec

  • replicas: 3
    Deploys three Kafka pods, matching your three dedicated worker nodes.

  • roles:
    • controller
    • broker
    Each pod acts as both a controller and a broker, supporting KRaft mode.

  • storage:
    • type: jbod – allows multiple storage volumes.
    • volumes:
      • id: 0
      • type: persistent-claim
      • size: 250Gi
      • class: gp3
      • deleteClaim: true – the PersistentVolumeClaim is removed when the cluster is deleted.
      • kraftMetadata: shared – the KRaft metadata log is stored on this same volume.
    Each pod gets a 250Gi persistent volume using the gp3 storage class.
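
If brokers and controllers ever need to scale independently, the same cluster can use split-role pools instead of a single dual-role pool. The sketch below is a minimal illustration only; the pool names, replica counts, and the smaller controller volume size are assumptions, and the scheduling template from above is omitted for brevity.

kafka-node-pools-split.yaml (hypothetical)
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controllers        # assumed pool name
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  replicas: 3
  roles:
  - controller
  storage:
    type: jbod
    volumes:
    - id: 0
      type: persistent-claim
      size: 50Gi           # assumed; controllers need far less space than brokers
      class: gp3
      deleteClaim: true
      kraftMetadata: shared
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: brokers            # assumed pool name
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  replicas: 3
  roles:
  - broker
  storage:
    type: jbod
    volumes:
    - id: 0
      type: persistent-claim
      size: 250Gi
      class: gp3
      deleteClaim: true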

๐Ÿ—๏ธ Pod Schedulingยถ

  • nodeAffinity:
    Ensures pods are scheduled only on nodes labeled kafka=true.

  • tolerations:
    Allows pods to run on nodes tainted with kafka=true:NoSchedule, ensuring only dedicated Kafka nodes are used.

  • topologySpreadConstraints:
    • maxSkew: 1
      Ensures pods are evenly distributed across nodes (no node has more than one pod difference).
    • topologyKey: kubernetes.io/hostname
      Spreads pods by node hostname.
    • whenUnsatisfiable: DoNotSchedule
      Prevents scheduling if even distribution is not possible.
    • labelSelector:
      Applies only to pods with strimzi.io/cluster: cluster.
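
For the affinity and toleration above to take effect, each dedicated worker node must carry the matching label and taint. A minimal sketch of such a node follows; the node name is hypothetical, and in practice the label and taint are usually applied with kubectl or node-group tooling rather than a Node manifest.

kafka-worker-node.yaml (hypothetical)
apiVersion: v1
kind: Node
metadata:
  name: kafka-worker-1     # hypothetical node name
  labels:
    kafka: "true"          # matched by the pool's nodeAffinity
spec:
  taints:
  - key: kafka
    value: "true"
    effect: NoSchedule     # matched by the pool's tolerations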

Why This Matters

This configuration provides high availability and fault tolerance by ensuring each Kafka pod is isolated on its own dedicated worker node, with persistent storage and strict scheduling rules.


Kafka Configuration

kafka.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cluster
  namespace: kafka
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
  kafka:
    version: 3.9.0
    metadataVersion: "3.9"
    jvmOptions:
      -Xms: 4G  # Initial heap size
      -Xmx: 4G  # Maximum heap size
      -XX:
        UseG1GC: "true"  # Use G1 Garbage Collector
        G1HeapRegionSize: 16M  # Region size for G1 GC
        #UnlockExperimentalVMOptions: "true"  # Unlock experimental options
        #G1NewSizePercent: 20M  # New generation size
        #G1MaxNewSizePercent: 40M  # Max new generation size
        MaxGCPauseMillis: "20"  # Target max GC pause time
        InitiatingHeapOccupancyPercent: "35"  # Start GC when heap occupancy is 35%
        MinMetaspaceFreeRatio: "50"  # Keep at least 50% of metaspace free
        MaxMetaspaceFreeRatio: "80"  # Allow up to 80% of metaspace free
    listeners:
    - name: plain
      port: 9092
      type: internal
      tls: false
    - name: tls
      port: 9093
      type: internal
      tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      log.retention.check.interval.ms: 300000 # 5 minutes (more frequent with short retention)
      # For 1-hour topics:
      log.segment.bytes: 1073741824  # 1GB max segment size
      log.roll.ms: 300000            # Roll a new segment every 5 minutes
      log.retention.bytes: -1       # Disable size-based retention
      #log.retention.ms: 3600000     # 1hr (default, override per-topic)
      log.cleanup.policy: delete    # Not compacted
      log.cleaner.threads: 2       # Minimal cleanup overhead
      compression.type: lz4  # Efficient compression
      # CPU/Threading (Critical for 2 vCPUs)
      num.network.threads: 3             # Kafka default; keep close to the available vCPUs
      num.io.threads: 4                  # Slightly higher than cores (for disk I/O)
      background.threads: 2              # For background tasks (e.g., log cleaning)
      socket.send.buffer.bytes: 1024000  # Optimize network buffers
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
    template:
      clusterCaCert:
        metadata:
          annotations:
            reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
            reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
            reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "sms,sia,sim,mno,audit,timer,scheduled-jobs"
            reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "sms,sia,sim,mno,audit,timer,scheduled-jobs"
  entityOperator:
    topicOperator: {}
    userOperator: {}

⚡ Strimzi Kafka Cluster Resource Explained

This section describes the main Kafka custom resource manifest for deploying a Kafka cluster using Strimzi in Kubernetes.


๐Ÿท๏ธ Metadataยถ

  • name: cluster
    The name of the Kafka cluster.
  • namespace: kafka
    Deploys the cluster in the kafka namespace.
  • annotations:
    • strimzi.io/node-pools: enabled – enables node pool support.
    • strimzi.io/kraft: enabled – enables KRaft mode (no ZooKeeper).

🔢 Spec

  • kafkaExporter:
    Exports metrics for all topics and consumer groups for monitoring.

  • kafka:
    • version: 3.9.0
      Specifies the Kafka version.
    • jvmOptions:
      Configures JVM heap size and garbage collection for optimal performance.
    • listeners:
      Internal listeners for both plain (non-TLS) and TLS traffic.
    • config:
      • Sets replication factors for offsets and transaction logs to 3 for high availability.
      • min.insync.replicas: 2 ensures data safety during broker failures.
      • Log retention and segment settings optimize storage and performance for short-lived data.
      • Compression is set to lz4 for efficient storage.
      • Thread and buffer settings are tuned for typical 2 vCPU nodes.
    • metricsConfig:
      • Uses the JMX Prometheus Exporter for metrics.
      • Loads its configuration from the kafka-metrics ConfigMap.
    • template:
      Customizes cluster CA certificate annotations for cross-namespace secret reflection.

  • entityOperator:
    Enables both the Topic and User Operators for automated topic and user management.
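
The kafka-metrics ConfigMap referenced by metricsConfig is not shown on this page. The sketch below illustrates its expected shape only; the single exporter rule is a placeholder example, not the full rule set shipped in Strimzi's metrics examples.

kafka-metrics.yaml (hypothetical sketch)
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-metrics
  namespace: kafka
data:
  kafka-metrics-config.yml: |
    # Placeholder rule set for the JMX Prometheus Exporter
    lowercaseOutputName: true
    rules:
    - pattern: "kafka.server<type=(.+), name=(.+)><>Value"
      name: "kafka_server_$1_$2"
      type: GAUGE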

Why This Matters

This configuration provides a production-ready, highly available Kafka cluster with built-in monitoring, optimized JVM and log settings, and automated topic/user management. It is designed for AWS EKS or similar Kubernetes environments.
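
With the User Operator enabled, client credentials can also be managed declaratively. A minimal, hypothetical KafkaUser for the TLS listener is sketched below; the name app-user is an example, not part of this deployment.

kafka-user.yaml (hypothetical)
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: app-user           # hypothetical user name
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  authentication:
    type: tls              # the operator issues a client certificate in a Secret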


KafkaTopic Configuration

kafka-topics.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: db.customers.sia.api
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  topicName: db_customers-sia.api
  partitions: 6
  replicas: 3
  config:
    retention.ms: 3600000

🗂️ KafkaTopic Resource Explained

The following KafkaTopic manifest defines a Kafka topic managed by Strimzi in your Kubernetes cluster.


๐Ÿท๏ธ Metadataยถ

  • name: db.customers.sia.api
    The logical name for the topic resource in Kubernetes.
  • labels:
    • strimzi.io/cluster: cluster
      Associates this topic with the main Kafka cluster managed by Strimzi.

🔢 Spec

  • topicName: db_customers-sia.api
    The actual Kafka topic name created in the cluster.
  • partitions: 6
    The topic will be split into 6 partitions, enabling parallelism and higher throughput.
  • replicas: 3
    Each partition will have 3 replicas, ensuring data redundancy and high availability.
  • config:
    • retention.ms: 3600000
      Messages in this topic will be retained for 1 hour (3,600,000 milliseconds).

Best Practices

  • Using multiple partitions improves scalability and consumer performance.
  • Setting replicas to match the number of Kafka brokers ensures fault tolerance.
  • Adjust retention.ms based on your application's data retention requirements.
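
For example, a topic holding longer-lived data might keep messages for a week instead of an hour. The topic name and retention value below are illustrative, not part of this deployment.

kafka-topic-audit.yaml (hypothetical)
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: audit.events        # illustrative topic name
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  partitions: 6
  replicas: 3
  config:
    retention.ms: 604800000  # 7 days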