Monitoring

📈 Monitoring Overview 🚦

This guide describes the monitoring setup for Unibeam microservices running on AWS EKS.
All applications expose /metrics and /health endpoints on port 8101.
We use Kube-Prometheus-Stack for metrics and alerting, and Loki with Promtail for centralized log aggregation.


🛠️ Metrics Collection with Kube-Prometheus-Stack

All Unibeam services (e.g., audit-service, mno-service, scheduled-jobs, sia-service, sim-service, sms-service, timer-service, dashboard-service) expose Prometheus-compatible metrics at:

  • Endpoint: /metrics
  • Port: 8101

Kube-Prometheus-Stack is deployed in the monitoring namespace and discovers these endpoints automatically through ServiceMonitor resources, a custom resource type provided by the Prometheus Operator.

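apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
    name: unibeam-app-monitor
    namespace: monitoring
spec:
    # Without a namespaceSelector, only Services in the ServiceMonitor's own
    # namespace are matched. This assumes the application Services live
    # outside the monitoring namespace.
    namespaceSelector:
        any: true
    selector:
        matchLabels:
            app: unibeam-app
    endpoints:
        # "http" is the name of the Service port that serves /metrics.
        - port: http
          path: /metrics
          interval: 30s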

ServiceMonitor Setup

Ensure each Service carries the label the ServiceMonitor selects on (app: unibeam-app in the example above) and names its metrics port http, so that it is scraped automatically.
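A ServiceMonitor matches on the labels of the Service object and refers to the scrape port by name, not by number. The following is a minimal sketch of a Service that the ServiceMonitor above would select. The Service name, the namespace, and the pod selector label are assumptions made for illustration; only the app: unibeam-app label, the port name http, and port 8101 come from this guide.

```yaml
# Hypothetical Service for one Unibeam application.
apiVersion: v1
kind: Service
metadata:
  name: audit-service        # assumed Service name
  namespace: unibeam         # assumed application namespace
  labels:
    app: unibeam-app         # label the ServiceMonitor selects on
spec:
  selector:
    app: audit-service       # assumed pod label
  ports:
    - name: http             # must match "port: http" in the ServiceMonitor
      port: 8101
      targetPort: 8101       # container port serving /metrics and /health
```

Depending on how the Helm chart is configured, Prometheus may only pick up ServiceMonitors that carry the chart's release label, so it is worth checking the Prometheus targets page after applying the manifests.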

📊 Health Checks

All apps expose a health endpoint for readiness and liveness probes:

  • Endpoint: /health
  • Port: 8101

Configure Kubernetes probes as follows:

livenessProbe:
    httpGet:
        path: /health
        port: 8101
    initialDelaySeconds: 10
    periodSeconds: 30
readinessProbe:
    httpGet:
        path: /health
        port: 8101
    initialDelaySeconds: 5
    periodSeconds: 10

Health Endpoint

The `/health` endpoint should return HTTP 200 when the service is healthy.

📚 Log Aggregation with Loki & Promtail

  • Loki is deployed in the loki namespace for centralized log storage and querying.
  • Promtail runs in the promtail namespace, collects logs from pods across the cluster, and ships them to Loki.

All logs are searchable in Grafana using labels such as namespace, app, and pod. Promtail is configured to push logs to the Loki gateway endpoint and can filter out logs from infrastructure namespaces to reduce noise.

# Promtail Config Example
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: promtail
data:
  promtail.yaml: |
    clients:
      - url: http://logz-loki-gateway.loki/loki/api/v1/push
        # Optional: external_labels for multi-region setups
        # external_labels:
        #   region: us-west-2
        #   tenant_id: 1
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        # A complete config also needs a relabel rule that sets __path__
        # so Promtail can locate the pod log files; it is omitted here.
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
        # In a raw promtail.yaml, pipeline stages belong to the scrape job
        # (snippets.pipelineStages is Helm values syntax).
        pipeline_stages:
          - drop:
              source: "namespace"
              expression: "(kube-system|kube-public|promtail|loki|thanos|monitoring|argocd|strimzi|kafka|twistlock|scheduled-jobs|reflector|karpenter)"
          - cri: {}
Promtail Filtering

The pipeline stages above drop logs from infrastructure namespaces to keep application logs focused and relevant.

Log Search

Use Grafana to query logs by `namespace`, `app`, or `pod` for troubleshooting and auditing.
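The scrape configuration shown earlier sets the namespace and pod labels but not app, so app would not be available as a query label without a further relabel rule. The fragment below is a sketch of such a rule, assuming pods carry a Kubernetes label named app; if the workloads use a different label (for example app.kubernetes.io/name), the source label name changes accordingly.

```yaml
# Additional entry for relabel_configs in the kubernetes-pods scrape job.
# Assumption: pods are labelled "app: <service-name>".
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_app]
    target_label: app
```

Pods without the label simply get no app label on their log streams; they remain searchable by namespace and pod.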

🛡️ Namespace Best Practices

| Namespace | Purpose |
|---|---|
| monitoring | Metrics & alerting |
| loki | Log aggregation |
| promtail | Log shipping |
| | Application workloads |

Isolation

Keep monitoring and logging components in dedicated namespaces for security and scalability.

🧰 Kube-Prometheus-Stack Components

Kube-Prometheus-Stack is a comprehensive monitoring solution for Kubernetes clusters.
The following services and tools provide metrics collection, alerting, and visualization:

| Service | Purpose |
|---|---|
| Prometheus | Collects and stores metrics from Kubernetes and application endpoints. |
| Alertmanager | Manages alerts sent by Prometheus, including routing and notifications. |
| Grafana | Visualizes metrics and logs with customizable dashboards. |
| Node Exporter | Collects hardware and OS metrics from cluster nodes. |
| Kube State Metrics | Exposes cluster resource metrics (Deployments, Pods, etc.). |
| Prometheus Operator | Simplifies deployment and management of Prometheus resources. |
| Blackbox Exporter | Enables synthetic monitoring (HTTP, TCP, ICMP probes). Installed separately from the kube-prometheus-stack Helm chart. |
| Pushgateway | Allows ephemeral jobs to push metrics to Prometheus. Installed separately from the kube-prometheus-stack Helm chart. |
| ServiceMonitors & PodMonitors | Discover and scrape metrics from services and pods. |
| Custom Rules & Alerts | Predefined and user-defined Prometheus alerting rules. |

Stack Coverage

The stack covers infrastructure, application, and custom metrics, along with alerting and visualization, for Kubernetes environments.

Extensibility

You can extend the stack with additional exporters or custom dashboards as needed.
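User-defined alerting rules are added through PrometheusRule resources, which the Prometheus Operator loads into Prometheus. The sketch below alerts when a scrape target has been unreachable for five minutes. The rule name, group name, alert name, severity value, and the unibeam namespace are assumptions made for illustration; `up` is the built-in metric Prometheus sets to 0 when a scrape fails.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: unibeam-app-alerts            # hypothetical name
  namespace: monitoring
spec:
  groups:
    - name: unibeam-availability      # hypothetical group name
      rules:
        - alert: UnibeamTargetDown    # hypothetical alert name
          # Fires when a target in the assumed application namespace
          # has failed its scrapes for 5 minutes.
          expr: up{namespace="unibeam"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Scrape target {{ $labels.instance }} has been down for 5 minutes"
```

As with ServiceMonitors, the chart's configuration may require a release label on the PrometheusRule before Prometheus loads it.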

For more details, see the Kube-Prometheus-Stack Documentation.

🔗 References