Monitoring
TOC
TOC-ThisREADME
- Monitoring
- TOC
- TOC-ThisREADME
- Monitoring Overview
- Metrics Collection with Kube-Prometheus-Stack
- Health Checks
- Log Aggregation with Loki & Promtail
- Namespace Best Practices
- Kube-Prometheus-Stack Components
- References
Monitoring Overview
This guide describes the monitoring setup for Unibeam microservices running on AWS EKS.
All applications expose `/metrics` and `/health` endpoints on port 8101.
We use Kube-Prometheus-Stack for metrics and alerting, and Loki with Promtail for centralized log aggregation.
Metrics Collection with Kube-Prometheus-Stack
All Unibeam services (e.g., audit-service, mno-service, scheduled-jobs, sia-service, sim-service, sms-service, timer-service, dashboard-service) expose Prometheus-compatible metrics at:
- Endpoint: `/metrics`
- Port: `8101`
Kube-Prometheus-Stack is deployed in the `monitoring` namespace and discovers these endpoints through ServiceMonitor custom resources, which are provided by the Prometheus Operator.
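apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: unibeam-app-monitor
  namespace: monitoring
spec:
  # By default a ServiceMonitor only matches Services in its own namespace.
  # Assumption: the application Services live outside `monitoring`, so match all namespaces.
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app: unibeam-app
  endpoints:
    - port: http          # name of the Service port (expected to map to 8101)
      path: /metrics
      interval: 30s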
ServiceMonitor Setup
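Ensure each Service carries the label that the ServiceMonitor selects on (`app: unibeam-app` in the example above) and exposes a port whose name matches the ServiceMonitor's `port` field, so that it is scraped automatically.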
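As an illustration of the labelling requirement, here is a minimal sketch of a Service that the ServiceMonitor above could match. The Service name and the `unibeam` namespace are hypothetical (the source does not name them), as is the assumption that the pods carry the same `app` label. A ServiceMonitor in `monitoring` only sees Services in other namespaces if its `namespaceSelector` allows it.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: unibeam-app          # hypothetical Service name
  namespace: unibeam         # hypothetical application namespace
  labels:
    app: unibeam-app         # must match spec.selector.matchLabels of the ServiceMonitor
spec:
  selector:
    app: unibeam-app         # assumption: pods are labelled the same way
  ports:
    - name: http             # the ServiceMonitor's `port: http` refers to this name
      port: 8101
      targetPort: 8101
```

The port is referenced by name rather than number, so renaming the Service port without updating the ServiceMonitor would silently stop scraping.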
Health Checks
All apps expose a health endpoint for readiness and liveness probes:
- Endpoint: `/health`
- Port: `8101`
Configure Kubernetes probes as follows:
livenessProbe:
  httpGet:
    path: /health
    port: 8101
  initialDelaySeconds: 10
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /health
    port: 8101
  initialDelaySeconds: 5
  periodSeconds: 10
Health Endpoint
The `/health` endpoint should return HTTP 200 when the service is healthy.
Log Aggregation with Loki & Promtail
- Loki is deployed in the `loki` namespace for centralized log storage and querying.
- Promtail runs in the `promtail` namespace and collects logs from all pods across the cluster, shipping them to Loki.
All logs are searchable in Grafana using labels such as `namespace`, `app`, and `pod`.
Promtail is configured to push logs to the Loki gateway endpoint and can filter out logs from infrastructure namespaces to reduce noise.
# Promtail Config Example
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: promtail
data:
  promtail.yaml: |
    clients:
      - url: http://logz-loki-gateway.loki/loki/api/v1/push
        # Optional: external_labels for multi-region setups
        # external_labels:
        #   region: us-west-2
        # tenant_id: 1
    # Note: `snippets.pipelineStages` is the key used by the Promtail Helm chart values.
    # In a hand-written promtail.yaml, stages go under a scrape config's `pipeline_stages`.
    snippets:
      pipelineStages:
        - drop:
            source: "namespace"
            expression: "(kube-system|kube-public|promtail|loki|thanos|monitoring|argocd|strimzi|kafka|twistlock|scheduled-jobs|reflector|karpenter)"
        - cri: {}
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
Promtail Filtering
The pipeline stages above drop logs from infrastructure namespaces to keep application logs focused and relevant.
Log Search
Use Grafana to query logs by `namespace`, `app`, or `pod` for troubleshooting and auditing.
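The relabel rules in the example configuration set only `namespace` and `pod`, so an `app` label would not come from those rules alone. Below is a minimal sketch of one way to add it, assuming pods carry a Kubernetes label named `app` (the source does not show pod labels, so this is an assumption). It is a fragment of a scrape config, not a complete `promtail.yaml`.

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
      # Assumption: pods are labelled `app: <service-name>`.
      # Kubernetes service discovery exposes pod labels as
      # __meta_kubernetes_pod_label_<labelname>.
      - source_labels: [__meta_kubernetes_pod_label_app]
        target_label: app
```

With that label in place, a stream selector such as `{app="sms-service"}` in Grafana's Explore view would narrow results to a single service.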
Namespace Best Practices
| Namespace | Purpose |
|---|---|
| monitoring | Metrics & alerting |
| loki | Log aggregation |
| promtail | Log shipping |
|  | Application workloads |
Isolation
Keep monitoring and logging components in dedicated namespaces for security and scalability.
Kube-Prometheus-Stack Components
Kube-Prometheus-Stack is a comprehensive monitoring solution for Kubernetes clusters.
It bundles several key services and tools for metrics, alerting, visualization, and monitoring:
| Service | Purpose |
|---|---|
| Prometheus | Collects and stores metrics from Kubernetes and application endpoints. |
| Alertmanager | Manages alerts sent by Prometheus, including routing and notifications. |
| Grafana | Visualizes metrics and logs with customizable dashboards. |
| Node Exporter | Collects hardware and OS metrics from cluster nodes. |
| Kube State Metrics | Exposes cluster resource metrics (Deployments, Pods, etc.). |
| Prometheus Operator | Simplifies deployment and management of Prometheus resources. |
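| Blackbox Exporter | Enables synthetic monitoring (HTTP, TCP, ICMP probes). Usually installed from its own Helm chart alongside the stack. |
| Pushgateway | Allows ephemeral jobs to push metrics to Prometheus. Usually installed from its own Helm chart alongside the stack. |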
| ServiceMonitors & PodMonitors | Discover and scrape metrics from services and pods. |
| Custom Rules & Alerts | Predefined and user-defined Prometheus alerting rules. |
Stack Coverage
The stack covers infrastructure, application, and custom metrics, as well as alerting and visualization, for Kubernetes environments.
Extensibility
You can extend the stack with additional exporters or custom dashboards as needed.
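To make the "Custom Rules & Alerts" row concrete, here is a hedged sketch of a user-defined alert expressed as a PrometheusRule resource. The rule name, group name, alert name, the `job` value, and the `release` label value are all hypothetical. Whether a `release` label is required depends on how the stack's rule selector is configured in your installation.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: unibeam-app-rules              # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack     # assumption: matches the stack's rule selector
spec:
  groups:
    - name: unibeam-availability       # hypothetical group name
      rules:
        - alert: UnibeamTargetDown     # hypothetical alert name
          # `up` is 0 when Prometheus fails to scrape a target.
          # Assumption: the scrape job is named after a Service called unibeam-app.
          expr: up{job="unibeam-app"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Prometheus cannot scrape {{ $labels.pod }} in {{ $labels.namespace }}"
```

The `for: 5m` clause delays firing until the condition has held for five minutes, which avoids alerts during brief restarts.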
For more details, see the Kube-Prometheus-Stack Documentation.