🚀 ArgoCD Repository Overview

📋 Overview

The ArgoCD repository is the single source of truth for deploying and managing all infrastructure and application workloads on the ub-global-us AWS EKS cluster. It implements the App-of-Apps pattern for managing multiple applications through declarative GitOps workflows.

GitOps Principles

All changes to infrastructure and applications are made through Git commits. ArgoCD continuously monitors this repository and automatically syncs changes to the cluster, ensuring the desired state matches the Git state.


🏗️ Directory Structure

argocd/
├── bootstrap/                          # ArgoCD installation and configuration
│   ├── values/
│   │   ├── values.yaml                # Base ArgoCD configuration
│   │   └── values-ub-global-us.yaml   # Environment-specific overrides
│   └── install.sh                      # Installation script
├── general-app/                        # Root App-of-Apps definitions
│   ├── infra-app-of-apps.yaml         # Infrastructure applications parent
│   └── services-app-of-apps.yaml      # Business services parent
├── infra-applications/                 # Infrastructure platform components
│   ├── apps/                          # ArgoCD Application manifests
│   │   ├── cert-manager.yaml
│   │   ├── grafana-loki.yaml
│   │   ├── ingress-nginx.yaml
│   │   ├── kafka.yaml
│   │   ├── kube-prometheus-stack.yaml
│   │   └── promtail.yaml
│   ├── values/                        # Centralized Helm values
│   │   ├── cert-manager/
│   │   │   ├── values.yaml
│   │   │   └── values-ub-global-us.yaml
│   │   ├── grafana-loki/
│   │   │   ├── values.yaml
│   │   │   ├── values-ub-global-us.yaml
│   │   │   ├── values-simple.yaml
│   │   │   ├── values-global.yaml
│   │   │   └── values-6.46.0.yaml
│   │   ├── ingress-nginx/
│   │   ├── kafka/
│   │   ├── kube-prometheus-stack/
│   │   └── promtail/
│   └── kustomization.yaml             # Groups all infrastructure apps
└── services-applications/              # Business application services
    ├── apps/                          # ArgoCD Application manifests
    │   ├── sia-service.yaml
    │   ├── sms-service.yaml
    │   ├── sim-service.yaml
    │   ├── mno-service.yaml
    │   ├── audit-service.yaml
    │   ├── dashboard-service.yaml
    │   ├── timer-service.yaml
    │   └── schedule-service.yaml
    ├── values/                        # Centralized Helm values
    │   ├── sia-service/
    │   │   ├── values.yaml
    │   │   └── values-ub-global-us.yaml
    │   ├── sms-service/
    │   ├── sim-service/
    │   ├── mno-service/
    │   ├── audit-service/
    │   ├── dashboard-service/
    │   ├── timer-service/
    │   └── schedule-service/
    ├── scheduled-jobs/                # Kubernetes CronJob manifests
    │   ├── backup-job.yaml
    │   └── cleanup-job.yaml
    └── kustomization.yaml             # Groups all service apps
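
The kustomization.yaml files in the tree above group the Application manifests of each area. Their contents are not shown in this document, so the following is only a minimal sketch of what infra-applications/kustomization.yaml might look like, assuming it lists every manifest under apps/:

```yaml
# infra-applications/kustomization.yaml (hypothetical contents)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# One entry per ArgoCD Application manifest in apps/
resources:
  - apps/cert-manager.yaml
  - apps/grafana-loki.yaml
  - apps/ingress-nginx.yaml
  - apps/kafka.yaml
  - apps/kube-prometheus-stack.yaml
  - apps/promtail.yaml
```

The services-applications/kustomization.yaml file would follow the same shape with the eight service manifests.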

🎯 App-of-Apps Pattern

🔍 What is App-of-Apps?

The App-of-Apps pattern is an ArgoCD best practice where a parent ArgoCD Application manages multiple child ArgoCD Applications. This creates a hierarchical structure for organizing applications.

┌────────────────────────────────────┐
│     Root Applications (2)          │
├────────────────────────────────────┤
│  • infra-app-of-apps.yaml          │
│  • services-app-of-apps.yaml       │
└──────────────┬─────────────────────┘
       ┌───────┴────────┐
       ▼                ▼
┌────────────────┐  ┌────────────────┐
│ Infrastructure │  │    Services    │
│    (6 apps)    │  │    (8 apps)    │
├────────────────┤  ├────────────────┤
│ • Loki         │  │ • sia-service  │
│ • Prometheus   │  │ • sms-service  │
│ • Kafka        │  │ • sim-service  │
│ • Ingress      │  │ • mno-service  │
│ • Cert-Mgr     │  │ • audit-svc    │
│ • Promtail     │  │ • dashboard    │
│                │  │ • timer        │
│                │  │ • schedule     │
└────────────────┘  └────────────────┘

📄 Root App-of-Apps Definitions

Infrastructure App-of-Apps

Location: general-app/infra-app-of-apps.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: infra-app-of-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/organization/argocd.git
    targetRevision: main
    path: infra-applications/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Services App-of-Apps

Location: general-app/services-app-of-apps.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: services-app-of-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/organization/argocd.git
    targetRevision: main
    path: services-applications/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Deployment Order

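The bootstrap script applies the infrastructure root Application before the services root Application, so platform components (monitoring, ingress, etc.) begin syncing first. The two roots then sync independently: ArgoCD does not hold back the services until the infrastructure is healthy, so services should tolerate platform components that are still starting.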
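
Where ordering between child Applications of the same root matters (for example cert-manager before ingress-nginx), ArgoCD sync waves are one option. The sketch below is an assumption, not the repository's actual configuration: the wave numbers are hypothetical. Lower waves sync first, and ArgoCD waits for a wave to become healthy before starting the next one. Parent Applications can only assess child health if an Application health check is configured, which the resource.customizations entry in the bootstrap values provides.

```yaml
# infra-applications/apps/cert-manager.yaml (metadata excerpt)
metadata:
  name: cert-manager
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # hypothetical: synced first
---
# infra-applications/apps/ingress-nginx.yaml (metadata excerpt)
metadata:
  name: ingress-nginx
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"   # hypothetical: synced once wave 0 is healthy
```

Sync waves only order resources inside one Application, so they would not order the infrastructure root against the services root.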


🗂️ Values File Organization

📐 Values File Structure

Each application follows a layered values approach:

values/
└── <service-name>/
    ├── values.yaml                  # Base configuration (shared across environments)
    └── values-ub-global-us.yaml     # Environment-specific overrides

🔧 Example: Loki Values Structure

values/grafana-loki/
├── values.yaml                      # Not typically used for Loki
├── values-ub-global-us.yaml         # Environment-specific (S3, schema)
├── values-simple.yaml               # Deployment mode configuration
├── values-global.yaml               # Global Loki settings
└── values-6.46.0.yaml              # Chart version defaults

📝 Values File Hierarchy

ArgoCD merges values files in this order (last wins):

  1. Chart defaults (from the Helm chart itself)
  2. values.yaml - Base shared configuration
  3. values-<environment>.yaml - Environment-specific overrides

Override Precedence

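Values in values-ub-global-us.yaml override matching keys in values.yaml. Maps are merged key by key, but scalars and lists (such as env) are replaced as a whole, so an override that sets a list must repeat every entry it wants to keep. Structure your values carefully to avoid unintended overrides.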
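
The precedence rules above can be illustrated with a small worked example. The keys mirror the sim-service values used elsewhere on this page; LOG_LEVEL is a hypothetical variable added only to show how lists behave.

```yaml
# values.yaml (base)
replicaCount: 2
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
env:
  - name: LOG_LEVEL          # hypothetical variable
    value: "info"
---
# values-ub-global-us.yaml (override, listed last)
replicaCount: 3
resources:
  requests:
    memory: "512Mi"
env:
  - name: ENVIRONMENT
    value: "production"
---
# Effective values after Helm merges both files
replicaCount: 3              # scalar replaced by the override
resources:
  requests:
    memory: "512Mi"          # key overridden
    cpu: "250m"              # key inherited from the base file
env:                         # list replaced as a whole: LOG_LEVEL is gone
  - name: ENVIRONMENT
    value: "production"
```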

🎨 Values File Best Practices

Base Values (values.yaml)

  • Generic configurations applicable to all environments
  • Non-sensitive defaults
  • Resource requests/limits (can be overridden per environment)
  • Common labels and annotations

# values/sim-service/values.yaml
replicaCount: 2

image:
  repository: your-registry/sim-service
  pullPolicy: IfNotPresent
  # tag overridden in environment-specific values

service:
  type: ClusterIP
  port: 8080

resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"

Environment-Specific Values (values-ub-global-us.yaml)

  • Image tags (version pinning)
  • Environment variables
  • Resource overrides for production scale
  • Storage configurations
  • External service endpoints (databases, caches)

# values/sim-service/values-ub-global-us.yaml
image:
  tag: "v1.2.3"

replicaCount: 3

env:
  - name: MONGODB_URI
    value: "mongodb+srv://cluster.mongodb.net"
  - name: REDIS_HOST
    value: "redis.redis-cloud.com"
  - name: ENVIRONMENT
    value: "production"

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: sim-api.unibeam.io
      paths:
        - path: /
          pathType: Prefix

resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "1000m"
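
If an endpoint such as MONGODB_URI carries credentials, it should not be written as a literal value in Git. A minimal sketch of the alternative, assuming the sim-service chart copies env entries verbatim into the container spec (not confirmed by this document) and that a Secret with the hypothetical name sim-service-secrets exists in the target namespace:

```yaml
# values/sim-service/values-ub-global-us.yaml (excerpt)
env:
  - name: MONGODB_URI
    valueFrom:
      secretKeyRef:
        name: sim-service-secrets   # hypothetical Secret name
        key: mongodb-uri            # hypothetical key inside the Secret
  - name: ENVIRONMENT
    value: "production"             # non-sensitive values can stay literal
```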

📦 Application Manifests

🎯 Infrastructure Application Example

Location: infra-applications/apps/grafana-loki.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grafana-loki
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  
  # Multi-source application: the chart and the values ref are both entries
  # in spec.sources (ArgoCD ignores spec.source when spec.sources is set)
  sources:
    # Helm chart repository
    - repoURL: https://grafana.github.io/helm-charts
      targetRevision: 6.46.0
      chart: loki

      # Values file references, resolved through the $values ref below
      helm:
        valueFiles:
          - $values/infra-applications/values/grafana-loki/values-6.46.0.yaml
          - $values/infra-applications/values/grafana-loki/values-simple.yaml
          - $values/infra-applications/values/grafana-loki/values-global.yaml
          - $values/infra-applications/values/grafana-loki/values-ub-global-us.yaml

    # Reference to values repository
    - repoURL: https://github.com/organization/argocd.git
      targetRevision: main
      ref: values
  
  destination:
    server: https://kubernetes.default.svc
    namespace: loki
  
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

🎯 Service Application Example

Location: services-applications/apps/sim-service.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sim-service
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  
  # Multi-source application: the chart and the values ref are both entries
  # in spec.sources (ArgoCD ignores spec.source when spec.sources is set)
  sources:
    # Reference to Kubernetes Helm chart repository
    - repoURL: https://github.com/organization/kubernetes.git
      targetRevision: main
      path: Helm/sim-service

      helm:
        valueFiles:
          - $values/services-applications/values/sim-service/values.yaml
          - $values/services-applications/values/sim-service/values-ub-global-us.yaml

    # Reference to values repository
    - repoURL: https://github.com/organization/argocd.git
      targetRevision: main
      ref: values
  
  destination:
    server: https://kubernetes.default.svc
    namespace: sim
  
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 1m

Multi-Source Applications

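ArgoCD applications can reference multiple sources through spec.sources (available since ArgoCD 2.6). A source that sets ref: values exposes its repository as $values, which lets the Helm chart source load values files from a different repository or path than the chart itself. The chart and the ref must both be entries in spec.sources: when sources is set, ArgoCD ignores the singular spec.source field.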


🔄 Sync Policies

Automated Sync Options

syncPolicy:
  automated:
    prune: true       # Remove resources not in Git
    selfHeal: true    # Revert manual changes
  syncOptions:
    - CreateNamespace=true      # Auto-create target namespace
    - ServerSideApply=true      # Use server-side apply (recommended for CRDs)
    - PrunePropagationPolicy=foreground  # Deletion ordering
    - RespectIgnoreDifferences=true      # Honor ignoreDifferences config
  retry:
    limit: 5
    backoff:
      duration: 5s
      factor: 2
      maxDuration: 3m
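
RespectIgnoreDifferences=true only has an effect when the Application also declares spec.ignoreDifferences. A minimal sketch, assuming a Deployment whose replica count is managed outside Git (for example by an autoscaler); the choice of field is illustrative and not taken from this repository:

```yaml
# Application spec excerpt
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas        # drift in this field is not reported as OutOfSync
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true   # and sync does not overwrite the ignored field
```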

📊 Sync Policy Matrix

| Option | Infrastructure Apps | Service Apps | Description |
|--------|---------------------|--------------|-------------|
| automated.prune | true | true | Remove resources not in Git |
| automated.selfHeal | true | true | Revert manual changes |
| CreateNamespace | true | true | Auto-create namespace |
| ServerSideApply | true | ⚠️ Sometimes | For CRDs and large objects |
| retry.limit | 5 | 3 | Retry count on failure |

Self-Heal Implications

With selfHeal: true, any manual changes to resources (e.g., kubectl edit) will be automatically reverted. All changes must go through Git.


🔐 Bootstrap Process

📜 Installation Script

Location: bootstrap/install.sh

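#!/bin/bash
# Run from the repository root: the paths below are relative to it.
# Stop on the first failing command, unset variable, or failed pipeline stage.
set -euo pipefail
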
# Install ArgoCD
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update

helm upgrade --install argocd argo/argo-cd \
  --namespace argocd \
  --create-namespace \
  --values bootstrap/values/values.yaml \
  --values bootstrap/values/values-ub-global-us.yaml \
  --wait

# Wait for ArgoCD to be ready
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/name=argocd-server \
  -n argocd \
  --timeout=300s

# Deploy root App-of-Apps
kubectl apply -f general-app/infra-app-of-apps.yaml
kubectl apply -f general-app/services-app-of-apps.yaml

echo "ArgoCD installation complete!"
echo "Access ArgoCD UI: kubectl port-forward svc/argocd-server -n argocd 8080:443"

🎛️ ArgoCD Configuration

Base Configuration: bootstrap/values/values.yaml

global:
  domain: argocd.unibeam.io

configs:
  params:
    server.insecure: false
    application.instanceLabelKey: argocd.argoproj.io/instance
  
  cm:
    # Repository credentials
    repositories: |
      - url: https://github.com/organization/argocd.git
        type: git
        name: argocd-repo
      
      - url: https://github.com/organization/kubernetes.git
        type: git
        name: kubernetes-repo
    
    # Resource customizations
    resource.customizations: |
      argoproj.io/Application:
        health.lua: |
          hs = {}
          hs.status = "Progressing"
          hs.message = ""
          if obj.status ~= nil then
            if obj.status.health ~= nil then
              hs.status = obj.status.health.status
              if obj.status.health.message ~= nil then
                hs.message = obj.status.health.message
              end
            end
          end
          return hs

server:
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - argocd.unibeam.io
    tls:
      - secretName: argocd-tls
        hosts:
          - argocd.unibeam.io

Environment Overrides: bootstrap/values/values-ub-global-us.yaml

configs:
  cm:
    url: https://argocd.unibeam.io
    
    # OIDC/SSO configuration
    oidc.config: |
      name: Okta
      issuer: https://company.okta.com
      clientID: <client-id>
      clientSecret: $oidc.okta.clientSecret
      requestedScopes: ["openid", "profile", "email", "groups"]

  rbac:
    policy.default: role:readonly
    policy.csv: |
      p, role:org-admin, applications, *, */*, allow
      p, role:org-admin, clusters, get, *, allow
      p, role:org-admin, repositories, get, *, allow
      
      g, devops-team, role:org-admin

server:
  resources:
    requests:
      memory: "256Mi"
      cpu: "250m"
    limits:
      memory: "512Mi"
      cpu: "500m"
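
The RBAC policy above defines a single admin role. As a hedged sketch of how a narrower role could sit next to it, the excerpt below adds a role that may view and sync applications in the default project. The role name role:developer and the group name developers are hypothetical; the existing org-admin lines are repeated because policy.csv is a single string that an override replaces as a whole.

```yaml
# bootstrap/values/values-ub-global-us.yaml (excerpt)
# role:developer and the "developers" group are hypothetical names.
configs:
  rbac:
    policy.default: role:readonly
    policy.csv: |
      p, role:org-admin, applications, *, */*, allow
      p, role:org-admin, clusters, get, *, allow
      p, role:org-admin, repositories, get, *, allow
      p, role:developer, applications, get, default/*, allow
      p, role:developer, applications, sync, default/*, allow

      g, devops-team, role:org-admin
      g, developers, role:developer
```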

🛠️ Common Operations

📝 Adding a New Service

  1. Create Helm Chart (in kubernetes repository)

cd kubernetes/Helm
mkdir new-service
# Create chart structure

  2. Create Values Files (in argocd repository)

cd argocd/services-applications/values
mkdir new-service
touch new-service/values.yaml
touch new-service/values-ub-global-us.yaml

  3. Create Application Manifest

cd argocd/services-applications/apps

Create new-service.yaml:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: new-service
  namespace: argocd
spec:
  project: default
  # Multi-source application: the chart and the values ref are both entries
  # in spec.sources (ArgoCD ignores spec.source when spec.sources is set)
  sources:
    - repoURL: https://github.com/organization/kubernetes.git
      targetRevision: main
      path: Helm/new-service
      helm:
        valueFiles:
          - $values/services-applications/values/new-service/values.yaml
          - $values/services-applications/values/new-service/values-ub-global-us.yaml
    - repoURL: https://github.com/organization/argocd.git
      targetRevision: main
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: new-service
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

  4. Update Kustomization

Edit services-applications/kustomization.yaml:

resources:
  # ...existing services...
  - apps/new-service.yaml

  5. Commit and Push

git add .
git commit -m "Add new-service application"
git push origin main

Auto-Sync

ArgoCD polls the repository every 3 minutes by default, so it typically detects the new application within that interval and then starts deploying it.
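
Step 1 above leaves the chart structure open. As a minimal sketch, a Helm chart needs at least a Chart.yaml next to its templates/ directory. The description and version numbers below are placeholders rather than values taken from the kubernetes repository; running helm create new-service is another way to generate a full scaffold.

```yaml
# kubernetes/Helm/new-service/Chart.yaml (hypothetical contents)
apiVersion: v2               # chart API version used by Helm 3
name: new-service
description: Helm chart for new-service
type: application
version: 0.1.0               # chart version (placeholder)
appVersion: "1.0.0"          # application version (placeholder)
```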

🔄 Updating Service Configuration

  1. Modify Values File

cd argocd/services-applications/values/sim-service
vi values-ub-global-us.yaml

Update the image tag:

image:
  tag: "v1.2.4"  # Changed from v1.2.3

  2. Commit and Push

git add values-ub-global-us.yaml
git commit -m "Update sim-service to v1.2.4"
git push origin main

  3. ArgoCD Auto-Sync

ArgoCD will:

  • Detect the change (within 3 minutes)
  • Perform a diff
  • Sync the application
  • Perform a rolling update

Health Checks

Ensure your application defines proper readiness probes. ArgoCD reports a Deployment as Healthy only after its rollout has completed, which requires the new pods to pass their readiness checks.

🚨 Manual Sync (Emergency)

If auto-sync is disabled or you need immediate deployment:

Via CLI:

argocd app sync sim-service --prune

Via UI:

  1. Navigate to ArgoCD UI
  2. Select application
  3. Click "SYNC" → "SYNCHRONIZE"

🔍 Monitoring and Troubleshooting

📊 Application Health States

| State | Description | Action |
|-------|-------------|--------|
| Healthy | All resources healthy | ✅ No action |
| Progressing | Deployment in progress | ⏳ Wait |
| Degraded | Some resources unhealthy | 🔍 Investigate |
| Suspended | Application suspended | ⏸️ Review suspend reason |
| Missing | Resources not found | ❌ Check manifests |
| Unknown | Health cannot be determined | ⚠️ Check logs |

🐛 Common Issues

❌ Application Not Syncing

Symptoms:

  • Application shows "OutOfSync" but doesn't auto-sync
  • Last sync was long ago

Solutions:

# Check sync policy
kubectl get app sim-service -n argocd -o yaml | grep -A10 syncPolicy

# Force sync
argocd app sync sim-service

# Check application controller logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller

❌ Sync Fails with "Permission Denied"

Symptoms:

  • Sync fails with RBAC errors

Solutions:

# Check service account permissions
kubectl get clusterrole argocd-application-controller -o yaml

# Check if namespace exists
kubectl get namespace <target-namespace>

# Ensure CreateNamespace sync option is set
kubectl get app sim-service -n argocd -o yaml | grep CreateNamespace

❌ Values Not Applied

Symptoms:

  • Configuration changes not reflected

Solutions:

# Verify the values file paths configured on the application
argocd app get sim-service -o yaml | grep -A5 valueFiles

# Check if multiple value files conflict
argocd app get sim-service --show-params

# Hard refresh
argocd app get sim-service --hard-refresh

📈 Monitoring Best Practices

# Watch application sync status
watch argocd app list

# Get detailed application information
argocd app get sim-service

# View sync history
argocd app history sim-service

# Check application events
kubectl get events -n sim --sort-by='.lastTimestamp'

# View ArgoCD logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-server --tail=100 -f

🎯 Best Practices

✅ Repository Organization

  1. Separation of Concerns

    • Keep infrastructure and services separate
    • Use App-of-Apps for logical grouping
    • Maintain clear directory structure

  2. Values File Management

    • Base values for shared configuration
    • Environment-specific overrides for differences
    • Never commit secrets (use external secret management)

  3. Version Control

    • Pin Helm chart versions in production
    • Use semantic versioning for application tags
    • Tag releases for rollback capability
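
For the rule on secrets in the list above, one common approach is the External Secrets Operator, which creates Kubernetes Secrets from an external store. This document does not say which tool the cluster uses, so the sketch below is an assumption throughout: the store name, secret names, and remote key are hypothetical, and the apiVersion depends on the installed operator version.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: sim-service-secrets          # hypothetical
  namespace: sim
spec:
  refreshInterval: 1h                # how often the external value is re-read
  secretStoreRef:
    name: aws-secrets-manager        # hypothetical ClusterSecretStore
    kind: ClusterSecretStore
  target:
    name: sim-service-secrets        # Kubernetes Secret to create
  data:
    - secretKey: mongodb-uri         # key in the Kubernetes Secret
      remoteRef:
        key: prod/sim-service        # hypothetical entry in the external store
        property: mongodb-uri
```

Only this reference is committed to Git; the secret value itself stays in the external store.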

✅ Deployment Strategies

  1. Phased Rollouts

    # In service values
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0
    

  2. Health Checks

    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
    
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 10
    

  3. Resource Management

    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"
    

✅ Security Practices

  1. Principle of Least Privilege

    • Limit ArgoCD service account permissions
    • Use project-based access control
    • Regular RBAC audits

  2. Secret Management

    • Use external secret stores (AWS Secrets Manager)
    • Never commit plain text secrets
    • Rotate credentials regularly

  3. Image Security

    • Pin image tags (never use latest)
    • Scan images for vulnerabilities
    • Use private registries
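
Project-based access control, mentioned in the list above, is expressed with an AppProject. Every manifest on this page uses project: default, so the following is a sketch of a possible tightening and not the current setup. The project name services is hypothetical, and only the sim namespace is listed for brevity.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: services                     # hypothetical project name
  namespace: argocd
spec:
  description: Business services
  # Repositories that Applications in this project may deploy from
  sourceRepos:
    - https://github.com/organization/kubernetes.git
    - https://github.com/organization/argocd.git
  # Clusters and namespaces that Applications in this project may deploy to
  destinations:
    - server: https://kubernetes.default.svc
      namespace: sim
  # Cluster-scoped kinds the project may manage: namespaces only
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
```

An Application opts in by setting spec.project: services instead of default.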


Last Updated: January 2025
Maintained by: DevOps Team