Single Region Multi-AZ Resiliency

Transitioning from Multi-Region to a mirrored US-East (2 AZ) topology. The objective: maximize high availability while eliminating "Cold Path" failure risks, meaning a standby path that fails silently because it never carries live traffic.

The Core Dilemma

1. Consolidation: Moving all workloads to one region to reduce cost and complexity.
2. The Conflict: Party_A prefers Active/Passive (fear of locking). Party_B prefers Active/Warm (fear of silent failure).
3. The Goal: Prove that Multi-AZ Active/Active is safe and superior.

Architecture Topology

A mirrored stack across two Availability Zones.

Cloudflare (WAF): HTTPS / API traffic

AWS Global Accelerator: TCP / SIM traffic

AWS US Region

Internet Gateway (IGW)

Zone A (Primary): Ingress Firewall (Network Firewall Endpoint), ALB / NLB (Load Balancing), EKS Cluster A (App Pods, Kafka Workers)

Zone B (Standby): Ingress Firewall (Network Firewall Endpoint), ALB / NLB (Load Balancing), EKS Cluster B (App Pods, Kafka Workers)

Shared Data Plane (Active/Active): Redis Cluster, MongoDB Atlas

Decision Matrix

Analysis: Routing 10% of traffic to Zone B provides the best balance. The live traffic continuously validates Zone B's network paths, firewall rules, and IAM permissions, without the complexity of full bi-directional scaling.
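
The 90/10 split can be illustrated with a small, self-contained sketch. It is not how Cloudflare, Global Accelerator, or the load balancers implement weighting; the zone names, the `ZONE_WEIGHTS` table, and the `pick_zone` helper are hypothetical and exist only to show that a fixed share of real requests keeps exercising Zone B, and that failover amounts to changing the weights.

```python
import hashlib
from collections import Counter

# Hypothetical weights: 90% of requests to Zone A, 10% to Zone B.
ZONE_WEIGHTS = {"zone-a": 90, "zone-b": 10}


def pick_zone(request_id: str, weights: dict = ZONE_WEIGHTS) -> str:
    """Map a request ID to a zone deterministically, in proportion to the weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the ID so the same request always lands in the same bucket.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    upper = 0
    for zone, weight in weights.items():
        upper += weight
        if bucket < upper:
            return zone
    raise AssertionError("unreachable: bucket is always below the total weight")


def simulate(n: int = 10_000) -> Counter:
    """Count how many of n synthetic requests land in each zone."""
    return Counter(pick_zone(f"req-{i}") for i in range(n))


if __name__ == "__main__":
    print(simulate())
    # Failing over to Zone B is a weight change, not a new code path.
    print(pick_zone("req-1", {"zone-a": 0, "zone-b": 100}))
```

Because Zone B already serves a slice of production traffic, shifting the weights during an incident reuses a path that has been validated continuously rather than one that has never carried load.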

Charts: Risk vs. Value Profile, Traffic Distribution

Failover Time: ~5s

Resource Waste: Low

Debunking the "Locking" Myth

The fear of database locking is inherited from Multi-Region architectures where latency is high. In a Single-Region Multi-AZ setup, the physics change completely.

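Low Single-Digit Millisecond Latency

Latency between AZs is < 2 ms. To Redis and MongoDB, this looks close to a local LAN. Replication and leader election (for example, MongoDB's Raft-like replica set protocol) are designed to operate comfortably at this latency, so no special handling is needed.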
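
To make the latency argument concrete, here is a deliberately simplified model of how long a majority-acknowledged write waits on the network. It assumes the primary counts as one acknowledgement and that only round-trip time matters (disk and processing time are ignored). The function name and the 70 ms / 140 ms cross-region figures are illustrative assumptions, not measurements; the 2 ms figure is the inter-AZ bound quoted above.

```python
def majority_ack_latency_ms(secondary_rtts_ms: list) -> float:
    """Network wait until a majority of replica-set members hold a write.

    secondary_rtts_ms: round-trip time from the primary to each secondary.
    The primary itself counts as one member of the majority.
    """
    members = len(secondary_rtts_ms) + 1   # secondaries plus the primary
    majority = members // 2 + 1
    acks_needed = majority - 1             # acknowledgements from secondaries
    if acks_needed == 0:
        return 0.0
    # The write is majority-committed once the fastest acks_needed secondaries reply.
    return sorted(secondary_rtts_ms)[acks_needed - 1]


if __name__ == "__main__":
    multi_az = majority_ack_latency_ms([2.0, 2.0])          # inter-AZ bound from the text
    multi_region = majority_ack_latency_ms([70.0, 140.0])   # assumed, illustrative only
    print(f"multi-AZ: {multi_az} ms, multi-region: {multi_region} ms")
```

Under these assumptions the multi-AZ write waits a couple of milliseconds on replication, while the multi-region write waits tens of milliseconds, which is the gap the "locking" concern was originally based on.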

No Application Locks Needed

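MongoDB replica sets elect a single primary. Even when both zones serve traffic, every application instance writes to the *same* primary, so no application-level locks are needed to coordinate writers. On the Redis side, CRDT-based merging of concurrent writes applies to Redis Enterprise Active-Active databases; in a standard Redis Cluster, each key's writes go to the single primary that owns its hash slot, so writers are likewise serialized at one node.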
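
As an illustration of what "merging writes mathematically" means, the sketch below implements a grow-only counter (G-Counter), one of the simplest CRDTs. It is a generic textbook structure, not Redis's implementation; the `GCounter` class and the replica names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter CRDT: each replica only increments its own slot."""

    replica_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


if __name__ == "__main__":
    zone_a, zone_b = GCounter("zone-a"), GCounter("zone-b")
    zone_a.increment(3)   # concurrent writes in each zone, no lock taken
    zone_b.increment(2)
    zone_a.merge(zone_b)
    zone_b.merge(zone_a)
    print(zone_a.value, zone_b.value)
```

Neither replica takes a lock or waits for the other before accepting a write; convergence comes from the merge function's algebraic properties.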

Latency Impact on Consistency

Lower latency = lower risk of sync issues: the faster a write reaches the other replicas, the shorter the window in which they can disagree.