NAT Gateway is one of those AWS line items that grows quietly. You provision it once, route all outbound traffic through it, and forget about it. Three months later it’s the third-largest item on your bill and nobody knows why.
We measured this across a 3-AZ production cluster running 400 pods. The bill was $4,200/month for NAT Gateway alone. After two targeted fixes, it dropped to $2,804. Here’s exactly what we did.
What NAT Gateway Actually Costs
Most teams know the $0.045/hour fee — roughly $32/month per gateway. That part is predictable. The surprise is the data processing fee.
Every gigabyte that flows through a NAT Gateway costs $0.045. For a cluster processing 20 TB/month in outbound traffic, that is about $921/month in processing fees on top of the gateway hours.
| Cost Component | Rate | Example (20 TB/month) |
|---|---|---|
| Gateway hours (1 NAT) | $0.045/hour | $32/month |
| Data processing | $0.045/GB | $921/month |
| Cross-AZ data transfer | $0.01/GB | Depends on architecture |
| Total (single NAT, 3 AZs) | | ~$950/month plus cross-AZ transfer |
The cross-AZ charge is the part most teams miss entirely.
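To make the table easy to re-run with your own volumes, here is a minimal sketch of the cost model. It assumes a 720-hour month and 1 TB = 1,024 GB (which is what yields the $921 figure). The function name and the `cross_az_fraction` parameter are ours for illustration, not an AWS API.

```python
# Rates quoted in the article; check current AWS pricing for your region.
GATEWAY_HOURLY_USD = 0.045
PROCESSING_USD_PER_GB = 0.045
CROSS_AZ_USD_PER_GB = 0.01

HOURS_PER_MONTH = 720   # assumption: 30-day month
GB_PER_TB = 1024        # assumption: binary terabytes


def nat_monthly_cost(traffic_tb, gateways=1, cross_az_fraction=0.0):
    """Estimate monthly NAT Gateway cost in USD.

    cross_az_fraction is the share of traffic that crosses an AZ
    boundary on its way to the gateway (0.0 to 1.0).
    """
    traffic_gb = traffic_tb * GB_PER_TB
    hours = gateways * GATEWAY_HOURLY_USD * HOURS_PER_MONTH
    processing = traffic_gb * PROCESSING_USD_PER_GB
    cross_az = traffic_gb * cross_az_fraction * CROSS_AZ_USD_PER_GB
    return {
        "gateway_hours": round(hours, 2),
        "data_processing": round(processing, 2),
        "cross_az": round(cross_az, 2),
        "total": round(hours + processing + cross_az, 2),
    }


if __name__ == "__main__":
    # Single gateway, 20 TB/month, two of three AZs routing cross-AZ.
    print(nat_monthly_cost(20, gateways=1, cross_az_fraction=2 / 3))
```

With no cross-AZ traffic, 20 TB through one gateway comes to roughly $954/month under these assumptions. The cross-AZ term is what varies with your architecture.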
The Cross-AZ Traffic Tax
AWS charges $0.01/GB when data crosses availability zone boundaries. In a standard 3-AZ setup with a single NAT Gateway, every pod in the two “wrong” AZs pays this surcharge automatically.
The total cost for cross-AZ NAT traffic is $0.055/GB — the $0.045 NAT processing fee plus the $0.01 cross-AZ fee. For a workload moving 10 TB/month through cross-AZ NAT, that is $563/month in avoidable charges.
In our cluster, 65% of pods lived in AZs without a NAT Gateway. That cross-AZ charge alone was $1,100/month.
Fix 1: One NAT Gateway Per AZ
The first fix is straightforward: deploy one NAT Gateway in each AZ and route each subnet to its local gateway.
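The math: moving from 1 to 3 NAT Gateways adds $64/month in gateway hours, a fixed cost. Eliminating 10 TB of cross-AZ traffic saves about $100/month, so the net saving at that volume is $36/month. Break-even sits near 6.4 TB/month of cross-AZ traffic ($64 ÷ $0.01/GB); above that, every additional 10 TB saves roughly another $100.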
At our traffic volume, per-AZ NAT saved $1,100/month and cost $64/month extra in gateway fees. Net: $1,036/month saved.
This fix works best when your workloads are evenly spread across AZs. The math breaks down when 90% of your pods already run in one AZ. In that case, the extra gateway hours aren’t worth it.
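The trade-off for this fix is a fixed cost (extra gateway hours) against a per-GB saving (cross-AZ transfer). A rough sketch, using the article's rounded $32/month per gateway and decimal gigabytes, follows. The function names are hypothetical.

```python
GATEWAY_MONTHLY_USD = 32.0    # per-gateway hours, rounded as in the article
CROSS_AZ_USD_PER_GB = 0.01


def per_az_net_saving(cross_az_gb, extra_gateways=2):
    """Monthly USD saved by per-AZ NAT, net of the added gateway hours."""
    saved = cross_az_gb * CROSS_AZ_USD_PER_GB
    added = extra_gateways * GATEWAY_MONTHLY_USD
    return saved - added


def break_even_gb(extra_gateways=2):
    """Cross-AZ volume (GB/month) at which per-AZ NAT starts paying off."""
    return extra_gateways * GATEWAY_MONTHLY_USD / CROSS_AZ_USD_PER_GB


if __name__ == "__main__":
    print(round(per_az_net_saving(10_000), 2))    # the 10 TB example
    print(round(per_az_net_saving(110_000), 2))   # volume implied by a $1,100 cross-AZ charge
    print(round(break_even_gb()))
```

A negative result from `per_az_net_saving` means the extra gateways cost more than they save, which is the skewed-AZ case described above.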
Fix 2: VPC Endpoints for S3 and DynamoDB
Gateway VPC endpoints for S3 and DynamoDB are free. Zero cost. Traffic routed through them bypasses NAT Gateway entirely — no processing fee, no cross-AZ charge.
In production clusters, S3 traffic is usually the biggest NAT Gateway consumer. Log shipping, artifact downloads, backup uploads — all of it goes through NAT by default.
We added gateway endpoints for S3 and DynamoDB with a single Terraform change. Route tables updated automatically. After 7 days of monitoring, S3 traffic dropped off NAT Gateway completely. That was 8 TB/month of data processing charges — $360/month — gone.
Combined with the per-AZ NAT fix, total saving was $1,396/month. Monthly bill: $4,200 → $2,804. Over 12 months that is $16,752 back.
This fix works when your workloads make heavy use of S3 or DynamoDB. It does not help when most traffic is to external APIs — the endpoint only covers AWS services.
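The combined figures can be checked with a few lines of arithmetic. This sketch treats 8 TB as 8,000 GB, which is what reproduces the $360 figure. The variable names are ours.

```python
NAT_PROCESSING_USD_PER_GB = 0.045


def endpoint_saving(offloaded_gb):
    """Monthly NAT processing fees avoided by moving traffic to a gateway endpoint."""
    return offloaded_gb * NAT_PROCESSING_USD_PER_GB


baseline_bill = 4200                      # monthly NAT bill before the fixes
per_az_saving = 1100 - 64                 # cross-AZ charge removed minus extra gateway hours
s3_saving = round(endpoint_saving(8000), 2)   # 8 TB/month of S3 traffic off NAT

total_saving = per_az_saving + s3_saving
new_bill = baseline_bill - total_saving
annual_saving = total_saving * 12

if __name__ == "__main__":
    print(f"monthly saving: ${total_saving:,.0f}")
    print(f"new monthly bill: ${new_bill:,.0f}")
    print(f"annual saving: ${annual_saving:,.0f}")
```

Swap in your own offloaded volume to estimate what an endpoint would remove from the bill before making the change.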
What to Measure First
Before changing anything, pull these metrics from CloudWatch and Cost Explorer.
| What to Check | Where to Look | What It Tells You |
|---|---|---|
| BytesOutToDestination per NAT | CloudWatch → VPC → NAT Gateway | Total data volume through each gateway |
| BytesInFromSource per subnet | CloudWatch → VPC → per-subnet | Which subnets generate most traffic |
| Cross-AZ data transfer | Cost Explorer → filter by “Data Transfer” | How much cross-AZ overhead you’re paying |
| S3 data transfer via NAT | Cost Explorer → filter by “NAT Gateway” + S3 | Opportunity for VPC endpoint |
Run Cost Explorer with this filter: Service = “EC2 - Other”, Usage Type contains “NatGateway”. This shows your exact NAT processing spend separated from EC2 costs.
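If you would rather script the same check, the sketch below is one way to do it. It assumes the Cost Explorer API (`get_cost_and_usage` in boto3) grouped by usage type, with the "contains NatGateway" match done client-side. The helper names are hypothetical, pagination is ignored for brevity, and `boto3` is imported lazily so the parsing function runs without AWS credentials.

```python
from collections import defaultdict


def nat_costs_by_usage_type(response):
    """Sum UnblendedCost per usage type whose name contains 'NatGateway'.

    `response` is expected to have the shape returned by Cost Explorer's
    get_cost_and_usage when grouped by USAGE_TYPE.
    """
    totals = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            usage_type = group["Keys"][0]
            if "NatGateway" in usage_type:
                amount = group["Metrics"]["UnblendedCost"]["Amount"]
                totals[usage_type] += float(amount)
    return dict(totals)


def fetch_ec2_other_by_usage_type(start, end):
    """Query Cost Explorer for 'EC2 - Other' costs grouped by usage type.

    start/end are 'YYYY-MM-DD' strings. Requires boto3 and AWS credentials;
    only the first page of results is returned in this sketch.
    """
    import boto3  # third-party; imported here so the parser above has no dependency

    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        Filter={"Dimensions": {"Key": "SERVICE", "Values": ["EC2 - Other"]}},
        GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    )
```

Usage types are region-prefixed, so expect separate keys for the hourly and per-byte charges. That split is what lets you separate the fixed gateway cost from the data processing cost.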
If cross-AZ transfer is more than 20% of your total NAT cost, per-AZ gateways will pay for themselves. If S3/DynamoDB traffic is more than 30% of NAT data volume, add gateway endpoints immediately.
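Those two thresholds are simple enough to encode as a check. The following is a sketch only: the function name and inputs are hypothetical, and you supply the four numbers from Cost Explorer and CloudWatch yourself.

```python
def recommend_fixes(cross_az_cost, total_nat_cost, s3_ddb_gb, total_nat_gb):
    """Apply the 20% / 30% rules of thumb and return the suggested fixes."""
    fixes = []
    # Rule 1: cross-AZ transfer above 20% of total NAT cost.
    if total_nat_cost > 0 and cross_az_cost / total_nat_cost > 0.20:
        fixes.append("per-AZ NAT gateways")
    # Rule 2: S3/DynamoDB traffic above 30% of NAT data volume.
    if total_nat_gb > 0 and s3_ddb_gb / total_nat_gb > 0.30:
        fixes.append("S3/DynamoDB gateway endpoints")
    return fixes


if __name__ == "__main__":
    # Illustrative inputs, not measurements from the cluster in this post.
    print(recommend_fixes(cross_az_cost=300, total_nat_cost=1000,
                          s3_ddb_gb=5000, total_nat_gb=10000))
```

An empty list means neither rule of thumb fires, and the NAT bill is probably dominated by traffic to external services that these two fixes do not touch.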
We ran both checks in 15 minutes. The data was clear enough to justify the changes before writing a single line of Terraform.