Most teams add NetworkPolicy to their Kubernetes clusters for security reasons. They want microsegmentation. They want blast radius control. Cost is not part of the conversation.
That is a mistake. Unrestricted pod egress is a direct line to NAT Gateway data processing charges, cross-AZ transfer fees, and surprise line items on your AWS bill. NetworkPolicy is both a security control and a cost control. Teams that treat it as only one of those leave real money on the table.
The NAT Gateway Tax Nobody Talks About
AWS charges $0.045 per GB of data processed through a NAT Gateway. That is on top of $0.045 per hour just to keep the gateway running: roughly $32.40 per 30-day month before a single byte of traffic.
In a cluster without egress NetworkPolicy, every pod that makes an outbound call to an external API, package registry, or telemetry endpoint runs that traffic through NAT. A cluster with 50 workloads, each pulling 100 MB of dependency updates daily, generates 150 GB/month of NAT-processed data. At $0.045/GB, that is $6.75/month on updates alone. Add application telemetry, health check pings to external endpoints, and log shipping, and that number climbs fast.
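The arithmetic above is simple enough to sketch as a quick model. The rates are AWS's published us-east-1 list prices; the workload counts are the illustrative figures from this section:

```python
# Rough NAT Gateway cost model for the scenario above.
# Rates are AWS us-east-1 list prices; workload figures are illustrative.
NAT_PROCESSING_PER_GB = 0.045   # $/GB of data processed
NAT_HOURLY = 0.045              # $/hour per gateway

def nat_monthly_cost(workloads: int, mb_per_workload_per_day: float, days: int = 30) -> tuple[float, float]:
    """Return (GB processed per month, data-processing cost in dollars)."""
    gb = workloads * mb_per_workload_per_day * days / 1000  # MB -> GB
    return gb, gb * NAT_PROCESSING_PER_GB

gb, cost = nat_monthly_cost(50, 100)
print(f"{gb:.0f} GB/month -> ${cost:.2f} processing + ${NAT_HOURLY * 24 * 30:.2f} fixed")
# 150 GB/month -> $6.75 processing + $32.40 fixed
```

The fixed hourly charge dominates at low volume; the data-processing charge dominates as the cluster grows, which is why it is the line item that sneaks up on teams.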
The real compounding happens at scale. Teams deploy new workloads each sprint. Each workload gets a default-open pod spec. Nobody audits what it calls externally. Six months later, NAT Gateway data processing is a $400/month line item that nobody owns.

Cross-AZ traffic compounds the problem further. AWS charges $0.01 per GB in each direction for data crossing availability zones. A pod in us-east-1a calling a service deployed in us-east-1b pays $0.02/GB round-trip. Without topology-aware routing and without NetworkPolicy guiding traffic to local endpoints, that charge accumulates invisibly alongside the NAT costs.
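Cross-AZ charges follow the same shape. A minimal sketch, where the $0.01/GB/direction rate is AWS's published inter-AZ transfer price and the traffic volume is hypothetical:

```python
CROSS_AZ_PER_GB_PER_DIRECTION = 0.01  # $/GB, billed on both send and receive

def cross_az_monthly_cost(gb_exchanged: float) -> float:
    """Round-trip inter-AZ cost: each GB is charged once outbound, once inbound."""
    return gb_exchanged * CROSS_AZ_PER_GB_PER_DIRECTION * 2

print(f"${cross_az_monthly_cost(310):.2f}")  # 310 GB/month of chatty cross-AZ calls
```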
Why Kubernetes Defaults Are Wide Open
Kubernetes ships with a default-allow networking model. Every pod can reach every other pod and every external endpoint unless something explicitly restricts it. That is not a bug. It is a deliberate design choice to make getting started easy. But “easy to start” and “safe to run at scale” are different requirements.
NetworkPolicy objects define ingress and egress rules for pods. They are namespace-scoped, and they only take effect when your CNI plugin supports enforcement. This is the part most documentation glosses over: the policy object is not enough. You need a policy-aware CNI.
The default AWS VPC CNI (aws-node) does not enforce NetworkPolicy on its own. You must add one of the following: the AWS Network Policy Controller (EKS 1.25 and later, enabled through the VPC CNI addon), Calico deployed as a standalone policy engine alongside aws-node, or Cilium replacing aws-node entirely with an eBPF-based dataplane.
On GKE, NetworkPolicy enforcement requires enabling the --enable-network-policy flag at cluster creation or upgrading to Dataplane V2 (Cilium-based). On AKS, you choose between Azure CNI with Calico or the Azure Network Policy Manager at cluster creation.
The critical implication: if you have NetworkPolicy objects in your cluster but your CNI does not support enforcement, those policies do nothing. Your pods are still wide open. Audit this before assuming you are protected.
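A quick way to run that audit: apply a deny-all policy to a scratch namespace and verify that it actually blocks traffic. A sketch, where the namespace name is hypothetical:

```yaml
# Apply to a throwaway namespace, then exec into any pod in it and
# curl a pod in another namespace. If the request still succeeds,
# your CNI is not enforcing NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: enforcement-probe
  namespace: netpol-test        # hypothetical scratch namespace
spec:
  podSelector: {}               # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress                    # no rules listed: deny all in and out
```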

How NetworkPolicy Actually Reduces Your Bill
The cost reduction mechanism is direct. A default-deny egress policy blocks all outbound traffic from matching pods unless an explicit allow rule permits it. Traffic that never reaches the NAT Gateway generates no data processing charge.
The key insight is that most workloads do not need unrestricted external egress. A backend API pod needs to reach a database and maybe one or two downstream services. It does not need to call arbitrary external APIs. A default-deny egress policy with explicit allows for those specific endpoints eliminates all other outbound traffic from that pod.
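In manifest terms, the default-deny half of that pattern looks like the sketch below. One practical detail worth knowing up front: a bare default-deny also blocks DNS lookups, so most teams pair it with an allow rule for kube-dns. The namespace name is hypothetical; the kube-system selector assumes the standard kubernetes.io/metadata.name label present on Kubernetes 1.21+:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: payments            # hypothetical namespace
spec:
  podSelector: {}                # all pods in the namespace
  policyTypes:
    - Egress                     # no egress rules: deny all outbound
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```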
Teams that have audited their clusters before applying default-deny policies consistently find 40-60% of egress traffic has no legitimate business purpose. Dependency update calls, misconfigured health checks pinging public endpoints, debug tools calling home. These accumulate silently. A single policy change eliminates all of it.
| Scenario | Monthly NAT GB | NAT Processing ($/mo) | Cross-AZ ($/mo) |
|---|---|---|---|
| No NetworkPolicy, 50 workloads | 820 GB | $36.90 | $16.40 |
| Default-deny egress, explicit allows | 310 GB | $13.95 | $6.20 |
| Default-deny + topology-aware routing | 310 GB | $13.95 | $1.40 |
| Savings | 510 GB | $22.95/month | $15.00/month |
One trap to know before you start: NetworkPolicies that select the same pod are additive, and the effective ruleset is the OR of every policy's allows. If pod A has a policy allowing egress to 10.0.0.0/8, and a second policy allows egress to 0.0.0.0/0, pod A can reach anything. One permissive policy silently undoes all restrictions for every pod it selects. This is the most common way teams think they have enforced egress control but have not.
The correct pattern: one default-deny policy per namespace with a podSelector: {} (matching all pods), then separate allow policies for each workload with specific selectors. Never add a wildcard allow, as that defeats the default-deny entirely.
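The allow half of that pattern, sketched for a hypothetical backend that only needs its database. All labels, the CIDR, and the port are illustrative; the policy pairs with a podSelector: {} default-deny in the same namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-db
  namespace: payments            # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: backend-api           # only this workload
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.20.0.0/16   # database subnet
      ports:
        - protocol: TCP
          port: 5432
```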
Building a Default-Deny Egress Policy Without Breaking Production
Do not deploy default-deny on day one. Clusters with undocumented external dependencies will break immediately, and the blast radius is hard to scope in advance. The right sequence is: audit, then enforce.
The audit phase uses existing observability. If you run Cilium, cilium monitor (or Hubble) shows real-time traffic flows per pod. With Calico, flow logs in Calico Enterprise, Tigera's commercial tier, give the same view. Without a policy-aware CNI, you can use VPC Flow Logs filtered to the NAT Gateway's ENIs, then map source IPs back to pod CIDR blocks.
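For the VPC Flow Logs route, a CloudWatch Logs Insights query along these lines surfaces the top talkers. This is a sketch: the ENI ID is a placeholder, and the field names assume the default flow log format:

```
fields @timestamp, srcAddr, dstAddr, bytes
# replace with your NAT Gateway's ENI
| filter interfaceId = "eni-0123456789abcdef0"
| stats sum(bytes) as totalBytes by srcAddr
| sort totalBytes desc
| limit 20
```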
Spend one to two weeks collecting data. Map every namespace to its external endpoints. Identify which pods genuinely need internet access and which ones are calling external URLs by accident or by legacy configuration.

The enforcement phase starts with non-production namespaces. Apply a default-deny egress policy plus explicit allows for the endpoints your audit identified. Run the workloads for 48 hours, watch for connection failures in application logs, and adjust the allow rules.
When non-prod is stable, roll out namespace by namespace in production. Never apply default-deny cluster-wide in a single change. The failure mode if you get it wrong is broken external calls, which can cascade through service dependencies faster than you can roll back.
This approach works well for stateless workloads. It breaks when pods use dynamic external endpoints that are not known at policy-write time — for example, a pod that calls arbitrary customer-provided webhook URLs. In that case, default-deny egress is not viable for that specific workload. Accept the exception, document it, and apply default-deny to every other namespace.
Choosing Your CNI: Calico, Cilium, or AWS Network Policy Controller
The CNI choice determines what NetworkPolicy features you get and how much operational overhead you accept.
Calico is the most widely deployed option. It runs as a daemonset alongside the AWS VPC CNI on EKS, leaving the existing CNI’s IPAM and routing intact while adding policy enforcement. Setup is straightforward: install the Tigera Calico operator, and your existing NetworkPolicy objects start being enforced immediately. Calico also supports GlobalNetworkPolicy (a Calico-specific CRD) which lets you apply cluster-wide default-deny rules without writing a policy per namespace.
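A cluster-wide default-deny with GlobalNetworkPolicy might look like the sketch below. The order value and all() selector use Calico-specific semantics, and in practice you would exempt system namespaces such as kube-system with a more specific allow policy at a lower order:

```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: default-deny-egress
spec:
  order: 2000          # high order: evaluated after more specific allows
  selector: all()      # every pod in every namespace
  types:
    - Egress           # no egress rules listed: deny outbound by default
```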
Cilium replaces the CNI entirely. Its eBPF dataplane bypasses iptables, which matters at scale: iptables rule processing time grows linearly with rule count, while eBPF is O(1). Cilium also provides Hubble, a network observability layer that shows per-connection flow data without additional tooling.
AWS Network Policy Controller (available since EKS 1.25) is the zero-friction option for EKS users. It works natively with the AWS VPC CNI, requires no CNI replacement, and is installable as a managed EKS addon. The tradeoff: it enforces only standard Kubernetes NetworkPolicy spec with no extensions.
| CNI Option | Enforcement Layer | EKS Setup | Observability | Cluster-Wide Policy |
|---|---|---|---|---|
| AWS NPC addon | eBPF node agent via VPC CNI | Managed addon | None built-in | No |
| Calico (standalone) | iptables (eBPF optional) | Operator install | Flow logs (Enterprise) | GlobalNetworkPolicy CRD |
| Cilium | eBPF (replaces CNI) | Full CNI migration | Hubble (built-in) | CiliumClusterwideNetworkPolicy |
For teams already working on Kubernetes multi-tenancy with namespace isolation, default-deny NetworkPolicy is a natural extension: the same namespace boundary that isolates resource quotas also becomes the unit of egress control.
Connecting NetworkPolicy to Your FinOps Process
NetworkPolicy enforcement changes your NAT Gateway cost trajectory. But it only produces lasting savings if the policy is maintained as workloads change. That requires connecting the security and cost disciplines together.
The practical mechanism is a monthly egress review. Run VPC Flow Logs or Cilium Hubble queries to find pods generating the most NAT Gateway traffic. Compare against their NetworkPolicy allow rules. If a pod is hitting allowed endpoints at high volume, those are legitimate calls to optimize at the application layer. If a pod is hitting endpoints not in your allowlist, your policy has a gap.
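The comparison step of that review is mechanical enough to script. A sketch, assuming you have already exported flow records as (pod, destination, bytes) tuples; every name and number here is hypothetical:

```python
from collections import defaultdict

# (pod, destination, bytes) tuples exported from VPC Flow Logs or Hubble;
# contents are illustrative.
flows = [
    ("backend-api", "10.20.0.5", 40_000_000_000),
    ("backend-api", "52.94.10.1", 2_000_000_000),
    ("worker", "registry.example.com", 90_000_000_000),
]
allowlist = {"backend-api": {"10.20.0.5"}, "worker": {"registry.example.com"}}

def egress_review(flows, allowlist):
    """Split traffic into allowed volume (optimize at the app layer) and
    unexpected destinations (policy gaps)."""
    allowed_bytes = defaultdict(int)
    gaps = defaultdict(set)
    for pod, dest, nbytes in flows:
        if dest in allowlist.get(pod, set()):
            allowed_bytes[pod] += nbytes
        else:
            gaps[pod].add(dest)
    return dict(allowed_bytes), dict(gaps)

allowed, gaps = egress_review(flows, allowlist)
print(gaps)  # {'backend-api': {'52.94.10.1'}} -> destination not in the allowlist
```

High allowed_bytes values point at application-layer optimization work; any entry in gaps is a policy hole to close.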
Teams that treat this as a one-time setup see savings erode over 6-12 months as new workloads deploy with incomplete policy coverage. Teams that make egress policy review part of their regular cloud cost governance practice hold the savings.
A sudden spike in NAT Gateway data processing is almost always a new workload without egress policy. Alerting on that metric gives you a feedback loop that catches gaps before they compound.
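One way to wire that alert, as a CloudFormation fragment. BytesOutToDestination is the NAT Gateway metric for traffic leaving toward the internet; the gateway ID and threshold are placeholders to adapt to your baseline:

```yaml
# CloudFormation sketch: alarm when NAT egress volume jumps.
NatEgressSpikeAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: NAT egress spike - check for new workloads without egress policy
    Namespace: AWS/NATGateway
    MetricName: BytesOutToDestination
    Dimensions:
      - Name: NatGatewayId
        Value: nat-0123456789abcdef0   # placeholder
    Statistic: Sum
    Period: 3600                        # one-hour windows
    EvaluationPeriods: 3
    Threshold: 5000000000               # 5 GB/hour, tune to your baseline
    ComparisonOperator: GreaterThanThreshold
```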
NetworkPolicy does not automatically understand your workload’s external dependencies or write allow rules for you. The audit work is manual. The policy maintenance is ongoing. But the cost reduction is real, the security benefit is real, and both come from the same configuration change. Most cloud cost controls demand dedicated engineering effort for a financial benefit alone. This one delivers security and savings simultaneously.

