Policy-as-Code with OPA Gatekeeper: Stopping Cloud Waste Before It Deploys


SCPs block cloud-level overprovisioning but can't see inside a Kubernetes cluster. OPA Gatekeeper fills the admission control gap — blocking wasteful pod specs before they ever schedule.

By Muskan Bandta
Published: April 16, 2026 · 8 min read

Your cloud account has Service Control Policies. Your Terraform pipelines have compliance checks. Your tagging strategy covers 90% of resources. And last Thursday, a developer deployed a pod requesting 64 CPU cores and 256 GB of memory into a dev namespace with no resource quotas.

SCPs operate at the cloud API layer. They can prevent someone from launching a p4d.24xlarge instance. They cannot see what happens inside a Kubernetes cluster. Between the cloud guardrails and the running workload, there is an enforcement gap at the Kubernetes admission layer. OPA Gatekeeper fills that gap.

The Gap Between Cloud Guardrails and Kubernetes Reality

Cloud governance tools work at the infrastructure boundary. AWS SCPs restrict which API calls an account can make. Azure Policy controls which resource types can be created. GCP Organization Policies set constraints on project-level operations. None of them see a Kubernetes pod spec.

Inside the cluster, 59% of containers run without CPU limits. Only 13% of requested CPU is actually used. Organizations waste 35% of their Kubernetes spending on overprovisioned resources. The waste does not come from missing cloud policies. It comes from missing admission policies.

[Figure: Governance layers from cloud to runtime]

Each layer catches what the previous one misses. SCPs block prohibited instance types. CI checks catch misconfigured Terraform. Gatekeeper blocks pods that violate cost policies at deploy time. Runtime monitoring flags workloads that drift after deployment. The admission layer is the last gate before a workload starts consuming resources and generating cost.

How Gatekeeper Works: ConstraintTemplates, Constraints, and the Audit Loop

Gatekeeper deploys as a validating admission webhook. When any resource is created, updated, or deleted, the Kubernetes API server pauses the request and sends it to Gatekeeper. Gatekeeper evaluates the request against active constraints using the OPA Rego policy engine, then returns allow or deny. This adds approximately 50ms to admission latency with proper tuning.

[Figure: Admission webhook flow from kubectl to allow/deny]

The system uses two custom resource types. A ConstraintTemplate defines the policy logic in Rego and declares what parameters it accepts. A Constraint instantiates a template with specific values. For example, one ConstraintTemplate defines “containers must have CPU limits below a threshold.” A Constraint instantiates it with the specific threshold: 4 CPU cores for dev namespaces, 16 for production.
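The pairing described above can be sketched in YAML. This is an abridged, illustrative version, not the full Gatekeeper Library template: the real k8scontainerlimits template also canonifies millicore and memory-unit suffixes before comparing, and the names and thresholds here are examples.

```yaml
# ConstraintTemplate: defines the policy logic and its parameters (abridged sketch).
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8scontainerlimits
spec:
  crd:
    spec:
      names:
        kind: K8sContainerLimits
      validation:
        openAPIV3Schema:
          type: object
          properties:
            cpu:
              type: string
            memory:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8scontainerlimits
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          # Sketch: assumes whole-core CPU values like "4"; the library
          # version also handles millicore values like "500m".
          cpu := to_number(container.resources.limits.cpu)
          cpu > to_number(input.parameters.cpu)
          msg := sprintf("container <%v> cpu limit <%v> exceeds max <%v>",
            [container.name, container.resources.limits.cpu, input.parameters.cpu])
        }
---
# Constraint: instantiates the template with a concrete threshold for dev.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sContainerLimits
metadata:
  name: dev-cpu-cap
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["dev"]
  parameters:
    cpu: "4"
    memory: "8Gi"
```

A second Constraint against the same template, with cpu: "16" and matching production namespaces, gives the per-environment thresholds without duplicating any Rego.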

Gatekeeper also runs an audit controller that periodically scans existing resources against all active constraints. This catches resources that were deployed before the constraint existed. Violations appear in the Constraint’s status field, giving operators a real-time inventory of non-compliant workloads.
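Audit results land on the Constraint itself, so kubectl is all you need to read them. A hedged sketch of what the status field might look like (workload names and message text illustrative):

```yaml
# From: kubectl get k8scontainerlimits dev-cpu-cap -o yaml
status:
  auditTimestamp: "2026-04-16T09:00:00Z"
  totalViolations: 2
  violations:
    - enforcementAction: dryrun        # violation recorded, pod not blocked
      kind: Pod
      name: batch-runner-7d9f
      namespace: dev
      message: "container <worker> cpu limit <16> exceeds max <4>"
    - enforcementAction: dryrun
      kind: Pod
      name: notebook-gpu-0
      namespace: dev
      message: "container <jupyter> cpu limit <8> exceeds max <4>"
```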

OPA is a CNCF Graduated project with 4,100 GitHub stars. Production adopters include Goldman Sachs, Netflix, Pinterest, and T-Mobile. 56% of OPA users run it specifically for Kubernetes admission control.

Six Constraints That Stop Cloud Waste at the Admission Layer

The Gatekeeper Library provides ready-made ConstraintTemplates. These six target cost waste directly.

| Constraint | What It Enforces | Key Parameters | What It Prevents |
| --- | --- | --- | --- |
| K8sContainerLimits | Max CPU and memory per container | cpu: 4, memory: 8Gi | A single pod consuming an entire node |
| K8sContainerResources | Every container must declare requests and limits | enforceRequests: true | Pods deployed with no resource boundaries |
| K8sReplicaLimits | Max replica count per Deployment | max_replicas: 20 | Runaway HPA scaling to 200 replicas |
| K8sStorageClass | Restrict which StorageClasses are allowed | allowedClasses: gp3, standard | Developer claiming io2 premium SSD for a test database |
| K8sBlockNodePort | Deny NodePort services | none | Unnecessary load balancer costs from exposed NodePorts |
| Custom: deny-gpu-namespace | Deny GPU requests outside ML namespaces | allowedNamespaces: ml-training | Data engineer requesting 4 A100 GPUs for a Jupyter notebook in the dev namespace |

Beyond the library, teams build custom constraints for their cost model. Common patterns include forcing dev namespace pods to tolerate spot or preemptible nodes, capping PersistentVolumeClaim size at 100 Gi in non-production environments, and denying container images above a size threshold to reduce egress and storage costs.
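The PVC cap is a good example of how small a custom template can be. A sketch under hypothetical names (K8sPVCMaxSize is not a library template), using OPA's built-in units.parse_bytes to compare quantities like 100Gi:

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8spvcmaxsize            # hypothetical custom template name
spec:
  crd:
    spec:
      names:
        kind: K8sPVCMaxSize
      validation:
        openAPIV3Schema:
          type: object
          properties:
            maxSize:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spvcmaxsize
        violation[{"msg": msg}] {
          input.review.kind.kind == "PersistentVolumeClaim"
          requested := input.review.object.spec.resources.requests.storage
          # units.parse_bytes handles Ki/Mi/Gi suffixes.
          units.parse_bytes(requested) > units.parse_bytes(input.parameters.maxSize)
          msg := sprintf("PVC requests %v, above the %v cap for this namespace",
            [requested, input.parameters.maxSize])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPVCMaxSize
metadata:
  name: nonprod-pvc-cap
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["PersistentVolumeClaim"]
    namespaces: ["dev", "staging"]
  parameters:
    maxSize: "100Gi"
```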

The key design principle: constraints should encode the cost boundaries that already exist as tribal knowledge. If the team informally agrees that dev pods should not exceed 4 CPU cores, that agreement should be a constraint, not a Slack reminder.

From Dry-Run to Enforce: The Rollout That Doesn’t Break Production

Deploying Gatekeeper constraints in enforce mode on day one is a recipe for blocked deployments and pager alerts. The three enforcement actions — dryrun, warn, and deny — exist for staged rollout.

[Figure: Three-phase rollout from dryrun to warn to deny]

Dry-run (weeks 1-2): Deploy all constraints with enforcementAction: dryrun. The audit controller scans existing resources and populates violation counts in each Constraint’s status. No deployments are blocked. Share the violation report with team leads. This builds awareness without friction.

Warn (weeks 3-4): Move high-confidence constraints to enforcementAction: warn. Developers see warnings on kubectl apply but their deployments proceed. This is where teams fix existing violations and establish an exception process for legitimate edge cases.

Deny (week 5+): Move validated constraints to enforcementAction: deny. Non-compliant deployments are blocked with a clear error message. New constraints always enter through dryrun first. Quarterly reviews adjust thresholds as team needs change.
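The phase transition is a one-field change on each Constraint, which is what makes the staged rollout cheap to operate. An illustrative example (constraint name and thresholds assumed from earlier in this article):

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sContainerLimits
metadata:
  name: dev-cpu-cap
spec:
  enforcementAction: dryrun    # weeks 1-2; flip to "warn", then "deny"
  match:
    namespaces: ["dev"]
  parameters:
    cpu: "4"
    memory: "8Gi"
```

Because enforcementAction lives on the Constraint rather than the ConstraintTemplate, different constraints can sit in different phases at the same time, so a new policy always enters through dryrun while mature ones stay on deny.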

The exit criterion for each phase: violation count trending downward with no legitimate workloads blocked.

Performance, Gotchas, and What Gatekeeper Cannot Do

Gatekeeper adds approximately 50ms to admission latency when properly configured. Misconfiguration — particularly setting too many webhook threads relative to available CPU — can spike latency to 3 seconds. The key tuning parameter is matching the thread count to the pod’s CPU allocation.

| Gotcha | Symptom | Fix |
| --- | --- | --- |
| Thread misconfiguration | Admission latency spikes to 3+ seconds | Set max-serving-threads equal to GOMAXPROCS; match to CPU allocation |
| Audit controller overload | High memory usage on Gatekeeper pods | Limit audit scope with namespace exclusions; increase audit interval |
| Constraint conflict | Two constraints give contradictory requirements | Use dryrun mode to detect conflicts before enforcing |
| Missing exception process | Developers bypass Gatekeeper by deploying to unmonitored namespaces | Apply constraints cluster-wide with explicit namespace exemptions rather than opt-in |
| Rego complexity | Policy evaluation timeout on complex constraint chains | Keep individual Rego policies under 50 lines; compose simple policies rather than building monolithic ones |
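The thread fix above amounts to a small change on the controller Deployment. A sketch, with values assumed for illustration (verify the flag name against the Gatekeeper version you run):

```yaml
# Excerpt from the gatekeeper-controller-manager Deployment.
spec:
  containers:
    - name: manager
      args:
        - --max-serving-threads=2   # keep webhook threads in line with CPU allocation
      resources:
        limits:
          cpu: "2"                  # thread count and CPU limit should match
          memory: 512Mi
```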

The most important limitation: Gatekeeper only operates at admission time. It cannot right-size a running pod, scale down idle deployments, or optimize existing resource requests. It prevents waste from entering the cluster. It does not remediate waste that already exists. Pair it with runtime tools — VPA for right-sizing, scheduled scaling for non-production environments, and cost monitoring for drift detection.


Policy-as-code at the Kubernetes admission layer closes the enforcement gap between cloud-level SCPs and runtime monitoring. Gatekeeper fills it with roughly 50 ms of added latency, a library of ready-made cost constraints, and a staged rollout model that builds compliance without breaking deployments. Start with the six constraints in this article on dryrun. Audit what you find. Enforce what matters. The 35% of Kubernetes spend currently wasted on overprovisioned resources is addressable — but only if the guardrails exist before the deployment reaches the scheduler.

Written by Muskan Bandta, Engineer at Zop.Dev