Your cloud account has Service Control Policies. Your Terraform pipelines have compliance checks. Your tagging strategy covers 90% of resources. And last Thursday, a developer deployed a pod requesting 64 CPU cores and 256 GB of memory into a dev namespace with no resource quotas.
SCPs operate at the cloud API layer. They can prevent someone from launching a p4d.24xlarge instance. They cannot see what happens inside a Kubernetes cluster. Between the cloud guardrails and the running workload, there is an enforcement gap at the Kubernetes admission layer. OPA Gatekeeper fills that gap.
The Gap Between Cloud Guardrails and Kubernetes Reality
Cloud governance tools work at the infrastructure boundary. AWS SCPs restrict which API calls an account can make. Azure Policy controls which resource types can be created. GCP Organization Policies set constraints on project-level operations. None of them see a Kubernetes pod spec.
Inside the cluster, 59% of containers run without CPU limits. Only 13% of requested CPU is actually used. Organizations waste 35% of their Kubernetes spending on overprovisioned resources. The waste does not come from missing cloud policies. It comes from missing admission policies.

Each layer catches what the previous one misses. SCPs block prohibited instance types. CI checks catch misconfigured Terraform. Gatekeeper blocks pods that violate cost policies at deploy time. Runtime monitoring flags workloads that drift after deployment. The admission layer is the last gate before a workload starts consuming resources and generating cost.
How Gatekeeper Works: ConstraintTemplates, Constraints, and the Audit Loop
Gatekeeper deploys as a validating admission webhook. When any resource is created, updated, or deleted, the Kubernetes API server pauses the request and sends it to Gatekeeper. Gatekeeper evaluates the request against active constraints using the OPA Rego policy engine, then returns allow or deny. This adds approximately 50ms to admission latency with proper tuning.

The system uses two custom resource types. A ConstraintTemplate defines the policy logic in Rego and declares what parameters it accepts. A Constraint instantiates a template with specific values. For example, one ConstraintTemplate defines “containers must have CPU limits below a threshold.” A Constraint instantiates it with the specific threshold: 4 CPU cores for dev namespaces, 16 for production.
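That template/constraint pair can be sketched as follows. This is a simplified illustration, not the library's actual K8sContainerLimits implementation: the template name `k8scpulimit`, the `maxCpu` parameter, and the Rego are assumptions, and the Rego assumes CPU limits are written in whole cores (e.g. "4") rather than millicores ("4000m"), which the real library handles.

```yaml
# ConstraintTemplate: the policy logic (Rego) plus its parameter schema.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8scpulimit
spec:
  crd:
    spec:
      names:
        kind: K8sCpuLimit
      validation:
        openAPIV3Schema:
          type: object
          properties:
            maxCpu:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8scpulimit

        # Deny containers whose CPU limit exceeds the threshold.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          cpu := to_number(container.resources.limits.cpu)
          cpu > to_number(input.parameters.maxCpu)
          msg := sprintf("container %v has CPU limit %v; max is %v",
            [container.name, cpu, input.parameters.maxCpu])
        }

        # Deny containers with no CPU limit at all.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.cpu
          msg := sprintf("container %v declares no CPU limit", [container.name])
        }
---
# Constraint: instantiates the template with dev-namespace values.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sCpuLimit
metadata:
  name: dev-cpu-limit
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["dev"]
  parameters:
    maxCpu: "4"
```

A second Constraint from the same template, scoped to production namespaces with `maxCpu: "16"`, enforces the higher tier without duplicating any Rego.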
Gatekeeper also runs an audit controller that periodically scans existing resources against all active constraints. This catches resources that were deployed before the constraint existed. Violations appear in the Constraint’s status field, giving operators an up-to-date inventory of non-compliant workloads.
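After an audit pass, the status field of a constraint might look like the excerpt below. The workload names are hypothetical; the field layout follows Gatekeeper's constraint status schema.

```yaml
# Illustrative status excerpt on a dryrun constraint after an audit pass.
status:
  auditTimestamp: "2024-05-02T09:00:00Z"
  totalViolations: 2
  violations:
    - enforcementAction: dryrun
      kind: Pod
      name: load-test-runner        # hypothetical workload
      namespace: dev
      message: "container app declares no CPU limit"
    - enforcementAction: dryrun
      kind: Pod
      name: batch-reprocessor       # hypothetical workload
      namespace: dev
      message: "container worker has CPU limit 8; max is 4"
```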
OPA is a CNCF Graduated project. Production adopters include Goldman Sachs, Netflix, Pinterest, and T-Mobile, and 56% of OPA users run it specifically for Kubernetes admission control.
Six Constraints That Stop Cloud Waste at the Admission Layer
The Gatekeeper Library provides ready-made ConstraintTemplates. These six target cost waste directly.
| Constraint | What It Enforces | Key Parameters | What It Prevents |
|---|---|---|---|
| K8sContainerLimits | Max CPU and memory per container | cpu: 4, memory: 8Gi | A single pod consuming an entire node |
| K8sContainerResources | Every container must declare requests and limits | enforceRequests: true | Pods deployed with no resource boundaries |
| K8sReplicaLimits | Max replica count per Deployment | max_replicas: 20 | Runaway HPA scaling to 200 replicas |
| K8sStorageClass | Restrict which StorageClasses are allowed | allowedClasses: gp3, standard | Developer claiming io2 premium SSD for a test database |
| K8sBlockNodePort | Deny NodePort services | none | Ad hoc exposure on node ports that bypasses the shared ingress layer and its cost controls |
| Custom: deny-gpu-namespace | Deny GPU requests outside ML namespaces | allowedNamespaces: ml-training | Data engineer requesting 4 A100 GPUs for a Jupyter notebook in the dev namespace |
Beyond the library, teams build custom constraints for their cost model. Common patterns include forcing dev namespace pods to tolerate spot or preemptible nodes, capping PersistentVolumeClaim size at 100 Gi in non-production environments, and denying container images above a size threshold to reduce egress and storage costs.
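The GPU example from the table can be sketched as a custom template. The template name, kind, and Rego below are illustrative assumptions; the check looks for the standard `nvidia.com/gpu` resource in container limits and denies it outside the allowed namespaces.

```yaml
# Custom ConstraintTemplate: deny GPU requests outside approved namespaces.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdenygpunamespace
spec:
  crd:
    spec:
      names:
        kind: K8sDenyGpuNamespace
      validation:
        openAPIV3Schema:
          type: object
          properties:
            allowedNamespaces:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdenygpunamespace

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.resources.limits["nvidia.com/gpu"]
          ns := input.review.object.metadata.namespace
          not allowed(ns)
          msg := sprintf("GPU request in namespace %v; GPUs are allowed only in %v",
            [ns, input.parameters.allowedNamespaces])
        }

        allowed(ns) {
          input.parameters.allowedNamespaces[_] == ns
        }
```

A Constraint instantiating this with `allowedNamespaces: ["ml-training"]` implements the table's final row: the Jupyter notebook in dev is denied, the training job in ml-training proceeds.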
The key design principle: constraints should encode the cost boundaries that already exist as tribal knowledge. If the team informally agrees that dev pods should not exceed 4 CPU cores, that agreement should be a constraint, not a Slack reminder.
From Dry-Run to Enforce: The Rollout That Doesn’t Break Production
Deploying Gatekeeper constraints in enforce mode on day one is a recipe for blocked deployments and pager alerts. The three enforcement actions — dryrun, warn, and deny — exist for staged rollout.

Dry-run (weeks 1-2): Deploy all constraints with enforcementAction: dryrun. The audit controller scans existing resources and populates violation counts in each Constraint’s status. No deployments are blocked. Share the violation report with team leads. This builds awareness without friction.
Warn (weeks 3-4): Move high-confidence constraints to enforcementAction: warn. Developers see warnings on kubectl apply but their deployments proceed. This is where teams fix existing violations and establish an exception process for legitimate edge cases.
Deny (week 5+): Move validated constraints to enforcementAction: deny. Non-compliant deployments are blocked with a clear error message. New constraints always enter through dryrun first. Quarterly reviews adjust thresholds as team needs change.
The exit criterion for each phase: violation count trending downward with no legitimate workloads blocked.
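Mechanically, the staged rollout is a one-field change on each constraint. The sketch below uses the library's K8sContainerLimits kind; the constraint name and parameter values are illustrative.

```yaml
# The same constraint moves through the rollout phases by editing one field.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sContainerLimits          # from the Gatekeeper Library
metadata:
  name: container-limits-dev      # illustrative name
spec:
  enforcementAction: dryrun       # weeks 1-2; later edit to warn, then deny
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["dev"]
  parameters:
    cpu: "4"
    memory: "8Gi"
```

Because the policy logic and scope never change between phases, flipping `enforcementAction` carries no risk of a constraint suddenly matching different workloads than it audited.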
Performance, Gotchas, and What Gatekeeper Cannot Do
Gatekeeper adds approximately 50ms to admission latency when properly configured. Misconfiguration — particularly setting too many webhook threads relative to available CPU — can spike latency to 3 seconds. The key tuning parameter is matching the thread count to the pod’s CPU allocation.
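In practice that tuning lives in the controller's Deployment spec. The excerpt below is an illustrative fragment, not a complete manifest, and the specific values (2 threads, 2 CPU) are assumptions to show the relationship: the `--max-serving-threads` flag, `GOMAXPROCS`, and the CPU limit should agree.

```yaml
# Illustrative excerpt of the gatekeeper-controller-manager Deployment:
# keep webhook threads in line with the CPU the pod can actually use.
spec:
  containers:
    - name: manager
      args:
        - --max-serving-threads=2   # match to the CPU limit below
      env:
        - name: GOMAXPROCS
          value: "2"                # keep the Go runtime on the same budget
      resources:
        limits:
          cpu: "2"
          memory: 512Mi
```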
| Gotcha | Symptom | Fix |
|---|---|---|
| Thread misconfiguration | Admission latency spikes to 3+ seconds | Set max-serving-threads equal to GOMAXPROCS; match to CPU allocation |
| Audit controller overload | High memory usage on Gatekeeper pods | Limit audit scope with namespace exclusions; increase audit interval |
| Constraint conflict | Two constraints give contradictory requirements | Use dryrun mode to detect conflicts before enforcing |
| Missing exception process | Developers bypass Gatekeeper by deploying to unmonitored namespaces | Apply constraints cluster-wide with explicit namespace exemptions rather than opt-in |
| Rego complexity | Policy evaluation timeout on complex constraint chains | Keep individual Rego policies under 50 lines; compose simple policies rather than building monolithic ones |
The most important limitation: Gatekeeper only operates at admission time. It cannot right-size a running pod, scale down idle deployments, or optimize existing resource requests. It prevents waste from entering the cluster. It does not remediate waste that already exists. Pair it with runtime tools — VPA for right-sizing, scheduled scaling for non-production environments, and cost monitoring for drift detection.
Policy-as-code at the Kubernetes admission layer closes the enforcement gap between cloud-level SCPs and runtime monitoring. Gatekeeper fills it with roughly 50ms of added latency, a library of ready-made cost constraints, and a staged rollout model that builds compliance without breaking deployments. Start with the six constraints in this article on dryrun. Audit what you find. Enforce what matters. The 35% of Kubernetes spend currently wasted on overprovisioned resources is addressable — but only if the guardrails exist before the deployment reaches the scheduler.