Kubernetes Multi-Tenancy: Resource Quotas, Namespace Isolation, and the Cost of Getting It Wrong

Shared clusters without hard quotas become tragedy-of-the-commons cost problems. One team's memory leak becomes everyone's OOM. Here's how LimitRanges, ResourceQuotas, and namespace cost attribution fix that.

By Amanpreet Kaur
Published: April 17, 2026

At 2:47 AM, the payments team gets paged. Their pods are OOMKilled. The cluster has 96 cores and 384GiB of memory across 12 nodes. The payments namespace is using 8 cores and 32GiB — well within its allocation. The problem is the data-pipeline namespace, which has no resource quota and whose nightly ETL job just consumed 14GiB of memory it was never supposed to touch.

This is the failure mode that shared Kubernetes clusters produce when you skip the governance layer. Namespaces without quotas are an open bar. One team’s memory leak, runaway batch job, or forgotten load test becomes every other team’s production incident.

The fix is two Kubernetes primitives — LimitRange and ResourceQuota — applied consistently before any team gets cluster access. Neither is complex. Applying them after something breaks is significantly more expensive than applying them before.

The Shared Cluster Cost Problem

A shared cluster operates on a fundamentally different economic model than dedicated per-team clusters. Node capacity is pooled. When one namespace consumes beyond its fair share, it does not pay more — it simply evicts other teams’ pods.

The economics of shared clusters are favorable when they work correctly. A shared 3-node cluster running 8 teams at 40-60% average utilization costs substantially less than 8 separate clusters with their own control plane overhead, node minimums, and per-cluster tooling. The efficiency gains are real only when resource boundaries prevent one tenant from starving another.

Without quotas, three failure modes compound:

A memory leak in a development namespace consumes available node memory gradually over 6 hours. The scheduler cannot place new production pods because no node has sufficient free memory. The cluster appears healthy — CPU is at 30% — but every new deployment fails with Insufficient memory until the offending deployment is found and scaled down.

A batch job with no CPU limit runs at full node capacity during business hours. Other pods on the same node get CPU-throttled. Response times increase but the pods do not crash, making the cause harder to trace. The symptom appears as an application slowdown, not a resource problem.

A developer runs a load test against staging without coordination. The HPA scales the staging deployment to 40 replicas. Production deployments fail because the cluster has no remaining schedulable capacity.

[Diagram: namespace isolation design]

LimitRange vs ResourceQuota: What Each Does and When You Need Both

These two objects operate at different scopes and serve different purposes. Using one without the other leaves gaps.

| Dimension | LimitRange | ResourceQuota |
|---|---|---|
| Scope | Per container or pod | Per namespace |
| What it controls | Default requests/limits, min/max per container | Total CPU, memory, object count across namespace |
| Enforcement | Admission controller at pod creation | Admission controller at any resource creation |
| Behavior when missing | Pods created without requests/limits (invisible to scheduler) | Namespace can consume unlimited cluster resources |
| Failure mode when absent | Node overcommit, OOM evictions | Noisy neighbor starves other namespaces |

LimitRange solves the problem of pods created without resource declarations. If a developer writes a Deployment with no resources block, Kubernetes schedules it on a node with no knowledge of what it will actually consume. The node overcommits, and the first memory pressure event triggers evictions. A LimitRange with default values automatically injects requests and limits into any container that omits them.

A sensible LimitRange default for a dev namespace: defaultRequest of 100m CPU and 128Mi memory, default limit of 500m CPU and 512Mi memory, max of 2 CPU and 4Gi memory. This prevents any single container from running unbounded while giving workloads enough headroom for typical development tasks.
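Those defaults translate directly into a manifest. A minimal sketch, assuming a dev namespace named `team-dev` (the object and namespace names are illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: dev-defaults      # illustrative name
  namespace: team-dev     # hypothetical dev namespace
spec:
  limits:
  - type: Container
    defaultRequest:       # injected when a container omits resources.requests
      cpu: 100m
      memory: 128Mi
    default:              # injected when a container omits resources.limits
      cpu: 500m
      memory: 512Mi
    max:                  # admission rejects containers requesting more than this
      cpu: "2"
      memory: 4Gi
```

Because enforcement happens at admission, existing pods are untouched; only pods created after the LimitRange exists pick up the injected defaults.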

ResourceQuota solves the namespace-level consumption problem. It caps the total CPU, memory, and object count that a namespace can consume. When the quota is reached, new pods in that namespace fail admission rather than evicting pods from other namespaces.

A production namespace serving web traffic might get: 20 CPU requests, 40Gi memory requests, 40 CPU limits, 80Gi memory limits. A dev namespace might get 4 CPU requests and 8Gi memory. The gap between dev and production quotas forces teams to be explicit when they need production-equivalent resources for load testing.
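The production figures above map one-to-one onto a ResourceQuota. A sketch, using the payments-prod namespace from the opening incident as the example (the pod-count cap is an assumption, chosen per environment):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: payments-prod
spec:
  hard:
    requests.cpu: "20"      # sum of CPU requests across the namespace
    requests.memory: 40Gi
    limits.cpu: "40"        # sum of CPU limits across the namespace
    limits.memory: 80Gi
    pods: "100"             # object-count cap (assumed value)
```

Note that once a quota covers compute resources, every pod in the namespace must declare requests and limits or it fails admission — which is exactly the gap the LimitRange defaults close.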

[Diagram: LimitRange vs ResourceQuota]

Both objects should be applied at cluster bootstrap before any application namespaces are created. Applying them retroactively to a cluster with running workloads requires a migration window — existing pods are not evicted, but new deployments fail if they exceed the quota.

Namespace Design: Patterns That Enable Cost Attribution

The namespace structure determines whether you can answer the question “how much does team X cost per month?” The answer is only possible if the namespace boundary maps cleanly to a cost owner.

Three patterns cover most organizations:

Team-per-namespace: Each engineering team owns one namespace per environment. payments-prod, payments-staging, payments-dev. Cost attribution is exact — every resource in payments-prod belongs to the payments team. Quota management is per-team per-environment.

Product-per-namespace: Each product line owns a namespace. Multiple teams may deploy into the same product namespace. Attribution is to the product, not the team. This works for organizations that charge back at the product or business unit level rather than the team level.

Shared service namespaces: Infrastructure components (monitoring, ingress, cert-manager) live in dedicated namespaces with their own quotas. Platform costs are separated from application costs. This makes the platform team’s resource consumption visible and prevents it from being silently absorbed into per-team cost attribution.

The label schema is what actually enables cost attribution tooling to produce accurate reports:

| Label | Values | Purpose |
|---|---|---|
| team | payments, data, platform | Maps namespace to owning team |
| environment | prod, staging, dev | Separates cost by environment tier |
| cost-center | cc-1042, cc-2031 | Maps to finance GL code for chargeback |
| product | checkout, analytics | Groups across team boundaries for product-level reporting |

Apply these labels at namespace creation and enforce them via a Gatekeeper constraint that rejects namespace creation without all four labels. Without enforcement, labels drift — teams skip them, use inconsistent values, or forget to update when ownership changes.
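A hypothetical namespace carrying the full schema looks like this (label values are examples from the table above):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments-prod
  labels:
    team: payments
    environment: prod
    cost-center: cc-1042    # finance GL code for chargeback
    product: checkout
```

The Gatekeeper constraint then only needs to assert that all four keys are present on every Namespace object at admission; value validation (e.g. cost-center format) can be layered on later.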

Cost Attribution: Namespace-Level Chargeback With Real Numbers

Kubecost and OpenCost both provide namespace-level cost breakdown with accuracy rates around 97% for on-demand node costs. The remaining 3% comes from shared cluster overhead — control plane, DaemonSets, cluster-wide add-ons — which must be allocated proportionally.

A 3-node cluster running 8 namespaces with the following allocation:

| Namespace | CPU Requested | Memory Requested | Monthly Node Cost Attribution |
|---|---|---|---|
| payments-prod | 8 cores | 32Gi | $486 |
| data-pipeline | 6 cores | 48Gi | $540 |
| auth-prod | 4 cores | 16Gi | $270 |
| frontend-prod | 3 cores | 12Gi | $189 |
| shared-infra | 2 cores | 8Gi | $135 |
| dev (all teams) | 4 cores | 16Gi | $270 |
| staging | 3 cores | 12Gi | $189 |
| monitoring | 2 cores | 8Gi | $121 |

Node cost at $0.192/hr (m5.xlarge) × 3 nodes × 730 hours = $420.48/month. The allocation above distributes node cost to namespace owners based on requested resources, not actual usage. Using actual usage instead of requests is more accurate but creates incentives for teams to under-request resources to minimize their attribution.

[Diagram: cost attribution flow]

The practical recommendation: attribute based on requests for the first 6 months to give teams stable, predictable bills. Switch to actual usage attribution once teams have had time to right-size their resource requests and understand what drives their costs.

Failure Modes and the Defaults to Set on Day One

Five failure modes appear repeatedly in clusters that were bootstrapped without multi-tenancy governance:

| Failure Mode | Symptom | Fix |
|---|---|---|
| No LimitRange on namespace | Pods created without requests/limits; scheduler overcommits node | Apply LimitRange with sensible defaults before first deployment |
| ResourceQuota set too tight | Deployments fail with QuotaExceeded during rollouts; engineers manually delete old pods | Set quota headroom at 2x typical peak, review quarterly |
| HPA ignores quota | HPA scales beyond namespace quota; new replicas fail admission; traffic drops | Set HPA maxReplicas ≤ (quota CPU limit / pod CPU limit) |
| Missing cost-center label | 30% of namespace cost unattributable; finance rejects chargeback report | Enforce label schema via Gatekeeper at namespace creation |
| Dev namespace shares cluster with prod | Load test in dev causes node pressure affecting prod pods | Apply taint/toleration separation or use dedicated node pools for prod |
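The HPA-versus-quota fix can be encoded directly in the HPA spec: cap maxReplicas at the namespace quota divided by the per-pod limit. A sketch, assuming the 40-core production quota limit and the 500m per-container CPU limit used earlier (so 40 / 0.5 = 80 replicas); the Deployment name is hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-hpa       # hypothetical workload
  namespace: payments-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  minReplicas: 3
  maxReplicas: 80          # 40-core quota limit / 500m per-pod limit
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Without this cap, the HPA will happily request replicas the quota can never admit, and the failed admissions show up as dropped traffic rather than an obvious scaling error.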

The defaults to set on day one, before any application teams get access:

| Environment | CPU Request | CPU Limit | Memory Request | Memory Limit | Max Pods |
|---|---|---|---|---|---|
| prod | 20 cores | 40 cores | 40Gi | 80Gi | 100 |
| staging | 8 cores | 16 cores | 16Gi | 32Gi | 50 |
| dev | 4 cores | 8 cores | 8Gi | 16Gi | 30 |

These are starting points, not permanent values. Run Kubecost or OpenCost for 30 days and adjust quotas to match actual peak consumption plus 40% headroom. A quota that is never hit provides no protection. A quota that is constantly hit creates operational friction. The target is a quota hit rate below 5% in steady state.


Shared clusters are worth running. The utilization efficiency, the reduced control plane overhead, the simplified platform tooling — the economics are clear. But shared clusters without quotas are not shared clusters. They are a single team’s cluster that other teams happen to deploy into until the wrong job runs at the wrong time. LimitRanges and ResourceQuotas are the primitives that make multi-tenancy real. Apply them at bootstrap. Review them quarterly. The 2 AM page is optional.

Written by Amanpreet Kaur, Engineer at Zop.Dev