Resource Limits and Requests

Configure CPU and memory resources for containers. Understand requests, limits, and Quality of Service classes.


In the previous tutorial, we learned about persistent storage. Now let's talk about something that can save your cluster from total chaos — resource management.

Without resource controls, a single misbehaving container can gobble up all CPU and memory on a node, starving every other application. It's like one person at a buffet taking ALL the food and leaving nothing for everyone else.

Resource requests and limits prevent this by defining what your containers need and what they're allowed to use. Good fences make good neighbors, right?

Requests vs Limits

Two concepts you need to understand:

  • Requests: The minimum resources guaranteed to your container. Think of it as a reservation at a restaurant — "I need at least this much."
  • Limits: The maximum resources your container can use. The bouncer that says "you've had enough."

What happens between request and limit?

That's the "burst zone." Your container can use more than its request (if available on the node) but never more than its limit:

┌────────────────────────────────────────────┐
│               Available on Node            │
│  ┌──────────────────────────────────────┐  │
│  │           Limit (max allowed)        │  │
│  │  ┌────────────────────────────────┐  │  │
│  │  │    Request (guaranteed)        │  │  │
│  │  │                                │  │  │
│  │  │    Your container runs here    │  │  │
│  │  │                                │  │  │
│  │  └────────────────────────────────┘  │  │
│  │         Can burst up to limit        │  │
│  └──────────────────────────────────────┘  │
└────────────────────────────────────────────┘

Why This Matters

Okay, do I really need to set these? What if I just... don't?

Oh boy, let me tell you what happens:

Without requests: The scheduler has no idea how many resources your Pod needs. It might cram 20 Pods onto a node that can only handle 5. It's like overbooking a flight — someone's getting bumped.

Without limits: A memory leak can consume ALL node memory, triggering the OOM killer on random pods. A CPU-intensive process can starve other containers. Absolute mayhem.

Set Resource Requests and Limits

Create resource-demo.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

Apply:

kubectl apply -f resource-demo.yaml

This container:

  • Is guaranteed 64Mi memory and 0.25 CPU cores
  • Can use up to 128Mi memory and 0.5 CPU cores
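Once the Pod is running, you can confirm what the API server actually recorded with a jsonpath query:

```shell
# Print the resources block of the first (and only) container.
kubectl get pod resource-demo -o jsonpath='{.spec.containers[0].resources}'
```

You should see the same requests and limits you wrote in the manifest echoed back as JSON.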

Understanding CPU Units

CPU is measured in cores (or millicores). It's a bit weird at first, but you'll get used to it:

Value          Meaning
1              1 CPU core (the whole thing)
0.5 or 500m    Half a CPU core
100m           0.1 CPU cores (100 millicores — a tiny slice)
2              2 CPU cores

What's with the 'm'?

It stands for "millicores." 1000m = 1 core. It's like how 1000 milliseconds = 1 second. A 4-core node has 4000m of CPU available.
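Because everything is in millicores, capacity planning is plain integer math. A quick sanity check with shell arithmetic (the numbers are just the examples from above):

```shell
# A 4-core node exposes 4000m of CPU.
node_cpu_m=4000

# Each Pod requests 250m, like resource-demo above.
pod_request_m=250

# How many such Pods fit on the node by CPU request alone?
echo $((node_cpu_m / pod_request_m))   # prints 16
```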

CPU Behavior

Here's the important part:

  • Below request: Container gets what it asked for. Life is good.
  • Between request and limit: Container can burst if CPU is available on the node. Bonus!
  • At limit: Container is throttled (slowed down, NOT killed). It still runs, just slower.
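You can actually watch throttling happen from inside the container by reading its cgroup stats. This sketch assumes cgroup v2 (where the file is at `/sys/fs/cgroup/cpu.stat`); on cgroup v1 hosts the path differs:

```shell
# nr_throttled counts scheduling periods where the container hit its CPU limit;
# throttled_usec is the total time it spent throttled.
kubectl exec resource-demo -- cat /sys/fs/cgroup/cpu.stat
```

If nr_throttled keeps climbing, your limit is too tight for the workload.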

CPU throttling is annoying but survivable. Unlike memory...

Understanding Memory Units

Memory uses standard units:

Unit   Meaning
Ki     Kibibytes (1024 bytes)
Mi     Mebibytes (1024 Ki)
Gi     Gibibytes (1024 Mi)
K      Kilobytes (1000 bytes)
M      Megabytes (1000 K)
G      Gigabytes (1000 M)

Use Mi and Gi (binary units) for consistency with how Linux reports memory.

Memory Behavior

This is where it gets scary:

  • Below request: Container runs normally. All good.
  • Between request and limit: Container can use more if available on the node.
  • Over limit: Container is OOM killed (Out Of Memory). Dead. Gone. Kubernetes restarts it.

Unlike CPU, memory can't be throttled. You can't tell a process to "use memory more slowly." If you exceed the limit, your container gets killed. No warnings. No second chances. It's the death penalty of resource management.

So always set memory limits with some breathing room!
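Want to see an OOM kill safely, without breaking anything real? Give a memory-hungry container a limit it's guaranteed to blow through. This sketch uses the `polinux/stress` image (an assumption — any memory-stress tool would do) to allocate well past its limit:

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo
spec:
  containers:
  - name: stress
    image: polinux/stress
    resources:
      limits:
        memory: "100Mi"
    command: ["stress"]
    # Try to allocate 250M inside a 100Mi limit: guaranteed OOM kill.
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
EOF

# Watch the container get OOM killed and restarted:
kubectl get pod oom-demo
```

Clean up with `kubectl delete pod oom-demo` when you're done experimenting.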

See Resource Usage

Want to see what your Pods are actually consuming? It's like checking your phone's battery usage:

kubectl top pod resource-demo
NAME            CPU(cores)   MEMORY(bytes)
resource-demo   1m           3Mi

For nodes:

kubectl top nodes

Note: Requires metrics-server. On Minikube, enable it with:

minikube addons enable metrics-server

Wait about a minute for metrics to start flowing in. Patience, young grasshopper.

Quality of Service (QoS) Classes

Here's something interesting — Kubernetes assigns QoS classes based on how you configure resources. Think of it like airline ticket classes — it determines who gets bumped first when things get tight.

Guaranteed (First Class)

Requests equal limits for all containers. You asked for exactly what you need — no more, no less:

resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "128Mi"
    cpu: "500m"

  • Highest priority
  • Last to be evicted
  • Use for critical workloads

Burstable (Business Class)

Requests are set but lower than limits. You have a baseline but can burst when resources are available:

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

  • Medium priority
  • Evicted after BestEffort pods
  • Most common configuration

BestEffort (Economy Class)

No requests or limits set at all:

resources: {}

  • Lowest priority — first to be kicked off the plane
  • First to be evicted when node runs low on resources
  • Avoid for production workloads! Seriously.

Check a Pod's QoS class:

kubectl get pod resource-demo -o jsonpath='{.status.qosClass}'
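To compare QoS classes across all your Pods at once, a custom-columns query works nicely:

```shell
# Name and QoS class of every Pod in the current namespace.
kubectl get pods -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass
```

The resource-demo Pod from earlier should report Burstable, since its requests are set but lower than its limits.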

LimitRange: Default Limits

What if someone forgets to set resource limits on their Pods? You can enforce defaults at the namespace level with LimitRange. Think of it as a safety net.

Create limitrange.yaml:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:          # Default limits
      memory: "512Mi"
      cpu: "1"
    defaultRequest:   # Default requests
      memory: "256Mi"
      cpu: "500m"
    max:              # Maximum allowed
      memory: "1Gi"
      cpu: "2"
    min:              # Minimum required
      memory: "64Mi"
      cpu: "100m"
    type: Container

Apply:

kubectl apply -f limitrange.yaml

Now create a Pod without resource specs:

apiVersion: v1
kind: Pod
metadata:
  name: no-resources
spec:
  containers:
  - name: app
    image: nginx

Check what it got:

kubectl get pod no-resources -o jsonpath='{.spec.containers[0].resources}'

The LimitRange injected the defaults automatically. The developer didn't set anything, and Kubernetes gave them sensible defaults. Safety net activated!

ResourceQuota: Namespace Limits

LimitRange sets per-Pod defaults. ResourceQuota sets the total for an entire namespace — like a departmental budget:

Create resourcequota.yaml:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"

Apply:

kubectl apply -f resourcequota.yaml

Check usage:

kubectl describe resourcequota namespace-quota
Name:            namespace-quota
Resource         Used   Hard
--------         ----   ----
limits.cpu       500m   8
limits.memory    128Mi  16Gi
pods             1      20
requests.cpu     250m   4
requests.memory  64Mi   8Gi

If you try to create Pods that would exceed the quota, Kubernetes rejects them. Sorry, budget's spent!
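You can watch the rejection happen. This sketch tries to create a Pod whose requests alone exceed the quota's 4-CPU budget (the name `quota-buster` is just for illustration):

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: quota-buster
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "5"          # more than the 4-CPU requests.cpu quota
        memory: "64Mi"
      limits:
        cpu: "5"
        memory: "128Mi"
EOF
```

Instead of creating the Pod, the API server returns a Forbidden error mentioning `exceeded quota: namespace-quota`.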

Practical Guidelines

Setting Requests

Don't guess — base requests on actual usage. Run your app and observe:

kubectl top pod <pod-name>

Set requests slightly above average usage.
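One low-tech way to find that average is to sample `kubectl top` over time (a rough sketch — the interval and sample count are arbitrary):

```shell
# Sample CPU/memory every 30 seconds, 10 times, appending to a log.
for i in $(seq 1 10); do
  kubectl top pod resource-demo --no-headers >> usage.log
  sleep 30
done

# Eyeball the samples and set the request just above the typical value.
cat usage.log
```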

Setting Limits

  • Memory: Set limit to handle peak usage plus buffer. Too low = OOM kills.
  • CPU: Set limit higher than request. CPU throttling is better than starving.

Common Patterns

Web application:

resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"

Java application (needs more memory):

resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "1"

Background worker:

resources:
  requests:
    memory: "64Mi"
    cpu: "50m"
  limits:
    memory: "128Mi"
    cpu: "200m"

Troubleshooting

Pod Stuck in Pending

The #1 reason: not enough resources on any node.

kubectl describe pod <pod-name>

Look for:

Events:
  Warning  FailedScheduling  Insufficient cpu
  Warning  FailedScheduling  Insufficient memory

Solutions:

  • Reduce requests
  • Add more nodes
  • Free up resources on existing nodes

OOMKilled

The dreaded Out Of Memory kill. Your container exceeded its memory limit.

kubectl describe pod <pod-name>

Look for:

State:          Terminated
Reason:         OOMKilled
Exit Code:      137

Exit code 137 means the process was killed with SIGKILL (128 + 9), the universal "I got OOM killed" signal. Solutions:

  • Increase memory limit
  • Fix the memory leak in your application (the real fix!)
  • Optimize memory usage

CPU Throttling

Check for high throttling:

kubectl top pod <pod-name>

If CPU usage consistently hits the limit, consider increasing it.

Clean Up

kubectl delete pod resource-demo no-resources 2>/dev/null
kubectl delete limitrange default-limits 2>/dev/null
kubectl delete resourcequota namespace-quota 2>/dev/null

What's Next?

You can control resources now. But as your cluster grows, you need to organize resources. In the next tutorial, you'll learn about Namespaces — logical partitions that help you organize and isolate workloads within your cluster.