In this episode, we'll discuss computational resource management in Kubernetes. We'll learn why you should always specify CPU and memory requests and limits, how resource allocation works, and best practices for resource management.

In the previous episode, we learned about Kubernetes Dashboard for managing clusters through a web interface. In episode 29, we'll discuss computational resources, specifically CPU and memory management, one of the most critical aspects of running production workloads in Kubernetes.
Note: Here I'll be using a Kubernetes cluster installed through K3s.
Understanding resource requests and limits is essential for cluster stability, efficient resource utilization, and preventing one application from starving others. Without proper resource management, your cluster can become unstable, unpredictable, and expensive.
Computational Resources in Kubernetes refer to CPU and memory that containers can consume. Kubernetes allows you to specify how much of these resources a container needs (requests) and the maximum it can use (limits).
Think of resources like a restaurant reservation - requests are your guaranteed table (the minimum you're promised), while limits are the largest party you're allowed to bring (the maximum you can use). The restaurant (node) needs to know both to manage seating effectively.
Key resource types:

- CPU: processing time, measured in cores or millicores.
- Memory: RAM, measured in bytes, usually with binary units such as Mi and Gi.

Understanding the difference between requests and limits is crucial.

Requests define the minimum amount of resources guaranteed to a container; the scheduler uses them to decide where a Pod can run.
Limits define the maximum amount of resources a container can use; the kubelet and container runtime enforce them.
```yaml
resources:
  requests:
    memory: "256Mi"   # Guaranteed minimum
    cpu: "250m"       # Guaranteed minimum
  limits:
    memory: "512Mi"   # Maximum allowed
    cpu: "500m"       # Maximum allowed
```

Rules:

- Requests must be less than or equal to limits.
- The scheduler places Pods based on requests, not limits.
- Limits are enforced at runtime: excess CPU is throttled, excess memory gets the container killed.
Let's understand why resource specification is critical.
Without requests:

```yaml
# Bad: No resource requests
spec:
  containers:
  - name: app
    image: myapp:latest
    # No resources specified
```

Issues:

- The scheduler has no information and may pack too many Pods onto one node.
- The Pod gets BestEffort QoS and is evicted first under resource pressure.
- One application can starve the others.
With requests:

```yaml
# Good: Clear resource requests
spec:
  containers:
  - name: app
    image: myapp:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
```

Benefits:

- The scheduler only places the Pod on a node with enough free capacity.
- The container is guaranteed its requested resources.
- The Pod gets at least Burstable QoS.
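Requests drive scheduling with simple arithmetic: the scheduler sums the requests already placed on a node and checks whether the new Pod still fits. A minimal shell sketch of that check (the node and request numbers are made up for illustration):

```shell
# How the scheduler reasons about requests (simplified):
# a Pod fits on a node if allocatable - sum(existing requests) >= its request.
allocatable_mcpu=3800        # node allocatable CPU in millicores (made-up)
existing_requests_mcpu=3600  # sum of requests already placed (made-up)
new_pod_request_mcpu=250

free=$((allocatable_mcpu - existing_requests_mcpu))
if [ "$free" -ge "$new_pod_request_mcpu" ]; then
  echo "Pod fits (free: ${free}m)"
else
  echo "Pod does not fit (free: ${free}m)"
fi
```

With these numbers the node has only 200m free, so the 250m Pod is rejected - actual usage never enters the calculation, only requests do.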
Without limits:

```yaml
# Bad: No resource limits
spec:
  containers:
  - name: memory-leak-app
    image: leaky:latest
    # No limits - can consume all node memory
```

Issues:

- A memory leak can exhaust the node and trigger evictions of unrelated Pods.
- A CPU-hungry container can slow down everything else on the node.
With limits:

```yaml
# Good: Resource limits protect node
spec:
  containers:
  - name: memory-leak-app
    image: leaky:latest
    resources:
      limits:
        memory: "512Mi"
        cpu: "500m"
```

Benefits:

- The leak is contained: the container is OOMKilled at 512Mi instead of taking down the node.
- CPU usage is capped at 500m, so its neighbors keep running.
Without proper resources, the cluster becomes unstable, unpredictable, and expensive: Pods land on overloaded nodes, get evicted at random, and you over-provision hardware to compensate.

With proper resources, scheduling is deterministic, applications get predictable performance, and you can size the cluster to what workloads actually need.
Kubernetes assigns QoS classes based on resources:

Guaranteed (highest priority): every container has requests and limits, and they are equal.
Burstable (medium priority): at least one container has a request or limit, but the Pod doesn't qualify for Guaranteed.
BestEffort (lowest priority): no container has any requests or limits.

Without resource specifications, Pods get BestEffort QoS - the worst class.
CPU is measured in cores or millicores.

```yaml
# 1 CPU core
cpu: "1"
cpu: "1000m"  # Same as 1 core

# Half CPU core
cpu: "0.5"
cpu: "500m"   # Same as 0.5 core

# Quarter CPU core
cpu: "0.25"
cpu: "250m"   # Same as 0.25 core

# 100 millicores (0.1 core)
cpu: "100m"
```

CPU is compressible: a container that tries to use more CPU than its limit is throttled, not killed. The application keeps running, just slower.
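Millicores are just thousandths of a core, so the conversions above can be sanity-checked with a tiny shell sketch (the helper function is ours for illustration, not a kubectl feature):

```shell
# Convert a core count to Kubernetes millicore notation.
# awk handles the fractional math, since shell $(( )) is integer-only.
cores_to_millicores() {
  awk -v c="$1" 'BEGIN { printf "%dm\n", c * 1000 }'
}

cores_to_millicores 1     # 1000m
cores_to_millicores 0.5   # 500m
cores_to_millicores 0.25  # 250m
```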
Example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "250m"  # Guaranteed 0.25 core
      limits:
        cpu: "500m"  # Max 0.5 core
```

Behavior:

- The container is guaranteed 250m of CPU time.
- It can burst up to 500m when the node has spare capacity.
- Above 500m, the kernel throttles it.
When a container exceeds its CPU limit:

```bash
# Check CPU usage
kubectl top pod cpu-demo

# View detailed metrics
kubectl describe pod cpu-demo
```

Throttled containers show CPU usage pinned at the limit and increased response latency, while the Pod itself stays Running.
Memory is measured in bytes with standard units.

```yaml
# Bytes
memory: "134217728"  # 128 MiB in bytes

# Kibibytes (1024 bytes)
memory: "131072Ki"   # 128 MiB

# Mebibytes (1024 KiB)
memory: "128Mi"      # 128 MiB

# Gibibytes (1024 MiB)
memory: "1Gi"        # 1 GiB

# Decimal units (less common)
memory: "128M"       # 128 MB (1000-based)
memory: "1G"         # 1 GB (1000-based)
```

Note

Use binary units (Ki, Mi, Gi) for consistency with how operating systems report memory.
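The gap between binary and decimal units is easy to underestimate - a quick shell check:

```shell
# 128Mi (binary, 1024-based) versus 128M (decimal, 1000-based), in bytes
echo $((128 * 1024 * 1024))   # 134217728 bytes for 128Mi
echo $((128 * 1000 * 1000))   # 128000000 bytes for 128M
# The binary unit is about 4.9% larger - enough to matter near an OOM threshold.
```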
Memory is incompressible: a container that exceeds its memory limit cannot simply be slowed down - the kernel kills it (OOMKilled) and Kubernetes restarts it.
Example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        memory: "256Mi"  # Guaranteed 256 MiB
      limits:
        memory: "512Mi"  # Max 512 MiB
```

Behavior:

- The container is guaranteed 256 MiB.
- It may use up to 512 MiB.
- If it allocates beyond 512 MiB, it is OOMKilled.
When a container exceeds its memory limit:

```bash
# Check Pod status
kubectl get pod memory-demo

# Output shows OOMKilled
NAME          READY   STATUS      RESTARTS   AGE
memory-demo   0/1     OOMKilled   3          2m

# View events
kubectl describe pod memory-demo
# Events show:
#   Reason: OOMKilled
#   Message: Container exceeded memory limit
```

Let's look at complete examples for common workload types.

Web application Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
```

Database StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        env:
        - name: POSTGRES_PASSWORD
          value: "secretpassword"  # Use a Secret in production
```

Background worker Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 5
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
      - name: worker
        image: myworker:latest
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
```

Multi-container Pod with a sidecar - remember that every container needs its own resources:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      # Main application
      - name: api
        image: myapi:latest
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
      # Sidecar (logging agent)
      - name: log-agent
        image: fluent/fluentd:v1.16
        resources:
          requests:
            memory: "64Mi"
            cpu: "50m"
          limits:
            memory: "128Mi"
            cpu: "100m"
```

Kubernetes assigns QoS classes automatically based on resource specifications.
Guaranteed QoS requirements:

- Every container in the Pod has both memory and CPU requests and limits.
- For each container, requests equal limits.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"  # Same as requests
        cpu: "250m"      # Same as requests
```

Characteristics:

- Highest priority; last to be evicted under node pressure.
- Most predictable performance, but no headroom for bursting.
Burstable QoS requirements:

- The Pod doesn't meet the Guaranteed criteria.
- At least one container has a memory or CPU request or limit.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"  # Higher than requests
        cpu: "200m"      # Higher than requests
```

Characteristics:

- Medium priority; evicted after BestEffort Pods but before Guaranteed ones.
- Can burst above its requests up to its limits when the node has spare capacity.
BestEffort QoS requirements:

- No container in the Pod has any memory or CPU requests or limits.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
spec:
  containers:
  - name: app
    image: nginx:1.25
    # No resources specified
```

Characteristics:

- Lowest priority; first to be evicted under node pressure.
- No resource guarantees at all.
Warning

Avoid BestEffort QoS in production. Always specify at least requests for predictable behavior.
Verify a Pod's QoS class:

```bash
kubectl get pod guaranteed-pod -o jsonpath="{.status.qosClass}"
# Output: Guaranteed

kubectl get pod burstable-pod -o jsonpath="{.status.qosClass}"
# Output: Burstable

kubectl get pod besteffort-pod -o jsonpath="{.status.qosClass}"
# Output: BestEffort
```

A ResourceQuota limits the total resources in a namespace.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: development
spec:
  hard:
    requests.cpu: "10"       # Total CPU requests
    requests.memory: "20Gi"  # Total memory requests
    limits.cpu: "20"         # Total CPU limits
    limits.memory: "40Gi"    # Total memory limits
    pods: "50"               # Max number of Pods
```

Apply the quota:

```bash
kubectl apply -f resource-quota.yml
```

Check quota usage:

```bash
kubectl describe resourcequota compute-quota -n development
```

Output:
```text
Name:            compute-quota
Namespace:       development
Resource         Used   Hard
--------         ----   ----
limits.cpu       5      20
limits.memory    10Gi   40Gi
pods             15     50
requests.cpu     2.5    10
requests.memory  5Gi    20Gi
```

When the quota is exceeded:
```bash
kubectl apply -f deployment.yml -n development

# Error: exceeded quota
Error from server (Forbidden): error when creating "deployment.yml":
pods "app-xyz" is forbidden: exceeded quota: compute-quota,
requested: requests.memory=1Gi, used: requests.memory=19.5Gi,
limited: requests.memory=20Gi
```

A LimitRange sets defaults and min/max resources for individual containers.
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
  namespace: development
spec:
  limits:
  # Container limits
  - type: Container
    default:
      memory: "512Mi"   # Default limit
      cpu: "500m"
    defaultRequest:
      memory: "256Mi"   # Default request
      cpu: "250m"
    max:
      memory: "2Gi"     # Maximum limit
      cpu: "2000m"
    min:
      memory: "64Mi"    # Minimum request
      cpu: "50m"
  # Pod limits
  - type: Pod
    max:
      memory: "4Gi"
      cpu: "4000m"
```

Apply the limit range:

```bash
kubectl apply -f limit-range.yml
```

Without resources specified:
```yaml
# Pod definition without resources
spec:
  containers:
  - name: app
    image: nginx:1.25
    # No resources specified
```

Kubernetes applies the defaults:
```yaml
# Automatically applied by LimitRange
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        memory: "256Mi"  # From defaultRequest
        cpu: "250m"
      limits:
        memory: "512Mi"  # From default
        cpu: "500m"
```

View node resources:

```bash
kubectl top nodes
```

Output:
```text
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1   850m         42%    3.2Gi           40%
node-2   1200m        60%    4.5Gi           56%
```

View Pod resources:
```bash
kubectl top pods
```

Output:

```text
NAME                  CPU(cores)   MEMORY(bytes)
web-app-abc123-xyz    150m         180Mi
database-def456-uvw   450m         850Mi
worker-ghi789-rst     200m         320Mi
```

View Pod resources in a specific namespace:

```bash
kubectl top pods -n production
```

These commands require the Metrics Server to be installed:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

Verify the Metrics Server:

```bash
kubectl get deployment metrics-server -n kube-system
```

Check node allocatable resources:
```bash
kubectl describe node node-1
```

Output shows:

```text
Capacity:
  cpu:     4
  memory:  8Gi
Allocatable:
  cpu:     3800m
  memory:  7.5Gi
Allocated resources:
  Resource  Requests      Limits
  --------  --------      ------
  cpu       2500m (65%)   5000m (131%)
  memory    4Gi (53%)     8Gi (106%)
```

Note that limits can exceed 100% of capacity (overcommitment) because the scheduler only reserves requests.

Problem: Pods without resources get BestEffort QoS.
Solution: Always specify at least requests:

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
```

Problem: Containers are frequently OOMKilled or throttled because limits are too low.
Solution: Monitor actual usage and adjust:

```bash
# Monitor usage
kubectl top pod myapp
```

```yaml
# Adjust limits based on actual usage
resources:
  limits:
    memory: "512Mi"  # Increased from 256Mi
```

Problem: Pods can't be scheduled because requests are set far higher than actual usage.
Solution: Set requests based on actual minimum needs:

```yaml
resources:
  requests:
    memory: "128Mi"  # Reduced from 512Mi
    cpu: "100m"      # Reduced from 500m
```

Problem: Setting limits equal to requests everywhere wastes resources and prevents bursting.
Solution: Allow bursting for variable workloads:

```yaml
resources:
  requests:
    memory: "256Mi"  # Baseline
    cpu: "250m"
  limits:
    memory: "512Mi"  # Can burst 2x
    cpu: "500m"
```

Problem: Forgetting to set resources for all containers in a Pod.
Solution: Specify resources for every container, including sidecars:

```yaml
containers:
- name: app
  resources:
    requests:
      memory: "256Mi"
      cpu: "250m"
- name: sidecar
  resources:
    requests:
      memory: "64Mi"  # Don't forget the sidecar
      cpu: "50m"
```

Before setting resources:

- Profile the application locally or in staging.
- Monitor actual usage under realistic load, not just at idle.
- Account for startup spikes and peak traffic, not just averages.
CPU: set requests near typical usage and leave burst headroom in the limit - throttling hurts latency but isn't fatal.

Memory: set the limit comfortably above the observed peak - exceeding it means an OOMKill, not a slowdown.
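As a starting point, a request slightly above the observed peak and a limit with extra headroom works for many services. A shell sketch of that rule of thumb (the 1.2x and 1.5x margins are our illustrative assumptions, not an official Kubernetes rule):

```shell
# Derive a starting memory request/limit from an observed peak (in Mi).
# The 1.2x and 1.5x margins are illustrative rules of thumb.
suggest_memory() {
  awk -v p="$1" 'BEGIN {
    printf "requests.memory: %.0fMi\n", p * 1.2
    printf "limits.memory: %.0fMi\n", p * 1.5
  }'
}

suggest_memory 180   # for an observed peak of 180Mi
```

For a 180Mi peak this suggests a 216Mi request and a 270Mi limit - adjust the margins to your own tolerance for OOMKills versus wasted capacity.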
Use a LimitRange for consistent defaults:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: defaults
spec:
  limits:
  - type: Container
    defaultRequest:
      memory: "128Mi"
      cpu: "100m"
    default:
      memory: "256Mi"
      cpu: "200m"
```

Use a ResourceQuota to prevent namespace resource exhaustion:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
```

Typical starting points vary by workload type. Web servers:
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "200m"
```

Databases:
```yaml
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
```

Batch jobs:
```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "2000m"
```

Add annotations explaining your resource choices:
```yaml
metadata:
  annotations:
    resources.note: "Based on 2-week monitoring, peak usage 180Mi/150m"
spec:
  containers:
  - name: app
    resources:
      requests:
        memory: "256Mi"
        cpu: "200m"
      limits:
        memory: "512Mi"
        cpu: "400m"
```

Pod stuck in Pending because no node has enough free capacity:

```bash
kubectl describe pod myapp
# Events show:
#   Warning  FailedScheduling  pod has unbound immediate PersistentVolumeClaims
#   Warning  FailedScheduling  0/3 nodes are available: 3 Insufficient cpu.
```

Solutions:

- Lower the Pod's CPU/memory requests to match real needs.
- Free capacity by removing or right-sizing other workloads.
- Add nodes to the cluster.
Container repeatedly OOMKilled:

```bash
kubectl get pod myapp
# STATUS: OOMKilled

kubectl describe pod myapp
# Reason: OOMKilled
# Exit Code: 137
```

Solutions:

- Raise the memory limit if the application genuinely needs more.
- Investigate a memory leak if usage grows without bound.
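The 137 exit code isn't arbitrary: by Unix convention, a process killed by signal N exits with 128 + N, and the kernel OOM killer uses SIGKILL, signal 9. A quick check:

```shell
# By Unix convention, a process killed by signal N exits with code 128 + N.
# The OOM killer sends SIGKILL, which is signal 9:
echo $((128 + 9))   # prints 137 - the exit code shown for OOMKilled containers
```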
Container CPU-throttled:

```bash
kubectl top pod myapp
# CPU usage at or near the limit

# Check events and configured limits
kubectl describe pod myapp
```

Solutions:

- Raise the CPU limit if latency matters more than density.
- Optimize the application's CPU usage.
Node under resource pressure:

```bash
kubectl describe node node-1
# Conditions:
#   MemoryPressure   True
#   DiskPressure     False
```

Solutions:

- Set limits so runaway Pods can't exhaust the node.
- Rebalance workloads across nodes or add capacity.
In episode 29, we've explored Computational Resources in Kubernetes in depth. We've learned why resource requests and limits are critical, how they affect scheduling and runtime behavior, and best practices for resource management.
Key takeaways:

- Requests are guaranteed minimums and drive scheduling; limits are enforced maximums.
- CPU overuse is throttled; memory overuse gets the container OOMKilled.
- Requests and limits determine the Pod's QoS class: Guaranteed, Burstable, or BestEffort.
- ResourceQuotas cap a namespace; LimitRanges provide per-container defaults and bounds.
- Monitor actual usage with kubectl top and adjust over time.
Proper resource management is fundamental to running stable, efficient Kubernetes clusters. By understanding and implementing resource requests and limits, you ensure predictable application behavior, efficient resource utilization, and cluster stability.
Are you getting a clearer understanding of Computational Resources in Kubernetes? Keep your learning momentum going and look forward to the next episode!