6.8: Preparing for Horizontal Pod Autoscaler (HPA)

https://drive.google.com/file/d/1Ky19-cY_NotsOW8j36pcgWtcfPOQA9qm/view?usp=sharing

🔍 YAML Breakdown

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-autoscaler
spec:
  selector:
    matchLabels:
      run: k8s-autoscaler
  replicas: 2
  template:
    metadata:
      labels:
        run: k8s-autoscaler
    spec:
      containers:
      - name: k8s-autoscaler
        image: lovelearnlinux/webserver:v1
        ports:
        - containerPort: 80
        resources:
          requests:            # ← REQUIRED for HPA
            cpu: "200m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: k8s-autoscaler
  labels:
    run: k8s-autoscaler
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    run: k8s-autoscaler

🔑 Key Insight:

A CPU-utilization HPA requires resources.requests.cpu on every container in the Pod; without it, the HPA cannot compute utilization and will not scale on that metric.

🎯 Why?

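HPA calculates % utilization as:
(Current CPU Usage) / (CPU Request) × 100, averaged across the Pods of the target Deployment
If no request is set → there is nothing to divide by → the HPA reports <unknown> and takes no scaling action for that metric.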
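
To make the arithmetic concrete, here is a minimal shell sketch of the per-Pod calculation. The function name `cpu_utilization_pct` is hypothetical, the inputs are plain millicore integers (150 for `150m`), and integer division is used, so this only illustrates the formula and is not what the HPA controller itself runs.

```shell
# Hypothetical helper: CPU utilization as a percentage of the CPU request.
# $1 = current usage in millicores, $2 = CPU request in millicores.
cpu_utilization_pct() {
  if [ "$2" -le 0 ]; then
    # No request set: there is nothing to divide by, so report "unknown".
    echo "unknown"
    return 1
  fi
  echo $(( $1 * 100 / $2 ))
}

cpu_utilization_pct 150 200   # prints 75
```

With the Deployment above (request `200m`), a Pod using `100m` would sit exactly at a 50% target.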

📌 How HPA Works (CPU-Based)

| Metric | Source / Formula | Example Value |
|---|---|---|
| Target CPU Utilization | Configured in HPA | e.g., 50% |
| Current CPU Usage | Measured by Metrics Server | e.g., 150m |
| CPU Request | From Pod spec | `200m` |
| Current Utilization | Current CPU Usage / CPU Request | `150m / 200m = 75%` (> 50% → scale up!) |

📈 Scaling Logic:

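If average CPU utilization > target → add replicas (up to the HPA maximum)

If average CPU utilization < target → remove replicas (down to the HPA minimum, and only after the scale-down stabilization window, 5 minutes by default)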
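
The Kubernetes documentation gives the replica count as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below applies that formula with integer arithmetic and clamps the result to the min/max range. The function name `desired_replicas` is hypothetical, and the controller's tolerance band and stabilization windows are deliberately left out.

```shell
# Hypothetical helper: replica count suggested by the HPA formula
#   desired = ceil(current_replicas * current_utilization / target_utilization)
# clamped to [min, max]. Tolerance and stabilization windows are ignored.
# $1 = current replicas, $2 = current utilization %, $3 = target utilization %,
# $4 = min replicas, $5 = max replicas
desired_replicas() {
  # Integer ceiling division: (a + b - 1) / b
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$want" -lt "$4" ]; then want=$4; fi
  if [ "$want" -gt "$5" ]; then want=$5; fi
  echo "$want"
}

desired_replicas 2 75 50 2 5   # prints 3
```

For the running example (2 replicas at 75% against a 50% target), the formula suggests 3 replicas; a very high load is capped at the configured maximum.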

🧪 k3s Lab: Deploy + Autoscale

✅ Prerequisite: Metrics Server must be installed in your k3s cluster.

🔧 Step 0: Verify Metrics Server

# k3s bundles Metrics Server by default; install it only if it was disabled
# (e.g. k3s was started with --disable metrics-server)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Verify
kubectl get pods -n kube-system | grep metrics-server
kubectl top nodes
kubectl top pods  # Should show CPU/memory
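
If `kubectl top` returns an error right after installation, the metrics API may simply not be registered yet. A small sketch for checking this, assuming Metrics Server registers the `v1beta1.metrics.k8s.io` APIService as it does in a standard install; the function name `metrics_api_available` is hypothetical.

```shell
# Hypothetical helper: prints "True" once the metrics API is being served.
metrics_api_available() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Usage (against a running cluster):
#   metrics_api_available
```

Anything other than `True` suggests waiting a little longer or inspecting the metrics-server Pod logs before creating the HPA.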

🔧 Step 1: Deploy the Application

# Apply Deployment + Service
kubectl apply -f deployment-for-autoscaler.yaml

# Verify
kubectl get pods -l run=k8s-autoscaler
kubectl top pods  # Should show CPU usage
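
Since the HPA depends on the CPU request, it can be worth confirming that the request actually reached the cluster. A minimal sketch, where the function name `cpu_requests` is hypothetical and the Deployment is assumed to be in the current namespace:

```shell
# Hypothetical helper: print the CPU request of each container in a Deployment.
# $1 = Deployment name
cpu_requests() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}'
}

# Usage (against a running cluster):
#   cpu_requests k8s-autoscaler
```

Empty output would mean no CPU request is set, in which case a CPU-utilization HPA has nothing to measure against.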

🔧 Step 2: Create HPA (CPU-Based)

💡 You’ll need hpa-for-autoscaler-deployment.yaml next, but let’s create a basic HPA now:

# Create HPA: target 50% CPU, min 2, max 5 replicas
kubectl autoscale deployment k8s-autoscaler \
  --cpu-percent=50 \
  --min=2 \
  --max=5

# Verify HPA
kubectl get hpa
# NAME             REFERENCE                   TARGETS   MINPODS   MAXPODS   REPLICAS
# k8s-autoscaler   Deployment/k8s-autoscaler   0%/50%    2         5         2

🔧 Step 3: Generate CPU Load