https://drive.google.com/file/d/16PZAh-epESLHXVZ7VUU0rywuNOes5S-l/view?usp=sharing

🔍 YAML Breakdown

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment-hpa
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 30    # ← Wait 30s before scaling down
      policies:
      - type: Pods
        value: 1                        # ← Remove max 1 Pod every 30s
        periodSeconds: 30
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10          # ← Scale when CPU > 10% of request
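
To make the 10% target concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The function name and the 25% reading are illustrative assumptions, and the sketch ignores the controller's tolerance band and the behavior policies.

```shell
# Sketch of the documented HPA rule (not the controller's actual code):
#   desired = ceil(current_replicas * current_util / target_util)
# clamped to [min_replicas, max_replicas]. Tolerance and behavior
# policies are deliberately left out.
desired_replicas() {
  current=$1; current_util=$2; target_util=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * current_util + target_util - 1) / target_util ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Hypothetical reading: 3 replicas averaging 25% CPU against the 10% target
desired_replicas 3 25 10 3 10   # ceil(7.5) = 8
```

With a target this low, even modest traffic pushes the result to maxReplicas, which is convenient for a lab but unusual in production.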

🔑 Key Insight:

The behavior section lets you control the speed and stability of scaling operations.


📌 Scaling Policies Explained

1. scaleDown Configuration

| Field | Meaning | Your Value | Effect |
|---|---|---|---|
| stabilizationWindowSeconds: 30 | Wait before scaling down | 30s | Prevents flapping during brief load drops |
| policies[0].type: Pods | Scale by absolute Pod count | Pods | Remove 1 Pod per interval |
| value: 1 | Max Pods to remove | 1 | Slow, controlled scale-down |
| periodSeconds: 30 | Interval between scale actions | 30s | One Pod removed every 30 seconds |

🎯 Scale-Down Flow:
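
As a rough illustration of this flow, the sketch below prints an idealized timeline for the scaleDown settings in the manifest. It assumes the Deployment sits at maxReplicas (10) when load stops at t=0, and it ignores metrics lag and the controller's sync interval, so real timings will drift. The function name is hypothetical.

```shell
# Idealized scale-down timeline: once the stabilization window has passed,
# remove `step` Pods per `period` until `min` replicas remain.
scale_down_timeline() {
  replicas=$1; min=$2; step=$3; period=$4; window=$5
  t=$window
  while [ "$replicas" -gt "$min" ]; do
    replicas=$(( replicas - step ))
    if [ "$replicas" -lt "$min" ]; then replicas=$min; fi
    echo "t=${t}s replicas=${replicas}"
    t=$(( t + period ))
  done
}

# Values from the manifest: 10 -> 3 replicas, 1 Pod per 30s, 30s window
scale_down_timeline 10 3 1 30 30
```

Under these assumptions the last line printed is `t=210s replicas=3`: seven removals, one every 30 seconds, starting once the 30-second window has elapsed.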

2. Missing scaleUp?

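💡 Your YAML only defines scaleDown → scaleUp uses the default behavior: no stabilization window, and the HPA may add up to 4 Pods or 100% of the current replicas (whichever is larger) every 15 seconds.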

✅ To control scale-up, add:

behavior:
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100    # ← Allow at most doubling the replica count every 60s
      periodSeconds: 60
  scaleDown:
    ...
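
To see what this Percent policy permits, the sketch below walks the replica ceiling period by period under sustained load. It models only the policy limit and the maxReplicas cap, not the metric-driven target. The function name is a hypothetical helper.

```shell
# Upper bound allowed by a single Percent policy, one step per period:
#   next = replicas + ceil(replicas * percent / 100), capped at max.
scale_up_percent() {
  replicas=$1; max=$2; percent=$3
  [ "$percent" -gt 0 ] || return 1   # a 0% policy would never progress
  out=$replicas
  while [ "$replicas" -lt "$max" ]; do
    replicas=$(( replicas + (replicas * percent + 99) / 100 ))
    if [ "$replicas" -gt "$max" ]; then replicas=$max; fi
    out="$out $replicas"
  done
  echo "$out"
}

# minReplicas 3, maxReplicas 10, 100% per period (one entry per 60s)
scale_up_percent 3 10 100
```

With the values above the sequence is `3 6 10`, so reaching maxReplicas takes two 60-second periods at the earliest.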


🧪 k3s Lab: Observe Scaling Policies in Action

🔧 Step 1: Deploy the Full Stack

# Save as deployment-hpa-with-policies.yaml
kubectl apply -f deployment-hpa-with-policies.yaml

# Verify
kubectl get hpa nginx-deployment-hpa
kubectl get pods -l app=nginx

🔧 Step 2: Generate Load (Trigger Scale-Up)

# Start aggressive load (CPU > 10% of 80m = 8m)
kubectl run loader --image=busybox --restart=Never -it -- sh

# Inside shell:
while true; do wget -q -O- http://nginx-deployment-svc; done
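
The 8m threshold in the comment above is 10% of an 80m CPU request. The request value is taken from that comment, since the Deployment manifest is not shown here. As a small sketch, the helper below turns per-Pod CPU readings into the average utilization percentage that the HPA compares with averageUtilization. The function name and the sample readings are illustrative assumptions.

```shell
# Average CPU utilization as a percentage of the request.
# All values are in millicores; every Pod is assumed to share one request.
avg_utilization() {
  request_m=$1; shift
  total=0; count=0
  for usage_m in "$@"; do
    total=$(( total + usage_m ))
    count=$(( count + 1 ))
  done
  [ "$count" -gt 0 ] || return 1     # need at least one reading
  # Total usage over total requested, as an integer percentage (rounded down)
  echo $(( total * 100 / (request_m * count) ))
}

# Hypothetical `kubectl top pods` readings of 12m, 20m and 16m
avg_utilization 80 12 20 16
```

Those sample readings average 20%, twice the 10% target, so the HPA would want roughly double the current replica count.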

🔍 Watch scale-up (should be fast, since no scaleUp policies are defined):

kubectl get hpa -w
# Replicas jump from 3 → 10 quickly
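
Why the jump is quick can be sketched from the documented default scaleUp behavior: every 15 seconds the HPA may add the larger of 4 Pods or 100% of the current replicas. The helper below (a hypothetical name) models only that ceiling and the maxReplicas cap. Actual steps also depend on metrics lag and the controller's sync interval.

```shell
# Per-period ceiling under the default scaleUp behavior: the larger of
# "+4 Pods" and "+100%", capped at maxReplicas.
default_scale_up() {
  replicas=$1; max=$2
  out=$replicas
  while [ "$replicas" -lt "$max" ]; do
    by_pods=$(( replicas + 4 ))
    by_percent=$(( replicas * 2 ))
    if [ "$by_pods" -gt "$by_percent" ]; then next=$by_pods; else next=$by_percent; fi
    if [ "$next" -gt "$max" ]; then next=$max; fi
    replicas=$next
    out="$out $replicas"
  done
  echo "$out"
}

# minReplicas 3, maxReplicas 10 (one entry per 15s period)
default_scale_up 3 10
```

Starting from 3 replicas the ceiling goes `3 7 10`, so the Deployment can reach maxReplicas within about two 15-second periods.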

🔧 Step 3: Stop Load (Trigger Scale-Down)