Kubernetes Scaling

Scaling means increasing or decreasing the number of Pods running your app.

Scale up = more Pods = handles more traffic
Scale down = fewer Pods = saves resources

Deployment YAML (Starting Point)

Before you can scale anything, you need a Deployment. The name you give it in metadata.name is the name you pass to the scale command.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app        # this name is what you reference in kubectl scale
spec:
  replicas: 1           # starts with 1 pod
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
      - name: node-app
        image: node-app:latest
        ports:
        - containerPort: 3000
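
The manifest has to be applied to a cluster before there is anything to scale. A minimal sketch, assuming the YAML above is saved as deployment.yaml (a filename picked here for illustration) and kubectl already points at your cluster. deploy_and_show is a hypothetical helper, not a built-in kubectl command:

```shell
# deploy_and_show MANIFEST NAME
# Hypothetical helper: applies a manifest, then prints the Deployment's
# desired replica count (spec.replicas).
deploy_and_show() {
  manifest="$1"   # path to the YAML file (deployment.yaml is an assumed name)
  name="$2"       # must match metadata.name in the manifest
  kubectl apply -f "$manifest" &&
    kubectl get deployment "$name" -o jsonpath='{.spec.replicas}'
}
```

Usage: deploy_and_show deployment.yaml node-app. It should show kubectl's apply message followed by the replica count set in the manifest.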

The Scale Command

kubectl scale deployment node-app --replicas=4

This tells Kubernetes to run 4 Pods instead of 1. Here node-app is the Deployment name, that is, whatever you set in metadata.name in your YAML.

Both styles below are valid and do the exact same thing. The slash is just a more explicit way of writing the resource type and name together. Use whichever you prefer, just stay consistent.

kubectl scale deployment node-app --replicas=4    # space style
kubectl scale deployment/node-app --replicas=4    # slash style
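
The scale command returns as soon as the new replica count is accepted, not when the Pods are ready. One possible way to scale and then wait is to follow it with kubectl rollout status. This is a sketch only: scale_and_wait is a hypothetical helper name, and the 60s default timeout is an arbitrary assumption:

```shell
# scale_and_wait NAME REPLICAS [TIMEOUT]
# Hypothetical helper: sets the replica count, then blocks until the
# Deployment reports the rollout as complete or the timeout expires.
scale_and_wait() {
  name="$1"             # Deployment name (metadata.name)
  replicas="$2"         # desired number of Pods
  timeout="${3:-60s}"   # assumed default; adjust to your app's startup time
  kubectl scale "deployment/$name" --replicas="$replicas" &&
    kubectl rollout status "deployment/$name" --timeout="$timeout"
}
```

Usage: scale_and_wait node-app 4 scales up, and scale_and_wait node-app 1 scales back down with the same helper.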

What Happens

Before scaling:

node-app-795677cc99-pzr2z   1/1   Running   (original)

After --replicas=4:

node-app-795677cc99-pzr2z   1/1   Running   (original)
node-app-795677cc99-abc1x   1/1   Running   (new)
node-app-795677cc99-def2y   1/1   Running   (new)
node-app-795677cc99-ghi3z   1/1   Running   (new)

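Kubernetes starts 3 more Pods automatically. The new Pods share the original's ReplicaSet hash (795677cc99) because the Pod template did not change. If a Service selects these Pods (for example via the app: node-app label), it spreads traffic across all 4 once they are ready.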
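
To confirm that the scale-up actually finished, you can compare the number of ready Pods with the number you asked for. A rough sketch, assuming the Deployment is in your current namespace. ready_replicas and is_fully_scaled are hypothetical helper names:

```shell
# ready_replicas NAME
# Prints how many Pods of the Deployment are currently ready.
# The field is omitted by the API when no Pods are ready, so output may be empty.
ready_replicas() {
  kubectl get deployment "$1" -o jsonpath='{.status.readyReplicas}'
}

# is_fully_scaled NAME WANT
# Succeeds (exit 0) only if the ready count equals WANT.
is_fully_scaled() {
  name="$1"
  want="$2"
  have="$(ready_replicas "$name")"
  [ "${have:-0}" -eq "$want" ]   # treat an empty result as 0 ready Pods
}
```

Usage: is_fully_scaled node-app 4 && echo "all 4 Pods ready". To list the individual Pods as in the output above, kubectl get pods -l app=node-app filters by the label from the manifest.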