Storage — Volumes, PV, PVC, StorageClasses

Containers are temporary by design. Everything written inside a container's filesystem disappears the moment that container restarts or gets replaced. For stateless apps like a web server this is fine — they do not write anything important to disk. But for databases like MongoDB, PostgreSQL, or MySQL, data must survive outside the container. That is what Kubernetes storage is for.

Why This Problem Exists

Without any volume:

  MongoDB Pod starts -> writes data to /data/db inside container
  Pod crashes -> container deleted -> /data/db is gone
  New Pod starts -> empty database -> all data lost

With persistent storage:

  MongoDB Pod starts -> writes data to /data/db
  /data/db is mapped to storage outside the container (here, a folder on the NODE)
  Pod crashes -> container deleted -> folder on node stays intact
  New Pod starts on the same node -> mounts same folder -> all data still there

The key idea: data must live outside the container, on something that survives the container's lifecycle.
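
As a rough illustration of the "with persistent storage" flow above, the following sketch mounts a node folder into a MongoDB container using a hostPath volume. The Pod name, the image tag, and the node path /mnt/mongo-data are assumptions chosen for the example, and the data only reappears if the replacement Pod is scheduled onto the same node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mongo-demo                  # hypothetical Pod name
spec:
  containers:
  - name: mongo
    image: mongo:7                  # assumed image tag, adjust to your environment
    volumeMounts:
    - name: db-data
      mountPath: /data/db           # where MongoDB writes its data files
  volumes:
  - name: db-data
    hostPath:
      path: /mnt/mongo-data         # hypothetical folder on the node
      type: DirectoryOrCreate       # create the folder if it doesn't exist
```

This is suitable for local testing only. The sections below cover storage that does not depend on a single node.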

Three Levels of Storage

Level 1 — Inside the Pod (emptyDir)
  Tied to the Pod's lifecycle
  Pod deleted = volume deleted = data gone
  Useful for scratch space, or for sharing data between containers in the same Pod

Level 2 — Outside the Pod, inside the Cluster (PersistentVolume)
  Exists independently of any Pod
  Pod dies, restarts, gets replaced — storage stays
  Data survives as long as the cluster (and the node or disk backing the volume) exists

Level 3 — Outside the Cluster (Cloud Storage)
  AWS EBS, GCP Persistent Disk, Azure Disk, EFS
  Survives even if the entire cluster is destroyed
  What real production systems use
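
To make Level 3 more concrete, here is a sketch of a PersistentVolume backed by an existing AWS EBS volume. It assumes the AWS EBS CSI driver is installed in the cluster; the PV name and the volume ID are hypothetical placeholders, and the EBS volume would need to be in the same availability zone as the node that mounts it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv                          # hypothetical PV name
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce                       # an EBS volume attaches to one node at a time
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""                  # empty string: statically provisioned, no class
  csi:
    driver: ebs.csi.aws.com             # assumes the AWS EBS CSI driver is installed
    fsType: ext4
    volumeHandle: vol-0123456789abcdef0 # hypothetical EBS volume ID
```

Because the disk lives in the cloud account rather than on a node, it is still there if the cluster is deleted and rebuilt.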

emptyDir — The Simplest Volume

emptyDir is a temporary volume that lives as long as the Pod lives. It is created empty when the Pod starts and deleted when the Pod is deleted — but it survives container restarts inside the same Pod.

spec:
  volumes:
  - name: shared-data
    emptyDir: {}              # created empty, lives with the Pod

  containers:
  - name: my-app
    volumeMounts:
    - name: shared-data
      mountPath: /data        # container writes here

When is emptyDir useful?
  Two containers in the same Pod need to share files
  Example: a web server container + a log shipper container
           both mount the same emptyDir at /logs
           web server writes logs, shipper reads and sends them
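
The web server plus log shipper pattern above could look roughly like the following. This is a sketch only: the Pod name is hypothetical, and both containers use busybox shell loops as stand-ins for a real web server and a real log shipper.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-shipper            # hypothetical Pod name
spec:
  volumes:
  - name: logs
    emptyDir: {}                    # shared by both containers, deleted with the Pod
  containers:
  - name: web                       # stand-in for a web server: appends a line every 5s
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /logs/access.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /logs
  - name: shipper                   # stand-in for a log shipper: waits for the file, then follows it
    image: busybox:1.36
    command: ["sh", "-c", "until [ -f /logs/access.log ]; do sleep 1; done; tail -f /logs/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /logs
      readOnly: true                # the shipper only needs to read
```

Both containers see the same files because they mount the same volume name, even though each has its own container filesystem.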

When emptyDir is NOT enough:
  Pod is deleted -> emptyDir gone -> data lost
  You need storage that outlives the Pod entirely

PersistentVolume (PV) — The Actual Storage

A PersistentVolume is a piece of storage provisioned in the cluster that exists independently of any Pod. When a Pod is deleted, the PV stays. When a new Pod starts, it can mount the same PV (through a PersistentVolumeClaim) and get all the data back.

Think of a PV as a storage unit that you have reserved — it sits there waiting to be used, and it keeps everything inside it safe regardless of what happens to the Pods.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi                         # how much storage this PV offers
  accessModes:
  - ReadWriteOnce                         # how it can be mounted (see table below)
  persistentVolumeReclaimPolicy: Retain   # what to do with data after PVC is deleted
  storageClassName: standard              # must match the PVC that claims this PV
  hostPath:
    path: /data/my-pv                     # folder on the node (local/testing only)
    type: DirectoryOrCreate               # create folder if it doesn't exist
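
For context on the "must match the PVC" comment above, a claim that could bind to my-pv might look like the sketch below. The claim name my-pvc is hypothetical. Binding requires the same storageClassName, a compatible access mode, and a requested size no larger than the PV's capacity.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc                      # hypothetical claim name
spec:
  storageClassName: standard        # same class as my-pv
  accessModes:
  - ReadWriteOnce                   # must be offered by the PV
  resources:
    requests:
      storage: 10Gi                 # must not exceed the PV's capacity
```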

Access Modes — How the Volume Can Be Mounted

Mode            Short Name   Meaning                               Use For
ReadWriteOnce   RWO          One node can read and write           Databases (single instance)
ReadOnlyMany    ROX          Many nodes can read, none can write   Shared config files
ReadWriteMany   RWX          Many nodes can read and write         Shared file storage (needs EFS or NFS)

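Most databases use ReadWriteOnce, because only one database instance should be writing to the data files at a time. Note that RWO restricts the volume to one node, not one Pod: several Pods on that node can still mount it.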
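
For comparison, a ReadWriteMany volume needs a backend that supports shared access. The sketch below declares an NFS-backed PV. The PV name, the class name, the server hostname, and the export path are all hypothetical, and an NFS server would have to exist and be reachable from every node.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv               # hypothetical PV name
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany                   # many nodes can mount read-write
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-shared      # hypothetical class name
  nfs:
    server: nfs.example.internal    # hypothetical NFS server hostname
    path: /exports/shared           # hypothetical exported directory
```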

PV Status