Containers are temporary by design. Everything written inside a container's filesystem disappears the moment that container restarts or gets replaced. For stateless apps like a web server this is fine — they do not write anything important to disk. But for databases like MongoDB, PostgreSQL, or MySQL, data must survive outside the container. That is what Kubernetes storage is for.
Without any volume:
MongoDB Pod starts -> writes data to /data/db inside container
Pod crashes -> container deleted -> /data/db is gone
New Pod starts -> empty database -> all data lost
With persistent storage:
MongoDB Pod starts -> writes data to /data/db
/data/db is mapped to a folder on the NODE (outside the container)
Pod crashes -> container deleted -> folder on node stays intact
New Pod starts -> mounts same folder -> all data still there
The key idea: data must live outside the container, on something that survives the container's lifecycle.
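As a rough sketch of the "with persistent storage" flow above, the Pod below maps /data/db to a folder on the node using a hostPath volume. The Pod name, the image tag, and the node folder /data/mongo are assumptions made for illustration, and hostPath is only suitable for local testing.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mongo-demo                # hypothetical name
spec:
  containers:
    - name: mongo
      image: mongo:7              # assumed image tag
      volumeMounts:
        - name: db-data
          mountPath: /data/db     # where MongoDB writes its data files
  volumes:
    - name: db-data
      hostPath:
        path: /data/mongo         # hypothetical folder on the node
        type: DirectoryOrCreate   # create the folder if it doesn't exist
```

Note that the data lives on one specific node here: a replacement Pod only finds it again if it is scheduled onto that same node.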
Level 1 — Inside the Pod (emptyDir)
Tied to the Pod's lifecycle
Pod deleted = volume deleted = data gone
Mainly useful for scratch space or for sharing data between containers in the same Pod
Level 2 — Outside the Pod, inside the Cluster (PersistentVolume)
Exists independently of any Pod
Pod dies, restarts, gets replaced — storage stays
Data survives as long as the underlying storage does (for a hostPath volume, that means the node's disk)
Level 3 — Outside the Cluster (Cloud Storage)
AWS EBS, GCP Persistent Disk, Azure Disk, EFS
Survives even if the entire cluster is destroyed
What real production systems use
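As one possible illustration of Level 3, a PersistentVolume can point at an existing cloud disk through a CSI driver. The sketch below assumes the AWS EBS CSI driver is installed in the cluster. The PV name and the volume ID are placeholders, not values from a real setup.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv                              # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain     # keep the disk when the claim is deleted
  csi:
    driver: ebs.csi.aws.com                 # assumes the AWS EBS CSI driver is installed
    volumeHandle: vol-0123456789abcdef0     # placeholder EBS volume ID
    fsType: ext4
```

Because the disk is an AWS resource rather than a folder on a node, it can outlive the cluster. A real setup would typically also restrict the volume to the availability zone the disk lives in.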
emptyDir is a temporary volume that lives as long as the Pod lives. It is created empty when the Pod starts and deleted when the Pod is deleted — but it survives container restarts inside the same Pod.
spec:
  volumes:
    - name: shared-data
      emptyDir: {}          # created empty, lives with the Pod
  containers:
    - name: my-app
      volumeMounts:
        - name: shared-data
          mountPath: /data  # container writes here
When is emptyDir useful?
Two containers in the same Pod need to share files
Example: a web server container + a log shipper container
both mount the same emptyDir at /logs
web server writes logs, shipper reads and sends them
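The web server plus log shipper pattern above could look roughly like this. Both containers use a busybox image as a stand-in (an assumption, so the example stays self-contained): the "web" container just appends a timestamp to a log file, and the "shipper" container tails that file instead of sending it anywhere.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-shipper          # hypothetical name
spec:
  volumes:
    - name: logs
      emptyDir: {}                # shared scratch volume, deleted with the Pod
  containers:
    - name: web                   # stand-in for a real web server
      image: busybox:1.36         # assumed image
      command: ["sh", "-c", "while true; do date >> /logs/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /logs        # writer side
    - name: shipper               # stand-in for a real log shipper
      image: busybox:1.36         # assumed image
      command: ["sh", "-c", "until [ -f /logs/app.log ]; do sleep 1; done; tail -f /logs/app.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs        # reader side, same volume
          readOnly: true
```

Mounting the volume read-only in the shipper is a design choice, not a requirement: it makes it explicit that only the web container writes to /logs.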
When emptyDir is NOT enough:
Pod is deleted -> emptyDir gone -> data lost
You need storage that outlives the Pod entirely
A PersistentVolume is a piece of storage provisioned in the cluster that exists completely independently of any Pod. When a Pod is deleted, the PV stays. When a new Pod starts, it can mount the same PV and get all the data back.
Think of a PV as a storage unit that you have reserved — it sits there waiting to be used, and it keeps everything inside it safe regardless of what happens to the Pods.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi                         # how much storage this PV offers
  accessModes:
    - ReadWriteOnce                       # how it can be mounted (see table below)
  persistentVolumeReclaimPolicy: Retain   # what to do with data after PVC is deleted
  storageClassName: standard              # must match the PVC that claims this PV
  hostPath:
    path: /data/my-pv                     # folder on the node (local/testing only)
    type: DirectoryOrCreate               # create folder if it doesn't exist
| Mode | Short Name | Meaning | Use For |
|---|---|---|---|
| ReadWriteOnce | RWO | One node can read and write | Databases (single instance) |
| ReadOnlyMany | ROX | Many nodes can read, none can write | Shared config files |
| ReadWriteMany | RWX | Many nodes can read and write | Shared file storage (needs a backend that supports it, such as EFS or NFS) |
Most databases use ReadWriteOnce — only one node should be writing to the database at a time.
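To show how the access mode and storage class come together, here is a minimal sketch of a PersistentVolumeClaim that could bind to the my-pv volume defined earlier, followed by a MongoDB Pod that mounts it. The claim name, Pod name, and image tag are assumptions for illustration. The claim's accessModes, storageClassName, and requested size are chosen so that they match the PV.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc                    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce               # must be offered by the PV
  storageClassName: standard      # must match the PV's storageClassName
  resources:
    requests:
      storage: 10Gi               # must not exceed the PV's capacity
---
apiVersion: v1
kind: Pod
metadata:
  name: mongo                     # hypothetical name
spec:
  containers:
    - name: mongo
      image: mongo:7              # assumed image tag
      volumeMounts:
        - name: db-data
          mountPath: /data/db     # MongoDB's data directory
  volumes:
    - name: db-data
      persistentVolumeClaim:
        claimName: my-pvc         # the Pod refers to the claim, not to the PV directly
```

If this Pod is deleted and recreated with the same claimName, it mounts the same volume again and finds the existing data.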