This note goes hand in hand with the noteEXPLAIN & Planner Behavior.

Where/How PostgreSQL stores the data

SHOW data_directory shows the directory that PostgreSQL stores database files, configurations, and logs.

SHOW data_directory;

      data_directory      
--------------------------
 /var/lib/postgresql/data
(1 row)
root@bf13a0080500:/var/lib/postgresql/data# ls -al
total 136
drwx------ 19 postgres postgres  4096 Dec 26 04:21 .
drwxrwxrwt  1 postgres postgres  4096 Nov  4 00:28 ..
drwx------  6 postgres postgres  4096 Dec 13 12:23 base
drwx------  2 postgres postgres  4096 Dec 26 04:22 global
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_commit_ts
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_dynshmem
-rw-------  1 postgres postgres  5743 Dec 13 12:23 pg_hba.conf
-rw-------  1 postgres postgres  2640 Dec 13 12:23 pg_ident.conf
drwx------  4 postgres postgres  4096 Dec 25 14:05 pg_logical
drwx------  4 postgres postgres  4096 Dec 13 12:23 pg_multixact
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_notify
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_replslot
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_serial
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_snapshots
drwx------  2 postgres postgres  4096 Dec 26 04:21 pg_stat
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_stat_tmp
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_subtrans
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_tblspc
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_twophase
-rw-------  1 postgres postgres     3 Dec 13 12:23 PG_VERSION
drwx------  4 postgres postgres  4096 Dec 18 08:18 pg_wal
drwx------  2 postgres postgres  4096 Dec 13 12:23 pg_xact
-rw-------  1 postgres postgres    88 Dec 13 12:23 postgresql.auto.conf
-rw-------  1 postgres postgres 30777 Dec 13 12:23 postgresql.conf
-rw-------  1 postgres postgres    36 Dec 26 04:21 postmaster.opts
-rw-------  1 postgres postgres    94 Dec 26 04:21 postmaster.pid

Inside base/, each subdirectory name is a database OID (object ID).

Each of those numbers corresponds to one database in the cluster.

SELECT oid, datname FROM pg_database;
-- OID → folder name
-- datname → human name (postgres, app_db, etc.)

  oid  |    datname    
-------+---------------
     5 | postgres
 16384 | inventory
     1 | template1
     4 | template0
(4 rows)
root@bf13a0080500:/var/lib/postgresql/data/base# ls -al
total 60
drwx------  6 postgres postgres  4096 Dec 13 12:23 .
drwx------ 19 postgres postgres  4096 Dec 26 04:21 ..
drwx------  2 postgres postgres  4096 Dec 26 04:22 1
drwx------  2 postgres postgres 36864 Dec 26 04:22 16384
drwx------  2 postgres postgres  4096 Dec 13 12:23 4
drwx------  2 postgres postgres  4096 Dec 26 04:21 5

Find the oid of the tables in the inventory database.

/.../data/base/<db_oid>/<table_oid>
SELECT
  c.oid,
  c.relname
FROM
  pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE
  c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');
  
  
  oid  |      relname
-------+-------------------
 46908 | users
 46917 | profiles
 46936 | categories
 46928 | brands
 46949 | products
 46977 | product_tags
 46970 | tags
 46999 | inventories
 46993 | warehouses
 47022 | transactions
 47429 | product_groups  

PostgreSQL stores data of a table using a heap file storage model. Each table corresponds to a heap file.

In PostgreSQL, heap file does not mean a heap data structure like the one you use in algorithms (binary heap, priority queue). The word heap here is historical database jargon.

For example, PostgreSQL stores the data of the products table in a heap file with the name 46949.

# /.../data/base/<db_oid>/<table_oid>
/.../data/base/46769/46949

Each heap file consists of blocks (or pages). Typically, a block has a size of 8KB by default. Each block contains multiple tuples (or rows).

When retrieving data from a table, PostgreSQL has to read data from the heap file and load it to the memory for filtering the rows.

B-Tree Index

B-Tree is the default index. The B stands for Balanced, and the the amount of data on both sides of the tree is roughly the same. B-Tree indexes can be used for equality and range queries efficiently. They can operate against all datatypes, and can also be used to retrieve NULL values. B-Trees are designed to work very well with caching, even when only partially cached.