Character Types

TODO: continue reading (Start from “Character Encoding in PostgreSQL") https://www.tigerdata.com/learn/what-characters-are-allowed-in-postgresql-strings

CHAR(n)

A fixed-length and blank-padded character type. Aka CHARACTER(n).

It also doesn’t store tailing space (heading space will still be stored), the tailing space will be trimmed.

Trim tailing space → Pad ASCII blanks until n charactors → Store

It can be used for: 1-char flags, country(ISO-3166 alpha-2)/region/language(ISO-639)/currency(ISO-4217) code

# The column_name will always **store precisely n characters (not bytes)**,
# even when the insertion charactors is less than n.
column_name CHAR(n)

# Without length specifier, it is the same as CHAR(1)
column_name CHAR

Why it is not common?

PostgreSQL has had no practical performance benefit for CHAR since PostgreSQL 7.x (early 2000s). The blank padding and tail trimming can be confusing.

LENGTH(), OCTET_LENGTH()

CREATE TABLE products (
  upc CHAR(12) NOT NULL
);

INSERT INTO
  products (upc)
VALUES
  ('194252697559'),
  ('194252714');

SELECT
  upc,
  LENGTH (upc),
  OCTET_LENGTH(upc)
FROM
  products;
  
  
     upc      | LENGTH | OCTET_LENGTH
--------------+--------+-------------
 194252697559 |   12   |     12
 194252714    |   9    |     12
 
 
 
# If the save non ASCII character, the behavour is tricky.
INSERT INTO products (upc) VALUES ('囧');

 LENGTH | OCTET_LENGTH
--------+-------------
   1    |     15

# The padding producure is by charactors, not bytes
# 1. “囧” is **a charactor with 3 bytes** in UTF-8
# 2. Padding adds **11 ASCII charactors** → 11 bytes
# 3. Total byte length = 3 + 11 = 15 bytes

CHECK Constraint

Since the type is blank-padded, so it can’t ensure the the character length.

Instead, use CHECK constraint (CHECK Constraint ) to limit.

upc CHAR(12) NOT NULL CHECK (length (upc) = 12)

VARCHAR(n)

Storing variable-length strings. Aka CHARACTER VARYING(n).