Typical constraints: Data type, Range, Mandatory/NOT NULL, Unique, Set membership / FK, Regex, Cross-field checks.
Think: phone number format, age 0–120, required customer_id
, unique license_plate
, state
∈ {US states/territories}.
Quick SQL:
-- range + required
WHERE age BETWEEN 0 AND 120 AND age IS NOT NULL
-- set membership (lookup table)
WHERE state IN (SELECT code FROM ref.us_states)
-- regex (format)
WHERE REGEXP_CONTAINS(phone, r'^\\(\\d{3}\\) \\d{3}-\\d{4}$')
You can’t “invent” missing facts later. Use explicit Unknown/Missing codes when allowed.
Profile for NULLs & blanks:
SELECT
COUNTIF(col IS NULL) AS nulls,
COUNTIF(TRIM(col) = '') AS blanks
FROM t;
CAST/SAFE_CAST
, FORMAT
, clear unit labels.Real-world keys: Phone → phone number, Email → address, Car → license plate.
Dedupe pattern (최신 1건만):
SELECT * EXCEPT(rn)
FROM (
SELECT *, ROW_NUMBER() OVER(
PARTITION BY customer_id ORDER BY updated_at DESC
) rn
FROM `proj.ds.customers`
)
WHERE rn = 1;