BigQuery Performance

Four Key Elements of Work

I/O: how many bytes are read?
Shuffle: how many bytes are passed to the next stage?
- Grouping: how many bytes are passed to each group?
Materialization: how many bytes are written to storage?
CPU work: how expensive are the UDFs and built-in functions?

Avoid I/O Waste

Don’t SELECT * → select only needed columns.
Push down filters early with WHERE.
Avoid ORDER BY without LIMIT → the final sort has to run on a single worker.
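
A rough way to see why SELECT * is wasteful: BigQuery stores tables by column, so a query reads only the columns it references. The sketch below is a toy model, not BigQuery's actual accounting; the column names and byte sizes are made-up illustration values.

```python
# Toy model of columnar storage: each column is stored separately,
# so a query reads only the columns it references.
# Column names and byte sizes are hypothetical illustration values.
COLUMN_BYTES = {
    "user_id": 8_000_000,
    "event_time": 8_000_000,
    "country": 2_000_000,
    "payload": 500_000_000,  # one wide string column dominates the table
}


def bytes_scanned(selected_columns):
    """Return the bytes read when only `selected_columns` are referenced.

    Passing ["*"] models SELECT *, which touches every column.
    """
    if selected_columns == ["*"]:
        return sum(COLUMN_BYTES.values())
    return sum(COLUMN_BYTES[name] for name in selected_columns)


full_scan = bytes_scanned(["*"])
narrow_scan = bytes_scanned(["user_id", "country"])
print(f"SELECT *                : {full_scan:,} bytes")
print(f"SELECT user_id, country : {narrow_scan:,} bytes")
```

In this toy table, dropping the unused wide column cuts the scan from 518 MB to 10 MB.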

Prevent Data Hotspots

Shuffle wisely

Filter early to avoid overloading workers on JOIN.
Use the Query Explanation map → compare Max vs Avg time per stage; a large gap between them signals skew.
BigQuery automatically reshuffles data when workers are overloaded.
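
The Max vs Avg comparison can be sketched as a simple ratio. This is a minimal illustration, assuming you have copied per-worker times for one stage out of the query explanation by hand; the function names and the 2.0 threshold are arbitrary choices for this example, not BigQuery settings.

```python
from statistics import mean


def skew_ratio(worker_times_ms):
    """Return max / average time across the workers of one stage.

    A ratio near 1.0 means the work is evenly spread; a large ratio
    means one worker is doing far more than the others.
    """
    return max(worker_times_ms) / mean(worker_times_ms)


def is_skewed(worker_times_ms, threshold=2.0):
    # threshold is a hypothetical cut-off chosen for illustration only
    return skew_ratio(worker_times_ms) >= threshold


balanced = [100, 110, 95, 105]  # every worker takes about the same time
hot = [100, 110, 95, 900]       # one worker received most of the data

print(round(skew_ratio(balanced), 2), is_skewed(balanced))
print(round(skew_ratio(hot), 2), is_skewed(hot))
```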

GROUP BY

Best when the number of distinct groups is small.
Bad: grouping by high-cardinality unique IDs.
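
To make the cardinality point concrete, the sketch below counts how many distinct groups a key produces on a tiny in-memory table. The rows and column names (order_id, country) are hypothetical; the idea is only that a low-cardinality key collapses many rows into few groups, while a unique ID collapses nothing.

```python
from collections import Counter


def group_stats(rows, key):
    """Report row count, distinct group count, and rows per group for `key`."""
    counts = Counter(row[key] for row in rows)
    return {
        "rows": len(rows),
        "distinct_groups": len(counts),
        "rows_per_group": len(rows) / len(counts),
    }


# Hypothetical sample data: 9 orders spread over 3 countries.
rows = [
    {"order_id": i, "country": ["KR", "US", "JP"][i % 3]}
    for i in range(9)
]

print(group_stats(rows, "country"))   # few groups, each absorbs several rows
print(group_stats(rows, "order_id"))  # one group per row, nothing is reduced
```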

Joins & Unions

Know join key uniqueness → avoid accidental cross joins.
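
Why key uniqueness matters: in an inner join, each key contributes (left copies × right copies) output rows, so duplicates on both sides multiply. The sketch below estimates the output size from the key lists alone; the function names are made up for this example.

```python
from collections import Counter


def has_duplicates(keys):
    """Return True if any join key value appears more than once."""
    return len(set(keys)) != len(keys)


def join_output_rows(left_keys, right_keys):
    """Return the row count of an inner join on the given key lists.

    Each key value yields (copies on the left) * (copies on the right) rows.
    """
    left = Counter(left_keys)
    right = Counter(right_keys)
    return sum(n * right[key] for key, n in left.items())


# Unique keys on the left: output is bounded by the right side's row count.
print(join_output_rows([1, 2, 3], [1, 1, 2]))

# The same key repeated on both sides behaves like a cross join for that key.
print(has_duplicates(["a"] * 3), join_output_rows(["a"] * 3, ["a"] * 4))
```

Running a check like has_duplicates on both sides before joining is a cheap way to catch the accidental cross-join case.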