Practice makes understanding faster: ‣

`JOIN`, `INNER JOIN`

Source: https://www.pgtutorial.com/postgresql-tutorial/postgresql-inner-join/

JOIN brands ON brands.id = products.brand_id

If some brands don’t have any corresponding products, or some products reference a non-existent brand, those rows will not appear in the result set. This is useful for Technique: Filter Out Invalid Data.

Source: Source: https://www.pgtutorial.com/postgresql-tutorial/postgresql-inner-join/

For multiple INNER JOINs, if one mismatch, the row is dropped.

FROM track
JOIN genre  ON track.genre_id  = genre.id
JOIN album  ON track.album_id  = album.id
JOIN artist ON album.artist_id = artist.id;

track row
   ├── matches genre? ── no → drop row
   ├── matches album? ── no → drop row
   ├── matches artist? ── no → drop row
   └── all matched → row included

SQL Planner: Prioritizing Small Tables for Multi INNER JOIN

By joining smaller tables first, it drops more mismatching data, and the dataset for the following INNER JOIN will be more efficient. But we don’t need to worry about our SQL written order; the modern SQL planner will help us decide the execution order.

Technique: Filter Out Invalid Data

Scenario: Customers and Orders

For data analysis reason, orders remain in the database even if the customer record is deleted or marked inactive
We want to show only orders with an existing and active customer for user application

The purpose of INNER JOIN here is only for filtering, no data merging.

Practice: ⭐️ (Medium) 1174. Immediate Food Delivery II

-- Return orders with a valid (and active) customer
SELECT o.*
FROM orders AS o
JOIN customers AS c
  ON c.id = o.customer_id
WHERE c.deleted_at IS NULL;  -- or c.status = 'active'

VS. `EXISTS`and `IN` Subqueries

SELECT o.*
FROM orders o
WHERE EXISTS (
    SELECT 1
    FROM customers c
    WHERE c.id = o.customer_id
      AND c.deleted_at IS NULL  -- or c.status = 'active'
);

-- ⚠️ IN is slower
-- Materializes the entire subquery result set first (loads all valid customer IDs into memory)
-- No short-circuit optimization

SELECT o.*
FROM orders o
WHERE o.customer_id IN (
    SELECT c.id
    FROM customers c
    WHERE c.deleted_at IS NULL  -- or c.status = 'active'
);

Similar Concept:

JOIN, INNER JOIN

SQL Planner: Prioritizing Small Tables for Multi INNER JOIN

Technique: Filter Out Invalid Data

VS. EXISTSand IN Subqueries

`JOIN`, `INNER JOIN`

VS. `EXISTS`and `IN` Subqueries