Give Jobsets an ID; add jobset_id to builds and jobs. by grahamc · Pull Request #710 · NixOS/hydra
RESULT: The latest-finished URL went from finishing in 10 minutes to about 500ms. PROBLEM: it made the existing indexes on Builds which used (project, jobset) useless and some queries took a long time, taking down the frontend. Adding another index (migration #65) fixed it.
tested query, which took 230s February 4, 2020id to the Jobsets table: serial, non-null, unique.
jobset_id to the Jobs table: nullable, foreign key to Jobsets
Jobs should begin writing the jobset_id
jobset_id to the Builds table: nullable, foreign key to Jobsets (3f074388)
Builds should begin writing the jobset_id
52283da9)Jobs with jobset_id values.
running this in a loop:
UPDATE jobs
SET jobset_id = (
  SELECT jobsets.id
  FROM jobsets
  WHERE jobsets.name = jobs.jobset
    AND jobsets.project = jobs.project
)
WHERE (jobs.project, jobs.jobset, jobs.name) in (
  SELECT jobsprime.project, jobsprime.jobset, jobsprime.name
  FROM jobs jobsprime
  WHERE jobsprime.jobset_id IS NULL
  FOR UPDATE SKIP LOCKED
  LIMIT 10000
);
Every N iterations, run VACUUM
VACUUM will prevent 2x table bloat from happening all at onceBuilds with jobset_id values.
running this in a loop:
UPDATE builds
SET jobset_id = (
  SELECT jobsets.id
  FROM jobsets
  WHERE jobsets.name = builds.jobset
    AND jobsets.project = builds.project
)
WHERE builds.id in (
  SELECT buildprime.id
  FROM builds buildprime
  WHERE buildprime.jobset_id IS NULL
  ORDER BY buildprime.id
  FOR UPDATE SKIP LOCKED
  LIMIT 10000
);
Every N iterations, run VACUUM
VACUUM will prevent 2x table bloat from happening all at oncejobset_id fields, this would indicate a bug which needs to be fixed.Builds table, doing two things in one transaction:
jobset_id, any rows with a null jobset_id is likely from a bug we should fix.Builds table to make  jobset_id not-null. Hopefully not very slow, I think postgresql will only validate no rows have a null jobset_id.