I am following the CMU Advanced Databases course for Spring 2024. I am watching every lecture, reading every mandatory paper, and writing paper reviews as well as lecture notes. I would also like to work on the project assignment, but finding the time for it will be very hard.

This course focuses primarily on building large-scale OLAP systems rather than on scaling OLTP systems. Most of the content covers big data file formats, execution engines, query optimizers, and schedulers.



My Lecture Notes

Reading Reviews

This is my list of reading reviews, written in the template required by the class. Each lecture has one mandatory paper that must be reviewed. There are also other papers that should be read but do not require a review.

Lecture #01 (January 22nd 2024)

Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics

The Composable Data Management System Manifesto

Lecture #02 (January 24th 2024)

An Empirical Evaluation of Columnar Storage Formats

Lecture #03 (January 29th 2024)

The FastLanes Compression Layout: Decoding > 100 Billion Integers per Second with Scalar Code

Lecture #04 (February 5th 2024)

MonetDB/X100: Hyper-Pipelining Query Execution

Lecture #05 (February 7th 2024)

Velox: Meta’s Unified Execution Engine

Lecture #06 (February 12th 2024)

Make the Most out of Your SIMD Investments: Counter Control Flow Divergence in Compiled Query Pipelines

Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask

Lecture #07 (February 14th 2024)

Efficiently Compiling Efficient Query Plans for Modern Hardware

Lecture #08 (February 19th 2024)

Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age

Self-Tuning Query Scheduling for Analytical Workloads

Lecture #09 (February 24th 2024)

An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory

Lecture #10 (February 26th 2024)

Adopting Worst-Case Optimal Joins in Relational Database Systems

Lecture #11 (March 11th 2024)

Froid: Optimization of Imperative Programs in a Relational Database

Lecture #12 (March 13th 2024)

Don’t Hold My Data Hostage – A Case For Client Protocol Redesign

Why am I doing this?

Course Reference & Materials

Reading Review Template

### Overview of the main idea (3 sentences)

...

### Key findings / takeaways from the paper (2-3 sentences)

...

### System used in evaluation and how it was modified/extended (1 sentence)

...

### Workload Evaluated (1 sentence)

...