ETH Zurich | Spring 2019 | 252-0063-00L

Introduction & Objectives

Database systems are the backbone of modern data-intensive applications and information systems, which affect every corner of our world and day-to-day lives. While most modern data-intensive applications are built using a diverse range of techniques such as data mining and machine learning, combined with more traditional data processing operations, a database almost always fills the role of making the data available and accessible in an efficient and robust way.

In this course, we will cover the basics of modeling, querying, and managing data using a relational database system. Throughout the semester, students will play two roles: (1) user of a relational database system, and (2) developer of a relational database engine.

When acting as users of a relational database system, we will cover how to use the system to build an application. The topics covered will include: the entity-relationship model, relational modeling, the relational data model, relational data modeling theory (normal forms), SQL, and referential integrity.

When acting as developers of a relational database engine, we will cover how a textbook relational database engine works to support a database user. The topics covered will include: query processing, query optimization, transactions, concurrency control, recovery, distributed and parallel query processing, replication, and distributed concurrency control.

We will also scratch the surface of several research topics that extend both the functionality and capacity of traditional relational databases. The goal is to give students a sense of some of the research being conducted by the database community and to prepare them for more specialized courses or even a research career in data systems. These topics include: (1) data mining and the integration of machine learning in database systems, (2) the application of SQL outside an RDBMS, in systems such as Spark, DataFrame, and Django, (3) modern hardware like massively parallel processors, FPGAs, and decentralized systems such as Blockchain, (4) data processing operators that go beyond relational queries such as information integration and extraction, (5) database theory, and (6) other topics such as auto-tuning and learned indexes.

Class: When & Where & Who