👉 List of all notes for this book. IMPORTANT UPDATE November 18, 2024: I've stopped taking detailed notes from the book and now only highlight and annotate directly in the PDF files/book. With so many books to read, I don't have time to type everything. In the future, if I make notes while reading a book, they'll contain only the most notable points (for me).


<aside> 📔 Jupyter notebook for this chapter: on GitHub, on Colab, on Kaggle.

</aside>

Main steps

In this chapter you will work through an example project end to end.

Working with Real Data

In this chapter, we use the California Housing Prices dataset (also available for download from the author’s repository).

Fig 2-1. California housing prices


This data includes metrics such as the population, median income, and median housing price for each block group (called “district” for short).

Look at the Big Picture

Your model should learn from this data → predict the median housing price in any district.

<aside> ☝ You should pull out the ML project checklist (Appendix A in the book) for each project.

</aside>

Frame the Problem

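Ask questions (for example, what the business objective is and what the current solution looks like) to decide how to frame the problem and which methods to use.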

<aside> ☝ A sequence of data processing components is called a data pipeline. Each component can be handled by a separate team, which keeps the whole process robust.

</aside>
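To make the pipeline idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not code from the book: the component names (`drop_incomplete`, `add_rooms_per_household`), the helper `run_pipeline`, the field names, and the sample records are all made up for this example. Each component takes a list of records and returns a new list, so components stay independent of each other.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types for this sketch: a record is a plain dict,
# and a component is any function mapping records to records.
Record = Dict[str, Any]
Component = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_rooms_per_household(records: List[Record]) -> List[Record]:
    """Add a derived field without mutating the input records."""
    return [
        {**r, "rooms_per_household": r["total_rooms"] / r["households"]}
        for r in records
    ]


def run_pipeline(components: List[Component], records: List[Record]) -> List[Record]:
    """Feed the output of each component into the next one, in order."""
    for component in components:
        records = component(records)
    return records


# Made-up sample records, for illustration only.
sample = [
    {"total_rooms": 100.0, "households": 20.0},
    {"total_rooms": None, "households": 10.0},
]

result = run_pipeline([drop_incomplete, add_rooms_per_household], sample)
print(result)
```

Because every component shares the same input and output shape, one component can be swapped or fixed without touching the others. That is one simple way to read the "each component is handled by a team" point above.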