This condensed syllabus progresses from basics to advanced AI research, following Karpathy's playlist with minor flow tweaks. Format: 2-hour watch-and-ask sessions.
Quick Refreshers on Core Concepts
- Regression: Predict continuous values; fits curves to data (e.g., community scores).
- Backpropagation: Trains nets by backward gradients; adjusts weights.
- Deep Learning: Hierarchical feature learning in multi-layer nets.
- Machine Learning: Algorithms learning from data; supervised/unsupervised/reinforcement.
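The regression and backpropagation refreshers above can be illustrated together in one tiny example. This is a minimal sketch (not from the playlist): fitting a line y = 2x + 1 by gradient descent, with the gradients written out by hand. The data and learning rate are made up for illustration.

```python
# Toy data generated from y = 2x + 1 (illustrative values)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0   # parameters to learn
lr = 0.05         # learning rate (arbitrary choice)
for _ in range(2000):
    # forward pass: predictions under current w, b
    preds = [w * x + b for x in xs]
    # backward pass: analytic gradients of mean-squared error
    n = len(xs)
    dw = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    db = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    # gradient-descent update
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

The same loop is what backprop automates in a neural net: compute a loss, differentiate it with respect to every parameter, step downhill.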
Block 1: Neural Network Fundamentals
- ~~The spelled-out intro to neural networks and backpropagation: building micrograd (2:25:52) – Key: Auto-diff, loss; interpret decisions. Additional: Linear algebra; calculus basics.~~
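The micrograd video builds a full scalar autograd engine; as a rough stand-in (not the video's actual code), the core idea can be sketched as a `Value` that records its children and local derivatives, then applies the chain rule backwards:

```python
# Minimal reverse-mode autodiff sketch (micrograd-style, simplified:
# recursive backward instead of the topological sort micrograd uses).
class Value:
    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self, grad=1.0):
        # chain rule: accumulate upstream grad times each local derivative
        self.grad += grad
        for child, local in zip(self._children, self._local_grads):
            child.backward(grad * local)

a = Value(2.0)
b = Value(-3.0)
c = a * b + a          # dc/da = b + 1 = -2, dc/db = a = 2
c.backward()
print(a.grad, b.grad)  # -2.0 2.0
```

Note the recursive `backward` is correct but can revisit shared subgraphs; the real micrograd sorts the graph topologically and visits each node once.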
Block 2: Language Modeling Basics
- ~~The spelled-out intro to language modeling: building makemore (1:57:45) – Key: Bigrams, sampling; model identities. Additional: Probability; loss functions.~~
Day 3: ended at the 1:02 mark (part 2)
Day 4: finished
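The bigram idea from the makemore video can be sketched in a few lines. This is a toy version with a made-up four-name word list standing in for the course's `names.txt`: count character pairs, then sample new names proportionally to those counts.

```python
# Character-level bigram model: count pairs, sample proportionally.
import random

words = ["emma", "olivia", "ava", "mia"]  # toy stand-in for names.txt
counts = {}
for w in words:
    chars = ["."] + list(w) + ["."]       # "." marks start and end
    for c1, c2 in zip(chars, chars[1:]):
        counts[(c1, c2)] = counts.get((c1, c2), 0) + 1

def sample_name(rng):
    out, prev = [], "."
    while True:
        # possible next characters and their bigram counts given prev
        nxt = [(c2, n) for (c1, c2), n in counts.items() if c1 == prev]
        chars, weights = zip(*nxt)
        prev = rng.choices(chars, weights=weights)[0]
        if prev == ".":                    # end token: name is complete
            return "".join(out)
        out.append(prev)

print(sample_name(random.Random(0)))
```

The video then replaces the count table with a trained single-layer net whose loss (negative log likelihood) connects back to the probability refresher above.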
Block 3: Scaling to Multi-Layer Networks
- ~~Building makemore Part 2: MLP (1:15:40) – Non-linearities.~~
- ~~Building makemore Part 3: Activations & Gradients, BatchNorm (1:55:58) – Stabilize training.~~
- ~~Building makemore Part 4: Becoming a Backprop Ninja (1:55:24) – Chain rule. Additional: Optimization (SGD); regression practice.~~
Day 7: 1-hour mark
Day 8: finished
- ~~State of GPT | BRK216HFS (42:40) – Challenges. Additional: Pretraining strategies; fine-tuning.~~
Block 4: Advanced Sequence Modeling
- ~~Building makemore Part 5: Building a WaveNet (56:22) – Long-range dependencies. Additional: CNN basics; graph intros.~~
Block 5: Transformer Architecture
- ~~Let's build GPT: from scratch, in code, spelled out. (1:56:20) – Self-attention. Additional: Multi-head attention; BERT vs. GPT.~~
Day 11: 1-hour mark
Day 12: finished
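The self-attention mechanism at the heart of the GPT video can be sketched without PyTorch. This is a simplified single head in plain Python (the video's version is batched, learned, and multi-headed): each position scores itself against earlier positions, softmaxes the scores, and takes a weighted sum of value vectors. The toy q/k/v vectors are made up.

```python
# One causal self-attention head on plain Python lists.
import math

def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def causal_self_attention(q, k, v):
    """q, k, v: lists of T vectors of size d. Returns T output vectors."""
    T, d = len(q), len(q[0])
    out = []
    for t in range(T):
        # scaled dot-product scores of position t vs positions 0..t
        # (the causal mask: no attending to the future)
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        w = softmax(scores)
        # weighted sum of value vectors
        out.append([sum(w[s] * v[s][i] for s in range(t + 1))
                    for i in range(d)])
    return out

# toy sequence: T=3 tokens, d=2 features, q = k = v for simplicity
q = k = v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_self_attention(q, k, v)
print(out[0])  # first position can only attend to itself, so equals v[0]
```

Everything else in the GPT architecture (multi-head, residuals, LayerNorm, MLP blocks) wraps around this one operation.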
Block 6: Tokenization and Data Handling