2.1 - What is Statistical Learning?

What is the general relationship between the quantitative response Y and p different predictors X1, X2, ..., Xp? What do the different parts mean?
What are the two main problems statistical learning solves?
Parametric vs non-parametric models?
What is the tradeoff between prediction accuracy and model interpretability?

2.2 - Assessing Model Accuracy

How do we measure quality of fit for regression problems?
What is the Bias-Variance tradeoff?
What is the goal when trying to find a model in terms of bias and variance?
How do we measure model accuracy in a classification problem?
What is a Bayes Classifier and why do we care about it?
Why can't we use the Bayes Classifier in all settings?
How does the value of K in KNN affect flexibility?

2.4 - Exercises

Inflexible or flexible method?
1. sample size n is large, number of predictors p is small - a flexible method is better: with a lot of training data it can fit the true relationship closely without overfitting.
2. p is extremely large, n is small - an inflexible method is better: with so few data points a flexible method would overfit and have high variance.
3. relationship is highly non-linear - a flexible method is better because it reduces bias, but it needs a lot of training samples to keep variance down.
4. variance of error terms is extremely high - an inflexible method is better: it has low variance and does not change much in response to fluctuations in individual training points that are caused by the error.
Classification or Regression? Inference or prediction? n and p?
1. collect data on the top 500 firms - profit, number of employees, industry and CEO salary. Trying to understand which factors affect CEO salary.
- Regression problem - CEO salary is a quantitative output variable
- Inference - trying to understand how the different predictors affect the output
- n = 500
- p = 3 (profit, employees, industry; CEO salary is the output variable and not a predictor)
2. new product will be either a success or a failure. Collect data on 20 products - success or fail, price, marketing budget, competition price, and ten other variables
- Classification problem - trying to classify the new product as success or failure
- Prediction problem - based on previous data, predict whether the new product will succeed
- n = 20
- p = 13 (price, marketing budget, competition price, and the ten other variables; success or fail is the output variable)
3. predicting the % change in the US dollar in relation to weekly changes in world stock markets. Collect weekly data for all of 2012 - % change in dollar, % change in US market, % change in British market, % change in German market
- Regression problem - % change in the dollar is a quantitative output variable
- Prediction - the stated goal is to predict the % change in the dollar, not to explain what causes it
- n = 52 weeks
- p = 3 (% change in dollar is the output variable)
bias-variance decomposition
1. draw out curves of Bias^2, Variance, Irreducible error, Training error, Testing error
2. explain
  1. Bias^2 decreases as model flexibility increases, because a more flexible model can approximate the true relationship more closely
  2. Variance increases as model flexibility increases, because a more flexible model changes more when it is fit to a different training set
  3. Irreducible error, Var(ε), is just a horizontal line that denotes the lower bound on the expected test error (the Bayes error rate plays the same role in classification)
  4. Expected test error = Bias^2 + Variance + Irreducible error, so the test error curve is U-shaped
  5. Training error keeps decreasing as the model gets more flexible and overfits to the training data; it can approach 0 and fall below the irreducible error.
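The decomposition can be estimated numerically at a single point by refitting on many simulated training sets. This is a minimal sketch under assumptions of my own: KNN regression is the model (smaller K means more flexible), the true function is a sine curve, the noise standard deviation is 0.3, and `true_f`, `sample`, `knn_predict`, `decompose` and `train_mse` are illustrative names.

```python
import math
import random
import statistics

SIGMA = 0.3  # assumed noise standard deviation, so irreducible error is SIGMA ** 2

def true_f(x):
    # Assumed ground truth, chosen only for illustration.
    return math.sin(2 * math.pi * x)

def sample(n, rng):
    # Draw n points with x uniform on [0, 1] and Gaussian noise.
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, true_f(x) + rng.gauss(0, SIGMA)))
    return data

def knn_predict(train, x0, k):
    # Average the responses of the k training points closest to x0.
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def decompose(x0, k, n=50, reps=300, seed=0):
    # Refit on `reps` independent training sets and collect predictions at x0.
    rng = random.Random(seed)
    preds = [knn_predict(sample(n, rng), x0, k) for _ in range(reps)]
    bias_sq = (statistics.mean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance, SIGMA ** 2

def train_mse(k, n=50, seed=0):
    # Error of the fit on the same points it was trained on.
    train = sample(n, random.Random(seed))
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in train) / n

for k in (50, 25, 5, 1):  # flexibility increases as K shrinks
    bias_sq, variance, irreducible = decompose(0.25, k)
    expected_test = bias_sq + variance + irreducible
    print(f"K={k:2d} bias^2={bias_sq:.3f} variance={variance:.3f} "
          f"expected test error={expected_test:.3f} train MSE={train_mse(k):.3f}")
```

Reading the output from K = 50 down to K = 1 should show bias^2 falling and variance rising. The training MSE for K = 1 is exactly 0, since each training point is its own nearest neighbour.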
real-life examples
1. classification - cancer or not based on blood test results; product is a success or failure based on other similar products; type of bread based on a picture; email is spam or not spam
2. regression - home prices based on location, etc.; how much weight somebody can lift based on height and weight; number of people at a concert based on prior concert ticket sales and location
3. cluster analysis - genre based on movie sales data; fashion trends based on price and other attributes