What is the main goal of supervised learning?
a. To cluster unlabeled data
b. To learn a function that maps input to output using labeled data
c. To find the shortest path in a graph
d. To simulate human intelligence
Which of the following is an example of regression?
a. Spam detection
b. Predicting house prices
c. Image classification
d. Sentiment analysis
What does the loss function measure?
a. Model complexity
b. Amount of data required
c. How well the model's predictions match the actual labels
d. Number of neurons in the network
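To make the loss-function question concrete, here is a minimal sketch (using mean squared error as one hypothetical choice of loss) showing how a loss compares predictions against the actual labels:

```python
# Hypothetical illustration: mean squared error is one common loss.
# It measures how well predictions match the actual labels.
def mse_loss(predictions, labels):
    """Average squared difference between predictions and labels."""
    assert len(predictions) == len(labels)
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)

# The closer the predictions track the labels, the lower the loss.
print(mse_loss([2.0, 3.0], [2.0, 4.0]))  # 0.5
print(mse_loss([2.0, 4.0], [2.0, 4.0]))  # 0.0 (perfect predictions)
```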
Which algorithm is used to update weights during training?
a. Normalization
b. Softmax
c. Gradient descent
d. Backpropagation
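A one-dimensional sketch of the gradient descent update rule (the function names and learning rate here are illustrative, not from any particular library):

```python
# Gradient descent updates a weight by stepping against the gradient
# of the loss: w_new = w - learning_rate * gradient.
def gradient_descent_step(w, grad, learning_rate=0.1):
    return w - learning_rate * grad

# Minimize loss(w) = w**2, whose gradient is 2*w.
w = 1.0
for _ in range(50):
    w = gradient_descent_step(w, 2 * w)
# After repeated steps, w shrinks toward the minimum at 0.
print(w)
```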
True or False: Lower loss always means better generalization performance.
Fill in the blank: A model that performs well on training data but poorly on test data is said to be __________.
Which of the following is not a method for preventing overfitting?
a. Dropout
b. Cross-validation
c. Increasing model complexity
d. Regularization
What is the purpose of the validation set?
a. To test the model one final time
b. To tune hyperparameters
c. To train the model
d. To visualize feature correlations
True or False: A large gap between training and validation loss is a sign of underfitting.
Short Answer: What does it mean if a feature has high predictive power?
In a linear regression model, what do weights (also called coefficients) represent?
a. The residual error
b. The slope of the input features
c. The activation threshold
d. The bias term
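A minimal sketch of a linear model's prediction, showing how each weight acts as a slope on its input feature (values here are made up for illustration):

```python
# Linear model prediction: each weight scales its input feature,
# and the bias term shifts the prediction.
def linear_predict(features, weights, bias):
    return sum(w * x for w, x in zip(weights, features)) + bias

# Example with two features: 2.0*1.5 + 3.0*(-0.5) + 4.0 = 5.5
print(linear_predict([2.0, 3.0], weights=[1.5, -0.5], bias=4.0))  # 5.5
```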
What is the difference between loss and error?
a. No difference; they are interchangeable
b. Loss is averaged; error is binary
c. Loss applies only to regression
d. Loss is differentiable; error is not necessarily
Which of these methods is used to reduce the learning rate dynamically during training?
a. Early stopping
b. Learning rate decay
c. Overfitting
d. One-hot encoding
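One common form of learning rate decay is exponential decay, sketched below (the decay factor and schedule shape are hypothetical choices, not the only ones):

```python
# Exponential learning rate decay: the rate is scaled down by a
# fixed factor each epoch, so later updates take smaller steps.
def decayed_lr(initial_lr, decay_rate, epoch):
    return initial_lr * (decay_rate ** epoch)

print(decayed_lr(0.1, 0.9, 0))   # 0.1 at the start
print(decayed_lr(0.1, 0.9, 10))  # much smaller after 10 epochs
```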
Which of the following can cause gradient descent to converge slowly or erratically?
a. Small dataset
b. Improper learning rate
c. Feature normalization
d. Large number of epochs
What does L2 regularization do?
a. Penalizes large weights
b. Encourages sparsity
c. Increases the number of features
d. Adds noise to the model
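A sketch of the L2 penalty term, assuming the common formulation where the regularization term added to the loss is λ times the sum of squared weights:

```python
# L2 regularization penalizes large weights: the penalty grows with
# the squared magnitude of the weights, scaled by lambda.
def l2_penalty(weights, lam=0.01):
    return lam * sum(w ** 2 for w in weights)

# Larger weights incur a larger penalty added to the loss.
print(l2_penalty([1.0, 2.0]))    # 0.05
print(l2_penalty([10.0, 20.0]))  # 5.0
```

Because this penalty is added to the training loss, gradient descent is pushed toward simpler models with smaller weights, which is the effect probed in the next question about λ.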
What effect does increasing the regularization strength (λ) typically have?
a. More overfitting
b. Lower training loss
c. Simpler model with smaller weights
d. Larger weights and higher variance
What is the bias term in a linear model responsible for?
a. Normalizing data
b. Avoiding zero gradients
c. Shifting the prediction function vertically
d. Increasing dimensionality