Variance
• what
What: How much a model's prediction for the same point spreads around its own average prediction when the model is retrained on different training sets.
OLS (ordinary least squares)

• what
• how
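what: A method for fitting a linear model; its loss function is the sum of squared errors.

how: takes the sum of the squares of the prediction errors (true value minus predicted value) and picks the coefficients that minimize it.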
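
The loss described above can be sketched in a few lines of plain Python. This is only an illustration for the one-feature case; the function names `sum_of_squared_errors` and `fit_simple_ols` and the toy data are made up for this example.

```python
def sum_of_squared_errors(y_true, y_pred):
    """OLS loss: the sum of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def fit_simple_ols(xs, ys):
    """Slope and intercept that minimize the loss for a single feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Toy data (hypothetical) lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_ols(xs, ys)
loss = sum_of_squared_errors(ys, [slope * x + intercept for x in xs])
```

Because the toy points lie on a straight line, the fitted line passes through all of them and the loss is zero; with noisy data the loss would be positive.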

EPE (expected prediction error)

• what
• sum of 3 things:
• irreducible error (noise variance, σ²)
• bias²
• variance
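
Written out as a formula, the three-part sum above looks roughly like this. It is the standard decomposition for squared-error loss at a point x_0, under the assumption (not stated in these notes) that y = f(x) + ε with zero-mean noise of variance σ², and that the expectation is taken over the noise and over random training sets.

```latex
% Assumes y = f(x) + \varepsilon, E[\varepsilon] = 0, Var(\varepsilon) = \sigma^2.
\mathrm{EPE}(x_0)
  = \mathbb{E}\big[(y - \hat{f}(x_0))^2\big]
  = \underbrace{\sigma^2}_{\text{irreducible error}}
  + \underbrace{\big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x_0) - \mathbb{E}[\hat{f}(x_0)])^2\big]}_{\text{variance}}
```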
Ridge regression L2 (shrinkage)

• what
• how
• why
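What: Linear regression with an L2 penalty; it introduces some bias into the model fitted on the training data.

How: Adds a penalty on the squared slope (the sum of squared coefficients when there are several) to the loss function that the linear regression minimizes. The lambda value determines its severity.

Why: To reduce the variance, and with it the error on the test data.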
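
A minimal sketch of the penalty for the one-feature case, assuming the loss is the sum of squared errors plus lambda times the squared slope and that the intercept is left unpenalized. The name `fit_ridge_1d` and the toy data are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Minimize sum((y - (a + b*x))**2) + lam * b**2 over intercept a and slope b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # the penalty shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
ols_slope, ols_intercept = fit_ridge_1d(xs, ys, lam=0.0)      # no penalty: plain OLS
ridge_slope, ridge_intercept = fit_ridge_1d(xs, ys, lam=5.0)  # slope is shrunk
```

With `lam=0.0` the fit matches ordinary least squares; raising `lam` pulls the slope toward zero, which is the bias being traded for lower variance.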

K-Nearest Neighbors (KNN) classification

• what
• how
What: A supervised ML model that predicts a class label.

How: Finds the K training points nearest to the item to predict and assigns the class that occurs most often among them.
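
One way to sketch this in plain Python, assuming Euclidean distance and a simple majority vote among the K nearest points. The function name `knn_classify` and the toy points are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Predict the most common label among the k training points nearest to query."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # Vote ties are not handled specially here; an odd k avoids them for two classes.
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_classify(points, labels, (0.5, 0.5), k=3)
near_far_cluster = knn_classify(points, labels, (5.0, 5.5), k=3)
```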

Why:

K-Nearest Neighbors (KNN) regression

• what
• how
What: A supervised ML model that predicts a number.

How: Finds the K nearest neighbors by distance and predicts the average of their target values.
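
A small sketch of KNN regression, where the prediction is the mean of the target values of the K nearest training points. Euclidean distance is assumed, and `knn_regress` and the toy data are hypothetical.

```python
import math


def knn_regress(train_points, train_targets, query, k):
    """Predict the mean target of the k training points nearest to query."""
    by_distance = sorted(
        zip(train_points, train_targets),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_targets = [target for _, target in by_distance[:k]]
    return sum(nearest_targets) / len(nearest_targets)


points = [(1.0,), (2.0,), (3.0,), (10.0,)]
targets = [10.0, 20.0, 30.0, 100.0]
prediction = knn_regress(points, targets, (2.1,), k=3)  # uses the points at 1, 2 and 3
```

Note that the distances only decide which neighbors are used; the predicted number comes from the neighbors' targets.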

Why

Unsupervised learning

• what
• how
• why
What: A category of ML models that learn from unlabeled data.

How: The data has no labels, so there is nothing to compare a prediction against for correctness.

Why: Used for feature generation, outlier detection, and finding hidden patterns.

• what
• how
What: A term for the fact that the training data may not represent the real patterns present.

How: introducing bias may reduce overall variance.

Why

Validation set

• what
• how
• why
what: A separate, held-out segment of the data on which hyperparameter tuning takes place.

how: Many models with different hyperparameters are trained on the training set; the one that performs best on the validation set is chosen and then evaluated on the test set.

why: To choose the best hyperparameters without using the test set.
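
The train / validate / test flow might look like the sketch below. It assumes a one-feature ridge model with the penalty strength as the only hyperparameter; the helper names (`fit_ridge_1d`, `mean_squared_error`, `choose_lambda`) and the toy split are invented for the example.

```python
def fit_ridge_1d(xs, ys, lam):
    """One-feature ridge fit: L2 penalty lam on the slope, intercept unpenalized."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, y_mean - slope * x_mean


def mean_squared_error(xs, ys, slope, intercept):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def choose_lambda(train, val, lambdas):
    """Train one model per lambda; return the one with the lowest validation error."""
    results = []
    for lam in lambdas:
        slope, intercept = fit_ridge_1d(train[0], train[1], lam)
        val_error = mean_squared_error(val[0], val[1], slope, intercept)
        results.append((val_error, lam, slope, intercept))
    return min(results, key=lambda r: r[0])


# Hypothetical split of noiseless data on y = 2x + 1.
train = ([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
val = ([5.0, 6.0], [11.0, 13.0])
test = ([7.0, 8.0], [15.0, 17.0])

val_error, best_lam, slope, intercept = choose_lambda(train, val, [0.0, 1.0, 10.0])
# The test set is touched only once, after the hyperparameter is fixed.
test_error = mean_squared_error(test[0], test[1], slope, intercept)
```

The toy data has no noise, so the unpenalized model wins here; on noisy data a nonzero penalty can come out ahead on the validation set.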

Cross validation

• what
• how
• why
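what: A method of repeatedly splitting the data so that hyperparameter tuning can be done without one fixed validation set.

how: The data is split into k chunks (folds). One fold is the validation set and the rest is the training set. For each model, this is repeated until every fold has been the validation set once, and the validation errors are averaged.

why: When there is not enough data for a separate validation set, this gives a more reliable estimate of how well each model generalizes.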
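
A rough sketch of the k-fold loop in plain Python. The folds here are interleaved without shuffling to keep the example deterministic (in practice the data is usually shuffled first), and `fit` and `error` are placeholder callables supplied by the caller, not part of any library.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k interleaved folds of near-equal size."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def cross_validate(xs, ys, k, fit, error):
    """Return one validation error per fold; each fold is the validation set once."""
    fold_errors = []
    for val_idx in k_fold_indices(len(xs), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_errors.append(
            error(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx])
        )
    return fold_errors


# Toy "model" (hypothetical): always predict the mean of the training targets.
def fit_mean(train_xs, train_ys):
    return sum(train_ys) / len(train_ys)


def mse_of_mean(model, val_xs, val_ys):
    return sum((y - model) ** 2 for y in val_ys) / len(val_ys)


xs = [0, 1, 2, 3, 4, 5]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fold_errors = cross_validate(xs, ys, k=3, fit=fit_mean, error=mse_of_mean)
cv_error = sum(fold_errors) / len(fold_errors)
```

Running the same loop for each candidate model and comparing the `cv_error` values is what makes the hyperparameter choice.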

1 standard error rule

• what
• how
what: A best practice for choosing a model with cross validation.

how: Find the model with the lowest cross-validation error, then choose the simplest model whose error is within one standard error of that minimum.
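
The selection step can be sketched as follows, assuming each candidate comes with its per-fold cross-validation errors and a numeric complexity score where smaller means simpler. The function name, the tuple layout, and the numbers are all made up for illustration.

```python
import math
import statistics


def one_standard_error_choice(candidates):
    """candidates: list of (name, complexity, fold_errors); returns the chosen name."""
    summaries = []
    for name, complexity, fold_errors in candidates:
        mean_error = statistics.mean(fold_errors)
        # Standard error of the mean across folds.
        std_error = statistics.stdev(fold_errors) / math.sqrt(len(fold_errors))
        summaries.append((name, complexity, mean_error, std_error))

    best = min(summaries, key=lambda s: s[2])  # lowest mean error
    threshold = best[2] + best[3]              # one standard error above it
    eligible = [s for s in summaries if s[2] <= threshold]
    return min(eligible, key=lambda s: s[1])[0]  # simplest eligible model


candidates = [
    ("simple", 1, [0.30, 0.34, 0.32, 0.36]),
    ("medium", 2, [0.20, 0.24, 0.28, 0.24]),
    ("complex", 3, [0.14, 0.22, 0.30, 0.22]),
]
chosen = one_standard_error_choice(candidates)
```

Here "complex" has the lowest mean error, but "medium" falls within one standard error of it and is simpler, so it is the one returned.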

why: ?

Bias
• what
what: Average difference between the predicted value and the true value, where the average is taken over models trained on different training sets.
