 Computational Data Analysis at Technical University Of Denmark | Flashcards & Summaries

# Study materials for Computational Data Analysis at the Technical University of Denmark

Access free flashcards, summaries, practice exercises, and past exams for your Computational Data Analysis course at the Technical University of Denmark.

• 609 flashcards
• 74 students
• 0 study materials

## Example flashcards for your Computational Data Analysis course at the Technical University of Denmark, created by fellow students on StudySmarter

Q:
Variance
• what
A:
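What: How much a model's prediction at a given point varies around its own average when the model is refit on different training sets. It measures sensitivity to the training sample, not distance from the true value.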
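
Variance and bias can both be estimated empirically by refitting a model on many simulated training sets and looking at its predictions at one fixed point. The sketch below is only an illustration: the true function `true_f`, the noise level, and the deliberately crude "predict the training mean" model are assumptions, not part of the course material.

```python
import random
import statistics

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return 2.0 * x

def fit_constant(ys):
    # A deliberately simple model: always predict the training mean of y.
    return statistics.fmean(ys)

def bias_and_variance(x0, n_sets=2000, n_points=20, noise_sd=1.0, seed=0):
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_sets):
        # Draw a fresh training set and refit the model on it.
        xs = [rng.uniform(0.0, 1.0) for _ in range(n_points)]
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
        predictions.append(fit_constant(ys))  # prediction at x0 (ignores x0)
    mean_prediction = statistics.fmean(predictions)
    bias = mean_prediction - true_f(x0)              # average prediction vs. truth
    variance = statistics.pvariance(predictions)     # spread around the average
    return bias, variance

bias, variance = bias_and_variance(x0=1.0)
print(f"bias={bias:.3f}, variance={variance:.3f}")
```

The constant model is far from the truth at `x0 = 1.0` (large bias) but changes little from one training set to the next (small variance).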
Q:

OLS (ordinary least squares)

• what
• how
A:

what: A fitting method for linear regression, defined by its loss function.

how: Chooses the coefficients that minimize the sum of squared prediction errors (residuals).
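
As a minimal sketch of the idea, assuming a single feature and an intercept, the least-squares line has a closed form. The helper names `ols_fit` and `rss` and the toy data are made up for this example.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def rss(xs, ys, intercept, slope):
    """Residual sum of squares: the loss that OLS minimizes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
intercept, slope = ols_fit(xs, ys)
print(intercept, slope, rss(xs, ys, intercept, slope))
```

Any other line, for example one with a slightly different slope, gives a larger `rss` on the same data.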

Q:

EPE (expected prediction error)

• what
A:
what: The expected squared error of a prediction on new data. It is the sum of three things:
• irreducible error (noise variance, σ²)
• bias²
• variance
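
For squared-error loss the three terms can be written out explicitly. The notation is an assumption not taken from the cards: $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat f$ is the model fitted on a random training set.

```latex
\mathrm{EPE}(x_0)
  = \mathbb{E}\big[(y - \hat f(x_0))^2\big]
  = \underbrace{\sigma^2}_{\text{irreducible error}}
  + \underbrace{\big(\mathbb{E}[\hat f(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat f(x_0) - \mathbb{E}[\hat f(x_0)])^2\big]}_{\text{variance}}
```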
Q:

Ridge regression L2 (shrinkage)

• what
• how
• why
A:

What: A linear regression variant that deliberately introduces some bias into the fit on the training data.

How: Adds a penalty on the squared slope (the sum of squared coefficients, the L2 norm) to the least-squares loss function. The lambda value determines the penalty's severity.

Why: To reduce the model's variance, and with it the error on the test data.
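
The shrinkage effect is easy to see in the one-feature case. The sketch below assumes a single feature, an unpenalized intercept, and the loss "sum of squared residuals + lambda * slope²"; the function name `ridge_fit` and the toy data are illustrative only.

```python
def ridge_fit(xs, ys, lam):
    """One-feature ridge regression with an unpenalized intercept.

    Minimizes sum((y - b0 - b1*x)^2) + lam * b1^2.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # the penalty enlarges the denominator
    intercept = y_mean - slope * x_mean
    return intercept, slope

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
for lam in (0.0, 1.0, 5.0, 50.0):
    print(lam, ridge_fit(xs, ys, lam))
```

With `lam = 0` this reduces to ordinary least squares; as `lam` grows, the slope is pulled toward zero.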

Q:

KNearest Neighbor (KNN) classification

• what
• how
A:

What: An ML model for classification.

How: Finds the K training points nearest to the item to predict and assigns the class that is most common among them.
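
A minimal sketch of K-nearest-neighbor classification, assuming Euclidean distance and a simple majority vote. The function name `knn_classify` and the toy points are invented for the example.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k):
    """Predict the majority label among the k training points closest to query."""
    labeled = list(zip(train_points, train_labels))
    # Sort training points by Euclidean distance to the query.
    labeled.sort(key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in labeled[:k])
    return votes.most_common(1)[0][0]

train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify(train_points, train_labels, (0.5, 0.5), k=3))
print(knn_classify(train_points, train_labels, (5.5, 5.5), k=3))
```

An odd `k` avoids ties in two-class problems.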

Why:

Q:

KNearest Neighbor (KNN) regression

• what
• how
A:

What: An ML model for regression.

How: Predicts the number as the average of the target values of the K nearest neighbors (not the average of their distances).
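
A minimal sketch of K-nearest-neighbor regression, assuming Euclidean distance and an unweighted average of the neighbors' target values. The name `knn_regress` and the data are hypothetical.

```python
import math

def knn_regress(train_points, train_targets, query, k):
    """Predict the mean target value of the k training points closest to query."""
    labeled = list(zip(train_points, train_targets))
    labeled.sort(key=lambda pair: math.dist(pair[0], query))
    nearest_targets = [target for _, target in labeled[:k]]
    return sum(nearest_targets) / k

train_points = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,)]
train_targets = [1.0, 2.0, 3.0, 10.0, 11.0]
prediction = knn_regress(train_points, train_targets, (2.1,), k=3)
print(prediction)
```

Here the three nearest neighbors of 2.1 are the points at 1, 2, and 3, so the prediction is the mean of their targets.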

Why

Q:

Unsupervised learning

• what
• how
• why
A:

What: A category of ML models.

How: The data has no labels, so there is no true value against which to compare the correctness of a prediction.

Why: Used for feature generation, outlier detection, and finding hidden patterns.

Q:

• what
• how
A:

What: Term describing the problem that the training data may not represent the real patterns present.

How: Introducing some bias may reduce the overall variance.

Why

Q:

Validation set

• what
• how
• why
A:

what: A separate segment of the data, held out from training, on which hyperparameter tuning takes place.

how: Many models with different hyperparameters are trained on the training set; the one that performs best on the validation set is chosen and then evaluated once on the test set.

why: To choose the best hyperparameters without touching the test set.
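
The train / validation / test workflow can be sketched as follows. Everything specific here is an assumption made for illustration: the synthetic data, the 90/30/30 split, the one-dimensional KNN regressor, and the candidate values of k.

```python
import random

def knn_predict(train, x, k):
    # One-dimensional KNN regression: mean target of the k closest x values.
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    # Mean squared error on `data` of a KNN model that uses `train`.
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)

rng = random.Random(0)
inputs = [rng.uniform(-3.0, 3.0) for _ in range(150)]
data = [(x, x * x + rng.gauss(0.0, 0.5)) for x in inputs]

# Three disjoint segments of the data.
train, val, test = data[:90], data[90:120], data[120:]

# Tune the hyperparameter k using the validation set only.
candidates = [1, 3, 5, 9, 15, 25]
val_errors = {k: mse(train, val, k) for k in candidates}
best_k = min(val_errors, key=val_errors.get)

# Touch the test set once, with the chosen hyperparameter.
test_error = mse(train, test, best_k)
print(best_k, val_errors[best_k], test_error)
```

The test error is reported only for the selected model, so it stays an honest estimate of performance on unseen data.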

Q:

Cross validation

• what
• how
• why
A:

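what: A method of repeatedly splitting the data into training and validation parts so that hyperparameter tuning can be done.

how: The data is separated into k chunks (folds). One fold is the validation set and the rest is the training set. For each model, loop until every fold has been the validation set once, then average the validation errors.

why: If there is not enough data for a separate validation set, this gives a more reliable estimate of how well each model generalizes.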
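
The fold loop can be sketched generically. This is an illustrative outline, assuming the data is already shuffled and that the caller supplies a `fit` function and a per-example `loss` function; both names, and the trivial mean-predicting model in the usage lines, are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_errors(data, k, fit, loss):
    """Return one validation error per fold."""
    errors = []
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [d for i, d in enumerate(data) if i not in held]
        val = [data[i] for i in held_out]
        model = fit(train)  # refit from scratch on the other k-1 folds
        errors.append(sum(loss(model, d) for d in val) / len(val))
    return errors

# Usage with a trivial model that predicts the training mean of y.
data = [(i, float(i)) for i in range(10)]
fit_mean = lambda train: sum(y for _, y in train) / len(train)
squared_error = lambda model, d: (d[1] - model) ** 2
errors = cross_val_errors(data, 5, fit_mean, squared_error)
cv_error = sum(errors) / len(errors)
print(errors, cv_error)
```

Every observation is used for validation exactly once and for training k-1 times.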

Q:

1 standard error rule

• what
• how
A:

what: A best practice for model selection with cross validation.

how: Find the model with the lowest cross-validation error, then return the simplest model whose error is within one standard error of that minimum.
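
A small sketch of the selection step, assuming the per-fold errors of each candidate are already available and that the candidates are listed from simplest to most complex. The function name, the labels, and the error values are made up for illustration.

```python
import math
import statistics

def one_standard_error_choice(fold_errors_by_model):
    """fold_errors_by_model: list of (name, per-fold errors), simplest model first.

    Returns (chosen_name, lowest_error_name).
    """
    summary = []
    for name, errs in fold_errors_by_model:
        mean = statistics.fmean(errs)
        std_err = statistics.stdev(errs) / math.sqrt(len(errs))
        summary.append((name, mean, std_err))
    best_name, best_mean, best_se = min(summary, key=lambda s: s[1])
    threshold = best_mean + best_se
    # Walk from simplest to most complex; the first model under the threshold wins.
    for name, mean, _ in summary:
        if mean <= threshold:
            return name, best_name

# Hypothetical 5-fold errors for KNN models (a larger k gives a smoother, simpler model).
models = [
    ("k=25", [1.9, 2.1, 2.0, 2.2, 1.8]),
    ("k=9", [1.05, 1.15, 0.95, 1.1, 1.0]),
    ("k=3", [1.0, 1.2, 0.8, 1.1, 0.9]),
    ("k=1", [1.5, 1.4, 1.6, 1.7, 1.3]),
]
chosen, best = one_standard_error_choice(models)
print(chosen, best)
```

In this made-up example `k=3` has the lowest mean error, but `k=9` is simpler and falls within one standard error of it, so `k=9` is returned.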

why: ?

Q:
Bias
• what
A:
what: The difference between the average predicted value (averaged over training sets) and the true value.
