What is supervised learning?
Training samples, with their corresponding targets
--> find a function f that generalizes this relationship!
--> using f, make predictions on a different set of test data
What is Classification?
A subclass of supervised learning
--> the targets represent certain categories
What is Regression?
A subclass of supervised learning
--> The targets represent continuous numbers!
What are typical tasks in unsupervised learning?
• Clustering
• Dimensionality Reduction
• Generative Modeling
• Topic Models
Many more, but likely not necessary for the exam
What is unsupervised learning?
Find structure in unlabeled data
• Clustering
• Dimensionality reduction
What is one of the first things to do when you try to analyse new data?
Visualize them (scatter plots, histograms, ...)
What is the hard part of decision trees?
Learning them. Using them is easy and much faster than KNNs for example.

Building the optimal tree is NP-complete!
How can we decide how many neighbours are best for our KNN?
Split your data into training, validation and test sets!
Goal: generalize and pick k so that KNN performs best for unseen data
1. Learn model using the training set
2. Evaluate performance for different values of k using the validation set
3. Report final performance on the test set
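The three steps above could be sketched as follows. This is only an illustration: the toy data, the 120/40/40 split sizes, the candidate values of k and the helper names knn_predict and accuracy are all assumptions, not part of the original notes.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to query."""
    # train is a list of (features, label) pairs; squared Euclidean distance
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of samples in data that KNN (fitted on train) labels correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

# Made-up toy data: two noisy 2D blobs with labels 0 and 1
rng = random.Random(0)
data = [((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), label)
        for label, c in ((0, 0.0), (1, 2.5)) for _ in range(100)]
rng.shuffle(data)
train, val, test = data[:120], data[120:160], data[160:]

# 1. "Learning" a KNN model just means storing the training set
# 2. Compare different ks on the validation set and keep the best one
candidate_ks = [1, 3, 5, 7, 9, 11]  # odd values avoid ties with two classes
best_k = max(candidate_ks, key=lambda k: accuracy(train, val, k))

# 3. Report the final performance once, on the untouched test set
test_acc = accuracy(train, test, best_k)
print("best k:", best_k, "test accuracy:", test_acc)
```

The test set is never used for choosing k, so test_acc stays an honest estimate for unseen data.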
How can we improve the generalization of a 1NN?
Look at the k nearest neighbours and take a majority vote --> KNN
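A small sketch of why the vote helps, assuming made-up 1D data with one mislabeled training point. The function name knn_predict and all values are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Label of query by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda s: abs(s[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Class "a" lives around 1.0, class "b" around 3.0.
# The point at 1.1 is a noisy, mislabeled sample.
train = [(0.8, "a"), (0.9, "a"), (1.0, "a"), (1.1, "b"), (1.2, "a"),
         (3.0, "b"), (3.1, "b"), (3.2, "b")]

query = 1.12
one_nn = knn_predict(train, query, k=1)    # follows the single noisy neighbour
three_nn = knn_predict(train, query, k=3)  # the noisy neighbour is outvoted
print(one_nn, three_nn)  # b a
```

With k = 1 a single outlier decides the prediction. With k = 3 it is outvoted by its two correctly labeled neighbours.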
What is the F1 Score?
A score to measure the performance of our model.
The harmonic mean of recall and precision

f1 = (2 * prec * rec)/(prec + rec)
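A minimal sketch of the formula above, computed from true and predicted labels. The function name, the example labels and the choice to return 0.0 when a denominator is zero are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class given by positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * prec * rec) / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
prec, rec, f1 = precision_recall_f1(y_true, y_pred)
# Here tp = 2, fp = 1, fn = 2, so prec = 2/3, rec = 1/2 and f1 = 4/7
print(prec, rec, f1)
```

The harmonic mean (4/7, about 0.571) is lower than the arithmetic mean of precision and recall (about 0.583), because it is pulled towards the smaller of the two values.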
What is the intuitive core idea of decision trees?
20-Question-Game
What can we do instead of just testing all possible combinations of a decision tree?
Use greedy, local decisions. What is the best single local decision I can make at this point?
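One greedy step could look like the sketch below: try every feature and threshold once and keep the single split with the lowest impurity. Using Gini impurity as the criterion is an assumption (the notes do not name one), and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure set, higher for mixed labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(X, y):
    """Return (feature_index, threshold, weighted_impurity) of the best single split."""
    best = None
    for j in range(len(X[0])):                    # every feature
        for t in sorted({row[j] for row in X}):   # every observed value as threshold
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue                          # a split must separate something
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

X = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
y = ["no", "no", "yes", "yes"]
split = best_split(X, y)
print(split)  # (0, 2.0, 0.0): feature 0 with threshold 2.0 separates the classes
```

A full tree learner would repeat this step recursively on the left and right subsets. Each step is locally optimal, but the resulting tree is not guaranteed to be the globally optimal one.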
