Your peers in the course Pattern Recognition at the TU München create and share summaries, flashcards, study plans and other learning materials with the intelligent StudySmarter learning app.


Pattern Recognition

How is the accuracy defined?

accuracy = (TP + TN) / (TP + FP + TN + FN)
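As a quick sketch, the formula in Python with hypothetical confusion-matrix counts:

```python
# Hypothetical counts from a binary classifier's confusion matrix.
tp, tn, fp, fn = 40, 45, 5, 10

# accuracy = (TP + TN) / (TP + FP + TN + FN)
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(accuracy)  # 0.85
```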

Pattern Recognition

What does specificity describe? How is it defined?

Specificity describes how reliably negative samples are labeled as such:

specificity = TN / (TN + FP)
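A minimal sketch with hypothetical counts:

```python
# Hypothetical counts: 45 true negatives, 5 false positives.
tn, fp = 45, 5

# specificity = TN / (TN + FP)
specificity = tn / (tn + fp)
print(specificity)  # 0.9
```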

Pattern Recognition

What other ways are there to look at a classifier's performance?

Another way of looking at classifier performance is the predictive value of a label.

As the name implies, this metric describes the probability of a sample actually belonging to class X if it was classified as such:

PPV = Positive Predictive Value = TP / (TP + FP)

NPV = Negative Predictive Value = TN / (TN + FN)
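Both predictive values with the same kind of hypothetical counts:

```python
# Hypothetical confusion-matrix counts.
tp, tn, fp, fn = 40, 45, 5, 10

ppv = tp / (tp + fp)  # P(actually positive | labeled positive)
npv = tn / (tn + fn)  # P(actually negative | labeled negative)
print(round(ppv, 3), round(npv, 3))  # 0.889 0.818
```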

Pattern Recognition

How is F1-measure defined?

F1 measure = 2 × precision × recall / (precision + recall)
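A one-line sketch with hypothetical precision and recall values:

```python
# Hypothetical precision and recall values.
precision, recall = 0.8, 0.6

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.686
```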

Pattern Recognition

What is k-fold cross-validation? And why do we do it?

With e.g. k = 5, the data are split into 5 equal pieces. In the first fold, pieces 1–4 are used for training and piece 5 for testing; in the second fold, piece 4 is used for testing and pieces 1–3 and 5 for training; etc.

• every data point is tested exactly once

• but still expensive

We use it to generalize well on new data and to avoid overfitting on the available training data.
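The fold rotation above can be sketched in Python. This is a minimal version without shuffling; `k_fold_indices` is a hypothetical helper, not a library function:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k (nearly) equal contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(10, 5)
for test_idx in folds:
    # Train on all other folds, test on this one;
    # every data point ends up in the test set exactly once.
    train_idx = [j for f in folds if f is not test_idx for j in f]
```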

Pattern Recognition

What methods other than k-fold cross-validation can we use to avoid overfitting and generalize well on new data?

*Random split:* Randomly sample a certain proportion (e.g. 50%, 70%) of the data set as the training set and use the remainder for testing.

• simple

• potentially wasteful

*Leave-one-out, leave-one-pair-out (LOO/LOPO):* Training is performed on all data but one point (or one pair of 1 positive, 1 negative sample). This is repeated for every possible test set.

• very expensive

• maximizes the available training data

*Bootstrapping:* Uses random splits with replacement.

• allows estimation of bias and variance
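A minimal sketch of the random-split and bootstrap sampling schemes, using a hypothetical toy data set:

```python
import random

random.seed(0)
data = list(range(100))  # hypothetical toy data set

# Random split: sample e.g. 70% for training, use the rest for testing.
train = random.sample(data, int(0.7 * len(data)))
test = [x for x in data if x not in train]

# Bootstrapping: draw n samples WITH replacement; some points repeat,
# others are left out and can serve as an out-of-sample test set.
boot = [random.choice(data) for _ in range(len(data))]
```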

Pattern Recognition

How do bias and variance errors get introduced?

Error due to model complexity is called the variance error: an overly complex model fits noise in the training data, so its predictions vary strongly with the training set. Error introduced by overly simple or wrong model assumptions is called the bias error: the model systematically misses relevant structure in the data.

Pattern Recognition

Can you give an example of a classifier with high bias and high variance?

High bias means the data is being underfitted: the decision boundary is usually not complex enough. High variance happens due to overfitting: the decision boundary is more complex than it should be.

High bias and high variance together occur when you fit a complex decision boundary that still fails to fit the training set correctly in several places.

Pattern Recognition

What are the advantages and disadvantages of using Naive Bayes for spam detection?

*Disadvantages:* Naive Bayes is based on the assumption of conditional independence of features – an assumption that is not valid in many real-world scenarios. Hence it sometimes oversimplifies the problem by treating features as independent and gives sub-par performance. There is a risk of under-fitting due to this assumption.

*Advantages:* However, Naive Bayes is very efficient. It is a model you can train in a single pass over the data and is therefore fast to execute. It can be parallelized easily. Naive Bayes works when there is little data and many features, as with bag-of-words text data. Due to the independence assumption, the number of parameters is small and constant w.r.t. the data (unlike algorithms such as decision trees). There is less risk of overfitting.

Pattern Recognition

What three types of outliers exist?

*Point outliers* are single data points that lie far from the rest of the distribution. *Contextual outliers* can be noise in data, such as punctuation symbols in text analysis or background noise when doing speech recognition. *Collective outliers* can be subsets of novelties in data, such as a signal that may indicate the discovery of a new phenomenon.

Pattern Recognition

Describe how isolation trees detect outliers.

- To build a tree, the algorithm randomly picks a feature from the feature space and a random split value between the minimum and maximum of that feature. This is repeated recursively on the resulting partitions of the training set.
- To build the forest, an ensemble of such trees is grown and their results are averaged.
- For prediction, an observation is compared against the split value in a node; that node has two child nodes, on which another random comparison is made. The number of splits the algorithm makes for an instance is called its "path length".
- As expected, outliers will have shorter path lengths than the rest of the observations.
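The splitting procedure can be sketched as a single recursive function. This is a simplified toy version, not the full Isolation Forest algorithm; all names and the data set are hypothetical:

```python
import random

def path_length(x, data, depth=0, max_depth=10):
    """Sketch of one isolation tree: recursively split on a random
    feature/value and return the depth at which x is isolated."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    f = random.randrange(len(x))                     # random feature
    lo = min(p[f] for p in data)
    hi = max(p[f] for p in data)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)                   # random split value
    # Keep only the points that fall on the same side as x.
    side = [p for p in data if (p[f] < split) == (x[f] < split)]
    return path_length(x, side, depth + 1, max_depth)

random.seed(1)
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)

# Average over a small "forest" of random trees:
# outliers are isolated in fewer splits than inliers.
avg = lambda x: sum(path_length(x, data + [x]) for _ in range(50)) / 50
```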

Pattern Recognition

Briefly describe how Naive Bayes works. Where is it normally applied?

Naive Bayes is a supervised learning algorithm for classification, so the task is to find the class of an observation (data point) given the values of its features. The Naive Bayes classifier uses Bayes' theorem to calculate the posterior probability of each class given the observed features. The classifier assumes the features (e.g. words in a text) are independent, which makes the probability of the class given the features easier to compute.

It is typically used for:

- Real-time Prediction
- Text classification/ Spam Filtering
- Recommendation System
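A minimal from-scratch sketch of a multinomial Naive Bayes spam filter with Laplace smoothing; the toy corpus and all names are hypothetical:

```python
from collections import Counter
from math import log

# Hypothetical toy corpus: (label, text) pairs.
docs = [("spam", "win money now"), ("spam", "win a prize now"),
        ("ham", "meeting at noon"), ("ham", "lunch meeting today")]

def train(docs):
    """Count word occurrences per class and class frequencies (priors)."""
    counts = {c: Counter() for c in {c for c, _ in docs}}
    priors = Counter(c for c, _ in docs)
    for c, text in docs:
        counts[c].update(text.split())
    return counts, priors

def predict(text, counts, priors):
    """Pick the class with the highest posterior log-probability."""
    vocab = {w for ctr in counts.values() for w in ctr}
    best, best_lp = None, float("-inf")
    for c, ctr in counts.items():
        total = sum(ctr.values())
        # log prior + sum of per-word log likelihoods (Laplace smoothing,
        # naive independence assumption lets us just add them up).
        lp = log(priors[c] / sum(priors.values()))
        for w in text.split():
            lp += log((ctr[w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

counts, priors = train(docs)
print(predict("win money", counts, priors))  # spam
```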
