Data Mining and KD

ROC - explanation

Plot of the true positive rate (TPR) against the false positive rate (FPR), obtained by varying the classification threshold
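A minimal sketch of how the (FPR, TPR) points of an ROC curve arise; the scores, labels, and the helper `roc_points` below are made up for illustration:

```python
# Sketch: computing (FPR, TPR) pairs for an ROC curve from
# hypothetical classifier scores and binary labels.

def roc_points(scores, labels):
    """Return one (FPR, TPR) pair per distinct score threshold."""
    p = sum(labels)            # number of positives
    n = len(labels) - p        # number of negatives
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n, tp / p))
    return points

scores = [0.9, 0.8, 0.6, 0.4, 0.3]   # made-up scores
labels = [1, 1, 0, 1, 0]             # made-up ground truth
print(roc_points(scores, labels))    # ends at (1.0, 1.0)
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1); a good classifier stays close to the top-left corner.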


PR - breakeven point

- The point where the precision-recall curve intersects the main diagonal, i.e. where precision = recall
- An important criterion for comparing classifiers
- A high breakeven point indicates a good classifier
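One way to locate the breakeven point numerically, given (precision, recall) pairs sampled along the curve; the curve data below is made up:

```python
# Sketch: locating the precision-recall breakeven point, i.e. the
# point on the PR curve closest to precision == recall.

def breakeven(pr_pairs):
    """Return the (precision, recall) pair with the smallest |P - R|."""
    return min(pr_pairs, key=lambda pr: abs(pr[0] - pr[1]))

# Hypothetical (precision, recall) pairs at decreasing thresholds:
curve = [(1.0, 0.2), (0.9, 0.5), (0.75, 0.75), (0.6, 0.9)]
print(breakeven(curve))   # (0.75, 0.75): precision equals recall here
```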


Linear Discriminant Analysis - how to calculate?

- You want to find the decision function w·x + b
- w = mean1 − mean2
- b = −w · (mean1 + mean2) / 2
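The two formulas above can be sketched directly in NumPy; this assumes the simple two-class case with equal class covariances, and the toy data is made up:

```python
# Minimal sketch of the two-class linear discriminant above:
# w = mean1 - mean2, b = -w . (mean1 + mean2) / 2
import numpy as np

class1 = np.array([[2.0, 2.0], [3.0, 3.0]])      # made-up class 1 points
class2 = np.array([[-2.0, -2.0], [-3.0, -3.0]])  # made-up class 2 points

mean1 = class1.mean(axis=0)
mean2 = class2.mean(axis=0)

w = mean1 - mean2                    # direction separating the means
b = -w @ (mean1 + mean2) / 2         # hyperplane through the midpoint

def classify(x):
    """Positive side of the hyperplane -> class 1, else class 2."""
    return 1 if w @ x + b > 0 else 2

print(classify(np.array([2.0, 1.0])))    # 1
print(classify(np.array([-1.0, -2.0])))  # 2
```

The hyperplane w·x + b = 0 passes through the midpoint of the two class means, perpendicular to the line connecting them.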


Median filter, when?

- For series data
- When the data contain outliers
- To remove noise
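A minimal sketch of a sliding-window median filter on series data, showing how a single outlier spike is removed; window size and the toy series are chosen for illustration, with shrinking windows at the edges:

```python
# Sketch: sliding-window median filter for series data.

def median_filter(series, k=3):
    """Replace each value by the median of a window of size k (odd)."""
    half = k // 2
    out = []
    for i in range(len(series)):
        window = series[max(0, i - half): i + half + 1]
        out.append(sorted(window)[len(window) // 2])
    return out

series = [1, 1, 100, 1, 1]        # 100 is an outlier spike
print(median_filter(series))      # [1, 1, 1, 1, 1]
```

Unlike a mean filter, the median is unaffected by a single extreme value in the window, which is why it suits outlier-contaminated series.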


Edit distance

- The **minimum** number of **edit operations** needed to transform one sequence into another
- **Operations**: insert, delete, or change a sequence element

We denote L_ij(x, y) as the edit distance between the first i elements of x and the first j elements of y.
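The L_ij table can be filled by dynamic programming; a minimal sketch with the three operations named above:

```python
# Sketch: dynamic-programming edit distance with insert, delete,
# and change operations, filling the table L[i][j] described above.

def edit_distance(x, y):
    """L[i][j] = edit distance between first i of x and first j of y."""
    m, n = len(x), len(y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        L[i][0] = i                       # delete all i elements
    for j in range(n + 1):
        L[0][j] = j                       # insert all j elements
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1   # change (or match)
            L[i][j] = min(L[i - 1][j] + 1,            # delete
                          L[i][j - 1] + 1,            # insert
                          L[i - 1][j - 1] + cost)     # change/match
    return L[m][n]

print(edit_distance("kitten", "sitting"))   # 3
```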


Fuzzy clustering, when?

- Gives good results even if the clusters overlap and the data are noisy
- Sensitive to outliers: an outlier that is equidistant to all cluster centers receives the same memberships as a point lying in the middle between them, but intuitively we expect outliers to have low membership in every cluster
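The standard fuzzy c-means membership formula (with fuzzifier m = 2) makes this outlier problem concrete; the two 2-D centers and test points below are made up:

```python
# Sketch: standard fuzzy c-means memberships. A distant outlier that
# is equidistant to all centers gets the same memberships (0.5/0.5)
# as a point sitting right between them.
import math

def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of point x in each cluster."""
    d = [math.dist(x, c) for c in centers]
    return [1.0 / sum((d[i] / d[k]) ** (2 / (m - 1)) for k in range(len(d)))
            for i in range(len(d))]

centers = [(-1.0, 0.0), (1.0, 0.0)]
print(memberships((0.0, 0.0), centers))    # midpoint:    [0.5, 0.5]
print(memberships((0.0, 100.0), centers))  # far outlier: [0.5, 0.5]
```

Both points get identical memberships even though the second is far from both clusters, which is exactly the counter-intuitive behaviour the card describes.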


ID3, when to use?

- A basic decision-tree learning algorithm; in its original form it handles categorical features only
- Its extension C4.5 additionally accepts real-valued and missing features and uses a pruning mechanism to reduce tree size


Principal component analysis - when?

- For dimensionality reduction: project the data onto the directions of largest variance
- When features are correlated


Hypercube standardization (scaling each feature linearly to [0, 1]) is appropriate for approximately uniformly distributed features


Mean and variance standardization (shifting each feature to mean 0 and variance 1) is appropriate for approximately normally distributed features


How to calculate the principal axis (PCA)

- Calculate the mean of each feature
- Calculate the covariance matrix A with entries
  - Cov(x, y) = 1/n · Σ (x − mean_x)(y − mean_y)
- Form A − λI and solve det(A − λI) = 0 for the eigenvalues λ
- For each λ, go back to the covariance matrix and solve (A − λI)·v = 0 row by row:
  - first row: (a11 − λ)·v1 + a12·v2 = 0
  - second row: a21·v1 + (a22 − λ)·v2 = 0
  - etc.
- This gives a relationship between the components v1, v2, … of the eigenvector for each λ
- Normalise each eigenvector: divide it by sqrt(v1² + v2² + …)
- The normalised eigenvectors are your principal axes
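The steps above can be checked against NumPy's eigendecomposition; the small 2-D dataset below is made up for illustration:

```python
# Sketch: principal axes via mean-centering, covariance, and
# eigendecomposition, mirroring the hand-calculation steps above.
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])  # made-up data

Xc = X - X.mean(axis=0)                  # 1) subtract the mean per feature
C = (Xc.T @ Xc) / len(X)                 # 2) covariance matrix (1/n version)
lam, V = np.linalg.eigh(C)               # 3) eigenvalues/eigenvectors of C
order = np.argsort(lam)[::-1]            #    largest eigenvalue first
axes = V[:, order].T                     # 4) unit-norm principal axes (rows)

print(axes[0])                           # first principal axis
```

`np.linalg.eigh` already returns unit-norm eigenvectors, so the explicit normalisation step is done for you; note that the sign of each axis is arbitrary.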


When does Naive Bayesian Classifier not work?

- When class membership depends on correlations between features: the classifier assumes the features are conditionally independent
- When classes differ in their covariance structure rather than in individual feature means, e.g.:
  - Height high, Weight low → class –
  - Height high, Weight high → class +
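A minimal made-up example of this failure case: two classes defined purely by whether height and weight co-vary. Their per-feature means and variances are identical, so a Gaussian naive Bayes model, which only sees per-feature statistics, cannot tell them apart:

```python
# Sketch: classes defined by feature correlation. Per-feature means
# and variances are identical for both classes, so a naive Bayes
# model (which ignores correlation) sees identical likelihoods.
import numpy as np

# class +: height and weight move together; class -: they move oppositely
pos = np.array([[1.0, 1.0], [-1.0, -1.0]])
neg = np.array([[1.0, -1.0], [-1.0, 1.0]])

for cls, data in (("+", pos), ("-", neg)):
    print(cls, data.mean(axis=0), data.var(axis=0))
# Both classes have per-feature mean [0, 0] and variance [1, 1], so a
# Gaussian naive Bayes classifier assigns every point 50/50 odds.
```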
