Is there a global solution in regression tasks?

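Yes. For linear regression with squared loss the objective is convex, so every local minimum is also a global minimum.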
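A minimal sketch of why a global solution exists, assuming the question refers to simple linear regression with squared loss (that setting, the helper names `fit_line` and `mse`, and the toy data are illustrative assumptions, not part of the notes). The closed-form least-squares fit is computed and then compared against nearby parameter values, none of which gives a lower loss.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mse(xs, ys, a, b):
    """Mean squared error of the line y = a*x + b on the data."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]

a, b = fit_line(xs, ys)
best = mse(xs, ys, a, b)

# Perturbing the solution in any direction should never lower the loss.
perturbed = [
    mse(xs, ys, a + da, b + db)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
]
is_minimum = all(best <= p + 1e-12 for p in perturbed)
```

The perturbation check is only a local sanity check; the global guarantee comes from the convexity of the squared loss in the parameters.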

Goal of using generalization error:

Minimizing the generalization error (learning) is concerned with optimizing bias and variance simultaneously.
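The trade-off can be made explicit with the standard bias-variance decomposition. This is a sketch under assumptions not stated in the notes: squared loss, data generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and expectations taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Lowering one of the first two terms typically raises the other, which is why both have to be optimized together.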

Pitfall of Empirical risk minimization:

Empirical risk minimization has no mechanism for assessing bias and variance independently. It cannot have one, because it only sees the error on the training data.

How to solve a dual problem?

1) Set up the Lagrange function

2) Apply the Karush-Kuhn-Tucker (KKT) conditions
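The two steps can be sketched for a generic inequality-constrained problem. The problem form $\min_x f(x)$ subject to $g_i(x) \le 0$ and the symbols $f$, $g_i$, $\alpha_i$ are assumptions chosen for illustration; the notes do not fix a notation.

```latex
% Step 1: Lagrange function with multipliers \alpha_i \ge 0
L(x, \alpha) = f(x) + \sum_i \alpha_i \, g_i(x)

% Dual problem
\max_{\alpha \ge 0} \; \min_x \; L(x, \alpha)

% Step 2: KKT conditions at the optimum (x^*, \alpha^*)
\nabla_x L(x^*, \alpha^*) = 0            % stationarity
g_i(x^*) \le 0                           % primal feasibility
\alpha_i^* \ge 0                         % dual feasibility
\alpha_i^* \, g_i(x^*) = 0               % complementary slackness
```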

What is a good choice for linearly separable data?

SVMs

What is a good choice for data that is not linearly separable?

C-SVMs
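For reference, the C-SVM is commonly written as the following soft-margin primal problem. The notation ($w$, $b$, slack variables $\xi_i$, labels $y_i \in \{-1, +1\}$) is an assumption, since the notes do not give a formulation.

```latex
\min_{w, b, \xi} \quad \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\qquad \text{s.t.} \quad y_i \, (w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0
```

The parameter $C$ controls how strongly margin violations are penalized; as $C \to \infty$ the problem approaches the hard-margin SVM, which requires linearly separable data.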

Linear separability is very restrictive. Where can it be achieved more easily?

The higher the dimensionality, the more easily linear separability can be achieved.
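A small sketch of this effect on XOR-style data. The feature map `lift` (appending the product of the two inputs) and the hand-picked hyperplane are illustrative assumptions, not something the notes prescribe.

```python
# XOR-style data: no line in the original 2-D space separates the classes.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, +1, +1, -1]


def lift(p):
    """Hypothetical feature map: append x1*x2 as a third dimension."""
    x1, x2 = p
    return (x1, x2, x1 * x2)


def separates(w, b, xs, ys):
    """True if sign(w.x + b) matches the label for every point."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0
        for x, y in zip(xs, ys)
    )


# A coarse grid search in 2-D finds no separating line
# (illustrative only, a grid search is not a proof).
grid = [i / 2 for i in range(-8, 9)]
found_2d = any(
    separates((w1, w2), b, points, labels)
    for w1 in grid
    for w2 in grid
    for b in grid
)

# In the lifted 3-D space a hand-picked hyperplane separates the classes.
w3, b3 = (1.0, 1.0, -2.0), -0.5
found_3d = separates(w3, b3, [lift(p) for p in points], labels)
```

Here `found_2d` is `False` and `found_3d` is `True`: one extra dimension is enough to make this data linearly separable.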

What learning algorithm and search type do all decision trees have?

All decision tree learning algorithms are recursive, depth-first search algorithms that perform hierarchical splits.
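The recursive, depth-first structure can be sketched as follows. This is a simplified illustration, not any specific published algorithm: it handles numerical features only, uses the misclassification count as the split cost, and the names `build_tree`, `predict`, and `max_depth` are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def errors(labels):
    """Number of points that are not in the majority class."""
    return len(labels) - Counter(labels).most_common(1)[0][1]


def build_tree(rows, labels, depth=0, max_depth=3):
    # Stop when the node is pure or the depth limit is reached.
    if len(set(labels)) == 1 or depth == max_depth:
        return {"leaf": majority(labels)}

    best = None
    for j in range(len(rows[0])):                 # every feature
        for t in sorted({r[j] for r in rows}):    # every candidate threshold
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            if not left or not right:
                continue
            cost = (errors([labels[i] for i in left])
                    + errors([labels[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)

    if best is None:                              # no valid split exists
        return {"leaf": majority(labels)}

    _, j, t, left, right = best
    # Depth-first: the whole left subtree is built before the right one.
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([rows[i] for i in left],
                           [labels[i] for i in left], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right],
                            [labels[i] for i in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the splits from the root down to a leaf."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy data (made up for illustration).
rows = [(1.0,), (2.0,), (3.0,), (4.0,)]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
```

Each recursive call splits only the subset of data that reached its node, which is what produces the hierarchy of splits.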

Which types of features are there when splitting in a decision tree?

Categorical and numerical features

Standard splitting criterion for classification:

Gini index (Gini impurity): the standard splitting criterion employed by the decision tree algorithm CART for classification.
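CART's usual classification criterion is the Gini impurity, $1 - \sum_k p_k^2$ over the class proportions $p_k$ in a node. A minimal sketch, with hypothetical helper names `gini` and `split_gini`:

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of the two children of a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


pure = gini(["a", "a", "a"])                        # all one class
mixed = gini(["a", "b", "a", "b"])                  # two classes, 50/50
perfect_split = split_gini(["a", "a"], ["b", "b"])  # both children pure
```

A pure node has impurity 0, and a 50/50 node with two classes has impurity 0.5. The split that minimizes the weighted child impurity is the one selected.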

How to estimate OOB-Error?

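For a given data point, take all trees whose bootstrap sample does not contain it. Pass the data point through those trees and aggregate their predictions (average for regression, majority vote for classification). Compare the aggregated prediction with the true target to get the error for that data point. Repeat this for all data points and average the errors.

This is closely related to cross-validation, since every data point is evaluated only by trees that never saw it during training.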
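The procedure can be sketched for a classification ensemble. Everything here is an illustrative assumption: the base learner `fit_stump` is a hypothetical 1-D threshold classifier standing in for a full tree, predictions are combined by majority vote, and the data is made up.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Hypothetical base learner: best 1-D threshold rule on the training data."""
    best = None
    for t in sorted(set(xs)):
        for lo, hi in ((0, 1), (1, 0)):
            err = sum((lo if x <= t else hi) != y for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, t, lo, hi)
    _, t, lo, hi = best
    return lambda x: lo if x <= t else hi


def oob_error(xs, ys, n_trees=50, seed=0):
    rng = random.Random(seed)
    n = len(xs)

    # Train each model on a bootstrap sample and remember which points it saw.
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        ensemble.append((set(idx), model))

    wrong, counted = 0, 0
    for i in range(n):
        # Only models that never saw point i are allowed to vote on it.
        votes = [model(xs[i]) for in_bag, model in ensemble if i not in in_bag]
        if not votes:
            continue  # point was in every bootstrap sample
        prediction = Counter(votes).most_common(1)[0][0]
        wrong += prediction != ys[i]
        counted += 1
    return wrong / counted if counted else float("nan")


# Toy data (made up for illustration).
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
estimate = oob_error(xs, ys)
```

For regression the majority vote would be replaced by the mean of the out-of-bag predictions and the 0/1 error by a squared error.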

Best loss type for linear regression?

