ADP & RL at TU München

100% for free

Efficient learning

Synchronization on all devices

It’s completely free

Study with flashcards and summaries for the course ADP & RL at the TU München

Your peers in the course ADP & RL at the TU München create and share summaries, flashcards, study plans and other learning materials with the intelligent StudySmarter learning app.

Exemplary flashcards for ADP & RL at the TU München on StudySmarter:

ADP & RL

What are Stochastic Approximation algorithms?
Iterative methods for solving root-finding problems when only noisy observations of the function are available. The function whose root is sought is represented as an expected value.
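As an illustration, here is a minimal sketch of one classic stochastic approximation scheme, the Robbins-Monro iteration. The function g(x) = x - 2, the Gaussian noise, and all names are assumptions chosen for this example and are not taken from the course material.

```python
import random


def robbins_monro(noisy_f, x0, n_steps=5000):
    """Approximate a root of g(x) = E[noisy_f(x)] using only noisy evaluations."""
    x = x0
    for k in range(1, n_steps + 1):
        step = 1.0 / k  # decreasing steps: their sum diverges, the sum of squares converges
        x = x - step * noisy_f(x)
    return x


rng = random.Random(0)


def noisy_f(x):
    # Hypothetical example: g(x) = x - 2, observed with zero-mean Gaussian noise.
    return (x - 2.0) + rng.gauss(0.0, 1.0)


root = robbins_monro(noisy_f, x0=0.0)
print(root)  # close to the true root 2.0
```

The decreasing step size averages the noise out over time, which is why the iterate settles near the root of the expected function.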

Explain Monte Carlo PI
In the policy evaluation step, estimate the value by the (iteratively updated) sample mean of the observed returns instead of computing the expectation.
Perform the policy improvement step as usual.
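The two steps can be sketched as follows. This is a minimal illustration, assuming a cost-minimization setting and a tabular state-action estimate; the class and method names are hypothetical.

```python
from collections import defaultdict


class MonteCarloQ:
    """Tabular running-mean estimate of the return for each (state, action) pair."""

    def __init__(self):
        self.q = defaultdict(float)  # current mean of observed returns
        self.n = defaultdict(int)    # number of returns observed so far

    def update(self, state, action, ret):
        # Policy evaluation: incremental mean, no list of past returns is stored.
        key = (state, action)
        self.n[key] += 1
        self.q[key] += (ret - self.q[key]) / self.n[key]

    def greedy_action(self, state, actions):
        # Policy improvement (assumed cost setting): pick the smallest estimated cost.
        return min(actions, key=lambda a: self.q[(state, a)])


mc = MonteCarloQ()
mc.update("s", "a", 4.0)
mc.update("s", "a", 6.0)
mc.update("s", "b", 3.0)
best = mc.greedy_action("s", ["a", "b"])
```

In a reward-maximization setting the `min` in `greedy_action` would become a `max`.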

What is the motivation for Value Function Approximation?
Curse of dimensionality: There are too many states and actions to store in memory, and it would be too slow to learn the value for each state individually

What is the motivation for off-policy learning?
Learn about a policy (target policy) from experience sampled from another one (behavior policy)

Does TD learning work both on VI and PI?
It works only with PI, where the expectation in the policy evaluation step can simply be replaced by samples.
It does not work with VI, because the minimization over an expectation cannot be sampled.
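A minimal sketch of the sampled policy evaluation step, assuming tabular TD(0) with a cost signal; the step size `alpha`, the discount `gamma`, and the function name are illustrative assumptions.

```python
def td0_update(values, state, cost, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) update from a single sampled transition under the current policy."""
    td_target = cost + gamma * values[next_state]  # sampled stand-in for the expectation
    values[state] += alpha * (td_target - values[state])
    return values[state]


values = {"A": 0.0, "B": 10.0}
new_value = td0_update(values, "A", cost=1.0, next_state="B")
```

Only one observed transition is needed per update, which is what makes the evaluation step easy to sample.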

What is the policy improvement theorem?
The policy returned by the policy improvement step is at least as good as the current one: it is either strictly improved, or the current policy is already optimal.
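Written out for a discounted cost-minimization problem, the statement reads roughly as follows. The symbols c (stage cost), \gamma (discount factor), and J_\pi (cost-to-go of policy \pi) are notation assumed here, not taken from the flashcard.

```latex
\pi'(s) \in \arg\min_{a} \; \mathbb{E}\!\left[ c(s,a) + \gamma J_{\pi}(s') \right]
\quad \Longrightarrow \quad
J_{\pi'}(s) \le J_{\pi}(s) \quad \text{for all } s .
% If equality holds for all s, then J_\pi satisfies the Bellman optimality equation
% J_{\pi}(s) = \min_{a} \mathbb{E}\!\left[ c(s,a) + \gamma J_{\pi}(s') \right],
% so \pi is already optimal.
```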

Name two algorithms based on Monte Carlo Estimation.
LSTD (Least-Squares Temporal Difference) and LSPE (Least-Squares Policy Evaluation)

How does VI with Linear Value Function Approximation work?
It fits the weights of a linear value function by minimizing either the error between the estimated value function and the optimal one (direct approach) or the error in the Bellman optimality equation (indirect approach).

How do you estimate the target policy from the behavior policy?
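By importance sampling: returns generated by the behavior policy are reweighted by the ratio of the target-policy to the behavior-policy action probabilities.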
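A minimal sketch of ordinary importance sampling over whole episodes. The episode format (a list of (state, action) pairs plus a return) and the two probability functions are assumptions made for this example.

```python
def importance_sampling_estimate(episodes, target_prob, behavior_prob):
    """Estimate the expected return under the target policy from behavior-policy episodes."""
    total = 0.0
    for trajectory, ret in episodes:
        weight = 1.0
        for state, action in trajectory:
            # Ratio of action probabilities: target policy over behavior policy.
            weight *= target_prob(state, action) / behavior_prob(state, action)
        total += weight * ret
    return total / len(episodes)


# Hypothetical policies: the behavior policy is uniform over {"a", "b"},
# the target policy always chooses "a".
def behavior_prob(state, action):
    return 0.5


def target_prob(state, action):
    return 1.0 if action == "a" else 0.0


episodes = [([("s", "a")], 10.0), ([("s", "b")], 2.0)]
estimate = importance_sampling_estimate(episodes, target_prob, behavior_prob)
```

The behavior policy must assign nonzero probability to every action the target policy can take, otherwise the ratio is undefined.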

What is expected SARSA?
It uses the expectation over the next actions under the target policy instead of a single sampled next action.
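A minimal sketch of a single expected SARSA update, assuming a tabular Q estimate and a cost signal. The dictionary layout and the parameter names are illustrative assumptions.

```python
def expected_sarsa_update(q, state, action, cost, next_state, next_action_probs,
                          alpha=0.5, gamma=1.0):
    """One expected SARSA update; next_action_probs maps each next action to its
    probability under the target policy in next_state."""
    expected_next = sum(p * q[(next_state, a)] for a, p in next_action_probs.items())
    target = cost + gamma * expected_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = {("s", "a"): 0.0, ("t", "a"): 2.0, ("t", "b"): 4.0}
updated = expected_sarsa_update(q, "s", "a", cost=1.0, next_state="t",
                                next_action_probs={"a": 0.5, "b": 0.5})
```

Plain SARSA would use the Q-value of one sampled next action in place of `expected_next`.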

By which law are Monte Carlo methods justified?
The law of large numbers: the mean over a large number of independent samples converges to the expected value

What are the key components of Monte Carlo methods?
– Define a domain of possible inputs
– Generate inputs randomly from a probability distribution over the domain
– Perform a deterministic computation on the inputs
– Aggregate the results
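The four components above can be seen in the classic estimate of the constant π (unrelated to the abbreviation PI for policy iteration). This is a minimal sketch; the function name and the seed are arbitrary choices for the example.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi from uniform points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        # Domain: the unit square. Inputs: points drawn uniformly at random.
        x, y = rng.random(), rng.random()
        # Deterministic computation: is the point inside the quarter circle?
        if x * x + y * y <= 1.0:
            inside += 1
    # Aggregation: the fraction inside approximates the area pi / 4.
    return 4.0 * inside / n_samples


pi_estimate = estimate_pi(100_000)
print(pi_estimate)
```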

Sign up for free to see all flashcards and summaries for ADP & RL at the TU München

Other courses from your degree program

For your degree program at the TU München there are already many courses on StudySmarter, waiting for you to join them. Get access to flashcards, summaries, and much more.

What is StudySmarter?

StudySmarter is an intelligent learning tool for students. With StudySmarter you can easily and efficiently create flashcards, summaries, mind maps, study plans and more. Create your own flashcards, e.g. for ADP & RL at the TU München, or access thousands of learning materials created by your fellow students, whether at your own university or at others. Hundreds of thousands of students use StudySmarter to efficiently prepare for their exams. Available on the Web, Android & iOS. It’s completely free.

Awards

Best EdTech Startup in Europe

European Youth Award in Smart Learning

Best EdTech Startup in Germany

How it works

Get a learning plan

Prepare for all of your exams in time. StudySmarter creates your individual learning plan, tailored to your study type and preferences.

Create flashcards

Create flashcards within seconds with the help of efficient screenshot and marking features. Maximize your comprehension with our intelligent StudySmarter Trainer.

Create summaries

Highlight the most important passages in your learning materials and StudySmarter will create a summary for you. No additional effort required.

Study alone or in a group

StudySmarter automatically finds you a study group. Share flashcards and summaries with your fellow students and get answers to your questions.

Statistics and feedback

Always keep track of your study progress. StudySmarter shows you exactly what you have achieved and what you need to review to achieve your dream grades.

1. Learning Plan
2. Flashcards
3. Summaries
4. Teamwork
5. Feedback