Prof. Dr.-Ing. Joachim Böcker

Reinforcement Learning

Students: Elective in the master's program
Lecturer: Dr.-Ing. Oliver Wallscheid

Procedure in the summer term 2020:

In the summer term 2020, the course is held in a digital format.
All information and teaching materials are available on Panda:
https://panda.uni-paderborn.de/course/view.php?id=11486

The course covers the basics of reinforcement learning (RL) in an engineering context. RL refers to a family of machine learning methods in which an agent autonomously learns a strategy (policy) that maximizes the rewards it receives while interacting with an (unknown) system. One example is a control loop in which an adaptive controller uses previous observations of the control and measurement variables to determine an optimal control law, that is, one that maximizes given criteria for controller performance. Well-known fields of application include the operation of autonomous vehicles and industrial robots, as well as the identification of optimal strategies in games.
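
As a rough illustration of the agent-environment interaction described above, the following minimal sketch uses tabular Q-learning (a temporal difference method) on a hypothetical five-state corridor. The toy environment, the function names (`step`, `train`, `greedy_policy`) and all parameter values are assumptions made for this example; they are not taken from the course materials.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal (hypothetical toy system)
ACTIONS = (-1, +1)  # move one cell to the left or to the right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Temporal difference target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for each non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told how the corridor works; it only observes states and rewards. With the settings assumed here, the learned greedy policy should choose the move to the right in every non-terminal state.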

The course has an application-oriented focus on the engineering sciences but is also suitable for students of the natural sciences (e.g. computer science, mathematics). In addition to the methodological fundamentals taught in the lecture, great importance is attached to practical implementation and programming tasks during the exercise and tutorial sessions.

The course covers the following topics:

  • Conceptual basics and historical overview
  • Markov decision processes
  • Dynamic programming
  • Monte Carlo learning
  • Temporal difference learning
  • Bootstrapping
  • Function approximation and deep learning
  • On- and Off-policy strategies
  • Policy gradient methods
  • Safe RL
  • Integration of expert knowledge
