While Dynamic Programming (DP) has helped solve control problems involving dynamic systems, its value was limited by algorithms that lacked the capacity to scale up to realistic problems. In recent years, developments in Reinforcement Learning (RL), DP's model-free counterpart, have changed this. Focusing on continuous-variable problems, this unparalleled work provides an introduction to classical RL and DP, followed by a presentation of current methods in RL and DP with approximation. Combining algorithm development with theoretical guarantees, it offers illustrative examples that readers can adapt to their own work.
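To give a flavor of the classical DP methods the book introduces before turning to approximation, here is a minimal sketch of value iteration on a toy two-state MDP. The transition probabilities, rewards, and discount factor below are illustrative assumptions, not an example taken from the book:

```python
import numpy as np

# Toy MDP (assumed data for illustration): 2 states, 2 actions.
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount factor

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup:
    # V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
    Q = R + gamma * (P @ V)   # Q[s, a], action values under current V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop at convergence
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
print("V* ≈", V, " greedy policy:", policy)
```

Because the Bellman backup is a contraction for gamma < 1, the loop converges to the optimal value function; the book's later chapters address what happens when the exact table `V` is replaced by a function approximator over large or continuous state spaces.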



About the Authors

Robert Babuska, Lucian Busoniu, and Bart de Schutter are with Delft University of Technology. Damien Ernst is with the University of Liege.



Back Cover Text

From household appliances to applications in robotics, engineered systems involving complex dynamics can only be as effective as the algorithms that control them. This title provides a comprehensive exploration of the field of Dynamic Programming (DP) and Reinforcement Learning (RL).



Contents

1 Introduction
The dynamic programming and reinforcement learning problem
Approximation in dynamic programming and reinforcement learning
About this book
2 An introduction to dynamic programming and reinforcement learning
Introduction
Markov decision processes
Value iteration
Policy iteration
Policy search
Summary and discussion
3 Dynamic programming and reinforcement learning in large and continuous spaces
Introduction
The need for approximation in large and continuous spaces
Approximation architectures
Approximate value iteration
Approximate policy iteration
Finding value function approximators automatically
Approximate policy search
Comparison of approximate value iteration, policy iteration, and policy search
Summary and discussion
4 Approximate value iteration with a fuzzy representation
Introduction
Fuzzy Q-iteration
Analysis of fuzzy Q-iteration
Optimizing the membership functions
Experimental study
Summary and discussion
5 Approximate policy iteration for online learning and continuous-action control
Introduction
A recapitulation of least-squares policy iteration
Online least-squares policy iteration
Online LSPI with prior knowledge
LSPI with continuous-action, polynomial approximation
Experimental study
Summary and discussion
6 Approximate policy search with cross-entropy optimization of basis functions
Introduction
Cross-entropy optimization
Cross-entropy policy search
Experimental study
Summary and discussion
Appendix A Extremely randomized trees
Structure of the approximator
Building and using a tree
Appendix B The cross-entropy method
Rare-event simulation using the cross-entropy method
Cross-entropy optimization
Symbols and abbreviations
Bibliography
List of algorithms
Index

Title
Reinforcement Learning and Dynamic Programming Using Function Approximators
EAN
9781439821091
Format
E-book (PDF)
Genre
Publication Date
28.07.2017
Digital Copy Protection
Adobe DRM
Number of Pages
280