Reinforcement Learning and Approximate Dynamic Programming for Feedback Control

IEEE Press Series on Computational Intelligence

Lewis, Frank L. / Liu, Derong
Published: 01.12.2012, Edition: 1st edition
CHF 207,00
(incl. VAT)

Not available

Bibliographic Data
ISBN/EAN: 9781118104200
Language: English
Pages: 648
Binding: Hardcover

Description

Reinforcement learning (RL) and adaptive dynamic programming (ADP) have been among the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human-engineered systems, covering both single-player decision and control and multi-player games. Edited by the pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.


Table of Contents

PREFACE xix

CONTRIBUTORS xxiii

PART I FEEDBACK CONTROL USING RL AND ADP

1. Reinforcement Learning and Approximate Dynamic Programming (RLADP)--Foundations, Common Misconceptions, and the Challenges Ahead 3
Paul J. Werbos
1.1 Introduction 3
1.2 What is RLADP? 4
1.3 Some Basic Challenges in Implementing ADP 14

2. Stable Adaptive Neural Control of Partially Observable Dynamic Systems 31
J. Nate Knight and Charles W. Anderson
2.1 Introduction 31
2.2 Background 32
2.3 Stability Bias 35
2.4 Example Application 38

3. Optimal Control of Unknown Nonlinear Discrete-Time Systems Using the Iterative Globalized Dual Heuristic Programming Algorithm 52
Derong Liu and Ding Wang
3.1 Background Material 53
3.2 Neuro-Optimal Control Scheme Based on the Iterative ADP Algorithm 55
3.3 Generalization 67
3.4 Simulation Studies 68
3.5 Summary 74

4. Learning and Optimization in Hierarchical Adaptive Critic Design 78
Haibo He, Zhen Ni, and Dongbin Zhao
4.1 Introduction 78
4.2 Hierarchical ADP Architecture with Multiple-Goal Representation 80
4.3 Case Study: The Ball-and-Beam System 87
4.4 Conclusions and Future Work 94

5. Single Network Adaptive Critics Networks--Development, Analysis, and Applications 98
Jie Ding, Ali Heydari, and S.N. Balakrishnan
5.1 Introduction 98
5.2 Approximate Dynamic Programming 100
5.3 SNAC 102
5.4 J-SNAC 104
5.5 Finite-SNAC 108
5.6 Conclusions 116

6. Linearly Solvable Optimal Control 119
K. Dvijotham and E. Todorov
6.1 Introduction 119
6.2 Linearly Solvable Optimal Control Problems 123
6.3 Extension to Risk-Sensitive Control and Game Theory 130
6.4 Properties and Algorithms 134
6.5 Conclusions and Future Work 139

7. Approximating Optimal Control with Value Gradient Learning 142
Michael Fairbank, Danil Prokhorov, and Eduardo Alonso
7.1 Introduction 142
7.2 Value Gradient Learning and BPTT Algorithms 144
7.3 A Convergence Proof for VGL(1) for Control with Function Approximation 148
7.4 Vertical Lander Experiment 154
7.5 Conclusions 159

8. A Constrained Backpropagation Approach to Function Approximation and Approximate Dynamic Programming 162
Silvia Ferrari, Keith Rudd, and Gianluca Di Muro
8.1 Background 163
8.2 Constrained Backpropagation (CPROP) Approach 163
8.3 Solution of Partial Differential Equations in Nonstationary Environments 170
8.4 Preserving Prior Knowledge in Exploratory Adaptive Critic Designs 174
8.5 Summary 179

9. Toward Design of Nonlinear ADP Learning Controllers with Performance Assurance 182
Jennie Si, Lei Yang, Chao Lu, Kostas S. Tsakalis, and Armando A. Rodriguez
9.1 Introduction 183
9.2 Direct Heuristic Dynamic Programming 184
9.3 A Control Theoretic View on the Direct HDP 186
9.4 Direct HDP Design with Improved Performance Case 1--Design Guided by a Priori LQR Information 193
9.5 Direct HDP Design with Improved Performance Case 2--Direct HDP for Coordinated Damping Control of Low-Frequency Oscillation 198
9.6 Summary 201

10. Reinforcement Learning Control with Time-Dependent Agent Dynamics 203
Kenton Kirkpatrick and John Valasek
10.1 Introduction 203
10.2 Q-Learning 205
10.3 Sampled Data Q-Learning 209
10.4 System Dynamics Approximation 213
10.5 Closing Remarks 218

11. Online Optimal Control of Nonaffine Nonlinear Discrete-Time Systems without Using Value and Policy Iterations 221
Hassan Zargarzadeh, Qinmin Yang, and S. Jagannathan
11.1 Introduction 221
11.2 Background 224
11.3 Reinforcement Learning Based Control 225
11.4 Time-Based Adaptive Dynamic Programming-Based Optimal Control 234
11.5 Simulation Result 247

12. An Actor-Critic-Identifier Architecture for Adaptive Approximate Optimal Control 258
S. Bhasin, R. Kamalapurkar, M. Johnson, K.G. Vamvoudakis, F.L. Lewis, and W.E. Dixon
12.1 Introduction 259
12.2 Actor-Critic-Identifier Architecture for HJB Approximation 260
12.3 Actor-C

