ROBUST ADAPTIVE DYNAMIC PROGRAMMING

Yu Jiang
The MathWorks, Inc.

Zhong-Ping Jiang
New York University

A JOHN WILEY & SONS, INC., PUBLICATION
To my mother, Misi, and Xiaofeng (YJ)
To my family (ZPJ)
PREFACE

This book covers the topic of adaptive optimal control (AOC) for continuous-time systems. An adaptive optimal controller can gradually modify itself to adapt to the controlled system, and the adaptation is measured by some performance index of the closed-loop system. The study of AOC can be traced back to the 1970s, when researchers at the Los Alamos Scientific Laboratory (LASL) started to investigate the use of adaptive and optimal control techniques in buildings with solar-based temperature control. Compared with conventional adaptive control, AOC has the important ability to improve energy conservation and system performance. However, even though there are various ways in AOC to compute the optimal controller, most of the previously known approaches are model-based, in the sense that a model with a fixed structure is assumed before designing the controller. In addition, these approaches do not generalize to nonlinear models. On the other hand, quite a few model-free, data-driven approaches for AOC have emerged in recent years. In particular, adaptive/approximate dynamic programming (ADP) is a powerful methodology that integrates the idea of reinforcement learning (RL), observed in the mammalian brain, with decision theory, so that controllers for man-made systems can learn to achieve optimal performance in spite of uncertainty about the environment and the lack of detailed system models. Since the 1960s, RL has been brought to the computer science and control science literature as a way to study artificial intelligence, and it has been successfully applied to many discrete-time systems, or Markov decision processes (MDPs). However, it has always been challenging to generalize those results to the controller design of physical systems. This is mainly because the state space of a physical control system is generally continuous and unbounded, and the states are continuous in time. Therefore, the convergence and stability properties of ADP-based approaches have to be carefully studied. The main purpose of this book is to introduce the recently developed framework, known as robust adaptive dynamic programming (RADP), for data-driven, non-model-based adaptive optimal control design for both linear and nonlinear continuous-time systems.

In addition, this book is intended to address in a systematic way the presence of dynamic uncertainty. Dynamic uncertainty exists ubiquitously in control engineering. It is primarily caused by dynamics that are part of the physical system but either are difficult to model mathematically or are ignored for the sake of controller design and system analysis. Without addressing the dynamic uncertainty, controller designs based on the simplified model will most likely fail when applied to the physical system. In most of the previously developed ADP or other RL methods, it is assumed that full state information is always available, and therefore that the system order is known. Although this assumption excludes the existence of any dynamic uncertainty, it is apparently too strong to be realistic. For a physical model of relatively large scale, knowing the exact number of state variables can be difficult, not to mention that not all state variables can be measured precisely. For example, consider a power grid with both a main generator controlled by the utility company and small distributed generators (DGs) installed by customers. The utility company should not neglect the dynamics of the DGs but should treat them as dynamic uncertainties when controlling the grid, so that stability, performance, and power security can always be maintained as expected.

The book is organized in four parts. First, an overview of RL, ADP, and RADP is contained in Chapter 1. Second, a few recently developed continuous-time ADP methods are introduced in Chapters 2, 3, and 4. Chapter 2 covers the topic of ADP for uncertain linear systems. Chapters 3 and 4 provide neural-network-based and sum-of-squares (SOS)-based ADP methodologies to achieve semi-global and global stabilization for uncertain nonlinear continuous-time systems, respectively. Third, Chapters 5 and 6 focus on RADP for linear and nonlinear systems, with dynamic uncertainties rigorously addressed. In Chapter 5, different robustification schemes are introduced to achieve RADP. Chapter 6 further extends the RADP framework to large-scale systems and illustrates its applicability to industrial power systems. Finally, Chapter 7 applies ADP and RADP to study the sensorimotor control of humans, and the results suggest that humans may be using very similar approaches to learn to coordinate movements and handle uncertainties in daily life.

This book makes a major departure from most existing texts covering the same topics by providing many practical examples, such as power systems and human sensorimotor control systems, to illustrate the effectiveness of our results. The book uses MATLAB in each chapter to conduct numerical simulations. MATLAB is used as a computational tool, a programming tool, and a graphical tool.
Simulink, a graphical programming environment for modeling, simulating, and analyzing multidomain dynamic systems, is used in Chapter 2. The third-party MATLAB-based software packages SOSTOOLS and CVX are used in Chapters 4 and 5 to solve SOS programs and semidefinite programs (SDPs). All MATLAB programs and the Simulink model developed in this book, as well as extensions of these programs, are available at http://yujiang.github.io/radpbook/.
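For readers unfamiliar with CVX, the following is a minimal sketch of how a semidefinite program is typically posed in MATLAB with CVX installed and on the path; it solves a standard Lyapunov matrix inequality as a feasibility problem. The system matrix A here is a hypothetical example chosen for illustration, not one taken from the book's chapters.

    % Minimal CVX sketch: find P = P' > 0 with A'*P + P*A < 0,
    % certifying stability of dx/dt = A*x. Illustrative only.
    A = [0 1; -2 -3];              % hypothetical stable system matrix
    n = size(A, 1);

    cvx_begin sdp
        variable P(n, n) symmetric % decision variable
        A' * P + P * A <= -eye(n); % Lyapunov matrix inequality
        P >= eye(n);               % enforce positive definiteness
    cvx_end

    disp(P)                        % a feasible P certifies stability

The SDPs arising in Chapters 4 and 5 are posed in the same declarative style, with the decision variables and linear matrix inequality constraints dictated by the ADP or RADP design at hand.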
The development of this book would not have been possible without the support and help of many people. The authors wish to thank Professor Frank Lewis and Dr. Paul Werbos, whose seminal work on adaptive/approximate dynamic programming has laid down the foundation of the book. The first-named author (YJ) would like to thank his Master's thesis adviser, Prof. Jie Huang, for guiding him into the area of nonlinear control, and Dr. Yebin Wang for offering him a summer research internship position at Mitsubishi Electric Research Laboratories, where parts of the ideas in Chapters 4 and 5 were originally inspired. The second-named author (ZPJ) would like to acknowledge his colleagues, especially Drs. Alessandro Astolfi, Lei Guo, Iven Mareels, and Frank Lewis, for many useful comments and constructive criticism on some of the research summarized in the book. He is grateful to his students for their boldness in entering the interesting yet still unpopular field of data-driven adaptive optimal control. The authors wish to thank the editors and editorial staff, in particular Mengchu Zhou, Mary Hatcher, Brady Chin, and Divya Narayanan, for their efforts in publishing the book. We thank Tao Bian and Weinan Gao for collaboration on generalizations and applications of ADP based on the framework of RADP presented in this book. Finally, we thank our families for their sacrifice in adapting to our hard-to-predict working schedules that often involve dynamic uncertainties. From our family members, we have learned the importance of exploration noise in achieving the desired trade-off between robustness and optimality.

The bulk of this research was accomplished while the first-named author was working toward his PhD degree in the Control and Networks Lab at New York University Tandon School of Engineering. The authors wish to acknowledge the research funding support of the National Science Foundation.

YU JIANG
Wellesley, Massachusetts
July 2016

ZHONG-PING JIANG
Brooklyn, New York
July 2016