ROBUST ADAPTIVE DYNAMIC PROGRAMMING

Yu Jiang
The MathWorks, Inc.

Zhong-Ping Jiang
New York University

A JOHN WILEY & SONS, INC., PUBLICATION
To my mother, Misi, and Xiaofeng (YJ)
To my family (ZPJ)
PREFACE

This book covers the topic of adaptive optimal control (AOC) for continuous-time systems. An adaptive optimal controller can gradually modify itself to adapt to the controlled system, and the adaptation is measured by some performance index of the closed-loop system. The study of AOC can be traced back to the 1970s, when researchers at the Los Alamos Scientific Laboratory (LASL) started to investigate the use of adaptive and optimal control techniques in buildings with solar-based temperature control. Compared with conventional adaptive control, AOC has the important ability to improve energy conservation and system performance. However, even though there are various ways in AOC to compute the optimal controller, most of the previously known approaches are model-based, in the sense that a model with a fixed structure is assumed before designing the controller. In addition, these approaches do not generalize to nonlinear models. On the other hand, quite a few model-free, data-driven approaches for AOC have emerged in recent years. In particular, adaptive/approximate dynamic programming (ADP) is a powerful methodology that integrates the idea of reinforcement learning (RL), observed in the mammalian brain, with decision theory, so that controllers for man-made systems can learn to achieve optimal performance in spite of uncertainty about the environment and the lack of detailed system models. Since the 1960s, RL has been brought to the computer science and control science literature as a way to study artificial intelligence, and it has been successfully applied to many discrete-time systems, or Markov decision processes (MDPs). However, it has always been challenging to generalize those results to the controller design of physical systems. This is mainly because the state space of a physical control system is generally continuous and unbounded, and the states are continuous in time. Therefore, the convergence and stability properties of ADP-based approaches have to be carefully studied. The main purpose of this book is to introduce the recently developed framework, known as robust adaptive dynamic programming (RADP), for data-driven, non-model-based adaptive optimal control design for both linear and nonlinear continuous-time systems.

In addition, this book is intended to address in a systematic way the presence of dynamic uncertainty. Dynamic uncertainty exists ubiquitously in control engineering. It is primarily caused by dynamics that are part of the physical system but either are difficult to model mathematically or are ignored for the sake of controller design and system analysis. Without addressing the dynamic uncertainty, controller designs based on the simplified model will most likely fail when applied to the physical system. In most of the previously developed ADP or other RL methods, it is assumed that full state information is always available, and therefore that the system order is known. Although this assumption excludes the existence of any dynamic uncertainty, it is apparently too strong to be realistic. For a physical model of relatively large scale, knowing the exact number of state variables can be difficult, not to mention that not all state variables can be measured precisely. For example, consider a power grid with both a main generator controlled by the utility company and small distributed generators (DGs) installed by customers. The utility company should not neglect the dynamics of the DGs but should treat them as dynamic uncertainties when controlling the grid, so that stability, performance, and power security can always be maintained as expected.

The book is organized in four parts. First, an overview of RL, ADP, and RADP is contained in Chapter 1. Second, a few recently developed continuous-time ADP methods are introduced in Chapters 2, 3, and 4. Chapter 2 covers the topic of ADP for uncertain linear systems. Chapters 3 and 4 provide neural-network-based and sum-of-squares (SOS)-based ADP methodologies to achieve semi-global and global stabilization for uncertain nonlinear continuous-time systems, respectively. Third, Chapters 5 and 6 focus on RADP for linear and nonlinear systems, with dynamic uncertainties rigorously addressed. In Chapter 5, different robustification schemes are introduced to achieve RADP. Chapter 6 further extends the RADP framework to large-scale systems and illustrates its applicability to industrial power systems. Finally, Chapter 7 applies ADP and RADP to study the sensorimotor control of humans, and the results suggest that humans may be using very similar approaches to learn to coordinate movements and handle uncertainties in daily life.

This book makes a major departure from most existing texts covering the same topics by providing many practical examples, such as power systems and human sensorimotor control systems, to illustrate the effectiveness of our results. The book uses MATLAB in each chapter to conduct numerical simulations. MATLAB is used as a computational tool, a programming tool, and a graphical tool.
Simulink, a graphical programming environment for modeling, simulating, and analyzing multidomain dynamic systems, is used in Chapter 2. The third-party MATLAB-based software packages SOSTOOLS and CVX are used in Chapters 4 and 5 to solve SOS programs and semidefinite programs (SDPs). All MATLAB programs and the Simulink model developed in this book, as well as extensions of these programs, are available at http://yujiang.github.io/radpbook/.
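For readers unfamiliar with CVX, the following is a minimal sketch of how a semidefinite program is typically posed in MATLAB with CVX installed and on the path; it solves a standard Lyapunov matrix inequality as a feasibility problem. The system matrix A here is a hypothetical example chosen for illustration, not one taken from the book's chapters.

    % Minimal CVX sketch: find P = P' > 0 with A'*P + P*A < 0,
    % certifying stability of dx/dt = A*x. Illustrative only.
    A = [0 1; -2 -3];              % hypothetical stable system matrix
    n = size(A, 1);

    cvx_begin sdp
        variable P(n, n) symmetric % decision variable
        A' * P + P * A <= -eye(n); % Lyapunov matrix inequality
        P >= eye(n);               % enforce positive definiteness
    cvx_end

    disp(P)                        % a feasible P certifies stability

The SDPs arising in Chapters 4 and 5 are posed in the same declarative style, with the decision variables and linear matrix inequality constraints dictated by the ADP or RADP design at hand.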
The development of this book would not have been possible without the support and help of many people. The authors wish to thank Professor Frank Lewis and Dr. Paul Werbos, whose seminal work on adaptive/approximate dynamic programming has laid down the foundation of the book. The first-named author (YJ) would like to thank his Master's thesis adviser, Prof. Jie Huang, for guiding him into the area of nonlinear control, and Dr. Yebin Wang for offering him a summer research internship position at Mitsubishi Electric Research Laboratories, where parts of the ideas in Chapters 4 and 5 were originally inspired. The second-named author (ZPJ) would like to acknowledge his colleagues, especially Drs. Alessandro Astolfi, Lei Guo, Iven Mareels, and Frank Lewis, for many useful comments and constructive criticism on some of the research summarized in the book. He is grateful to his students for their boldness in entering the interesting yet still unpopular field of data-driven adaptive optimal control. The authors wish to thank the editors and editorial staff, in particular Mengchu Zhou, Mary Hatcher, Brady Chin, and Divya Narayanan, for their efforts in publishing the book. We thank Tao Bian and Weinan Gao for collaboration on generalizations and applications of ADP based on the framework of RADP presented in this book. Finally, we thank our families for their sacrifice in adapting to our hard-to-predict working schedules that often involve dynamic uncertainties. From our family members, we have learned the importance of exploration noise in achieving the desired trade-off between robustness and optimality.

The bulk of this research was accomplished while the first-named author was working toward his PhD degree in the Control and Networks Lab at New York University Tandon School of Engineering. The authors wish to acknowledge the research funding support of the National Science Foundation.

YU JIANG
Wellesley, Massachusetts
July 2016

ZHONG-PING JIANG
Brooklyn, New York
July 2016