Markov Decision Processes With Their Applications.pdf

MARKOV DECISION PROCESSES WITH THEIR APPLICATIONS
Advances in Mechanics and Mathematics
VOLUME 14

Series Editor:
David Y. Gao, Virginia Polytechnic Institute and State University, U.S.A.
Ray W. Ogden, University of Glasgow, U.K.

Advisory Editors:
I. Ekeland, University of British Columbia, Canada
S. Liao, Shanghai Jiao Tung University, P.R. China
K.R. Rajagopal, Texas A&M University, U.S.A.
T. Ratiu, Ecole Polytechnique, Switzerland
W. Yang, Tsinghua University, P.R. China
By
Prof. Ph.D. Qiying Hu, Fudan University, China
Prof. Ph.D. Wuyi Yue, Konan University, Japan
Library of Congress Control Number: 2006930245
ISBN-13: 978-0-387-36950-1
e-ISBN-13: 978-0-387-36951-8
Printed on acid-free paper.
AMS Subject Classifications: 90C40, 90C39, 93C65, 91B26, 90B25

© 2008 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com
Contents

List of Figures
List of Tables
Preface
Acknowledgments

1. INTRODUCTION
   1 A Brief Description of Markov Decision Processes
   2 Overview of the Book
   3 Organization of the Book

2. DISCRETE TIME MARKOV DECISION PROCESSES: TOTAL REWARD
   1 Model and Preliminaries
      1.1 System Model
      1.2 Some Concepts
      1.3 Finiteness of the Reward
   2 Optimality Equation
      2.1 Validity of the Optimality Equation
      2.2 Properties of the Optimality Equation
   3 Properties of Optimal Policies
   4 Successive Approximation
   5 Sufficient Conditions
   6 Notes and References

3. DISCRETE TIME MARKOV DECISION PROCESSES: AVERAGE CRITERION
   1 Model and Preliminaries
   2 Optimality Equation
      2.1 Properties of ACOE and Optimal Policies
      2.2 Sufficient Conditions
      2.3 Recurrent Conditions
   3 Optimality Inequalities
      3.1 Conditions
      3.2 Properties of ACOI and Optimal Policies
   4 Notes and References

4. CONTINUOUS TIME MARKOV DECISION PROCESSES
   1 A Stationary Model: Total Reward
      1.1 Model and Conditions
      1.2 Model Decomposition
      1.3 Some Properties
      1.4 Optimality Equation and Optimal Policies
   2 A Nonstationary Model: Total Reward
      2.1 Model and Conditions
      2.2 Optimality Equation
   3 A Stationary Model: Average Criterion
   4 Notes and References

5. SEMI-MARKOV DECISION PROCESSES
   1 Model and Conditions
      1.1 Model
      1.2 Regular Conditions
      1.3 Criteria
   2 Transformation
      2.1 Total Reward
      2.2 Average Criterion
   3 Notes and References

6. MARKOV DECISION PROCESSES IN SEMI-MARKOV ENVIRONMENTS
   1 Continuous Time MDP in Semi-Markov Environments
      1.1 Model
      1.2 Optimality Equation
      1.3 Approximation by Weak Convergence
      1.4 Markov Environment
      1.5 Phase Type Environment
   2 SMDP in Semi-Markov Environments
      2.1 Model
      2.2 Optimality Equation
      2.3 Markov Environment
   3 Mixed MDP in Semi-Markov Environments
      3.1 Model
      3.2 Optimality Equation
      3.3 Markov Environment
   4 Notes and References

7. OPTIMAL CONTROL OF DISCRETE EVENT SYSTEMS: I
   1 System Model
   2 Optimality
      2.1 Maximum Discounted Total Reward
      2.2 Minimum Discounted Total Reward
   3 Optimality in Event Feedback Control
   4 Link to Logic Level
   5 Resource Allocation System
   6 Notes and References

8. OPTIMAL CONTROL OF DISCRETE EVENT SYSTEMS: II
   1 System Model
   2 Optimality Equation and Optimal Supervisors
   3 Language Properties
   4 System Based on Automaton
   5 Supervisory Control Problems
      5.1 Event Feedback Control
      5.2 State Feedback Control
   6 Job-Matching Problem
   7 Notes and References

9. OPTIMAL REPLACEMENT UNDER STOCHASTIC ENVIRONMENTS
   1 Optimal Replacement: Discrete Time
      1.1 Problem and Model
      1.2 Total Cost Criterion
      1.3 Average Criterion
   2 Optimal Replacement: Semi-Markov Processes