
Decision Making Under Uncertainty

Cover
Contents
Preface
Authors
Acknowledgment
1 Introduction
1.1 Decision Making
1.2 Example Applications
1.2.1 Traffic Alert and Collision Avoidance System
1.2.2 Unmanned Aircraft Persistent Surveillance
1.3 Methods for Designing Decision Agents
1.3.1 Explicit Programming
1.3.2 Supervised Learning
1.3.3 Optimization
1.3.4 Planning
1.3.5 Reinforcement Learning
1.4 Overview
1.5 Further Reading
References
2 Probabilistic Models
2.1 Representation
2.1.1 Degrees of Belief and Probability
2.1.2 Probability Distributions
2.1.3 Joint Distributions
2.1.4 Bayesian Network Representation
2.1.5 Conditional Independence
2.1.6 Hybrid Bayesian Networks
2.1.7 Temporal Models
2.2 Inference
2.2.1 Inference for Classification
2.2.2 Inference in Temporal Models
2.2.3 Exact Inference
2.2.4 Complexity of Exact Inference
2.2.5 Approximate Inference
2.3 Parameter Learning
2.3.1 Maximum Likelihood Parameter Learning
2.3.2 Bayesian Parameter Learning
2.3.3 Nonparametric Learning
2.4 Structure Learning
2.4.1 Bayesian Structure Scoring
2.4.2 Directed Graph Search
2.4.3 Markov Equivalence Classes
2.4.4 Partially Directed Graph Search
2.5 Summary
2.6 Further Reading
References
3 Decision Problems
3.1 Utility Theory
3.1.1 Constraints on Rational Preferences
3.1.2 Utility Functions
3.1.3 Maximum Expected Utility Principle
3.1.4 Utility Elicitation
3.1.5 Utility of Money
3.1.6 Multiple Variable Utility Functions
3.1.7 Irrationality
3.2 Decision Networks
3.2.1 Evaluating Decision Networks
3.2.2 Value of Information
3.2.3 Creating Decision Networks
3.3 Games
3.3.1 Dominant Strategy Equilibrium
3.3.2 Nash Equilibrium
3.3.3 Behavioral Game Theory
3.4 Summary
3.5 Further Reading
References
4 Sequential Problems
4.1 Formulation
4.1.1 Markov Decision Processes
4.1.2 Utility and Reward
4.2 Dynamic Programming
4.2.1 Policies and Utilities
4.2.2 Policy Evaluation
4.2.3 Policy Iteration
4.2.4 Value Iteration
4.2.5 Grid World Example
4.2.6 Asynchronous Value Iteration
4.2.7 Closed- and Open-Loop Planning
4.3 Structured Representations
4.3.1 Factored Markov Decision Processes
4.3.2 Structured Dynamic Programming
4.4 Linear Representations
4.5 Approximate Dynamic Programming
4.5.1 Local Approximation
4.5.2 Global Approximation
4.6 Online Methods
4.6.1 Forward Search
4.6.2 Branch and Bound Search
4.6.3 Sparse Sampling
4.6.4 Monte Carlo Tree Search
4.7 Direct Policy Search
4.7.1 Objective Function
4.7.2 Local Search Methods
4.7.3 Cross Entropy Methods
4.7.4 Evolutionary Methods
4.8 Summary
4.9 Further Reading
References
5 Model Uncertainty
5.1 Exploration and Exploitation
5.1.1 Multi-Armed Bandit Problems
5.1.2 Bayesian Model Estimation
5.1.3 Ad Hoc Exploration Strategies
5.1.4 Optimal Exploration Strategies
5.2 Maximum Likelihood Model-Based Methods
5.2.1 Randomized Updates
5.2.2 Prioritized Updates
5.3 Bayesian Model-Based Methods
5.3.1 Problem Structure
5.3.2 Beliefs over Model Parameters
5.3.3 Bayes-Adaptive Markov Decision Processes
5.3.4 Solution Methods
5.4 Model-Free Methods
5.4.1 Incremental Estimation
5.4.2 Q-Learning
5.4.3 Sarsa
5.4.4 Eligibility Traces
5.5 Generalization
5.5.1 Local Approximation
5.5.2 Global Approximation
5.5.3 Abstraction Methods
5.6 Summary
5.7 Further Reading
References
6 State Uncertainty
6.1 Formulation
6.1.1 Example Problem
6.1.2 Partially Observable Markov Decision Processes
6.1.3 Policy Execution
6.1.4 Belief-State Markov Decision Processes
6.2 Belief Updating
6.2.1 Discrete State Filter
6.2.2 Linear-Gaussian Filter
6.2.3 Particle Filter
6.3 Exact Solution Methods
6.3.1 Alpha Vectors
6.3.2 Conditional Plans
6.3.3 Value Iteration
6.4 Offline Methods
6.4.1 Fully Observable Value Approximation
6.4.2 Fast Informed Bound
6.4.3 Point-Based Value Iteration
6.4.4 Randomized Point-Based Value Iteration
6.4.5 Point Selection
6.4.6 Linear Policies
6.5 Online Methods
6.5.1 Lookahead with Approximate Value Function
6.5.2 Forward Search
6.5.3 Branch and Bound
6.5.4 Monte Carlo Tree Search
6.6 Summary
6.7 Further Reading
References
7 Cooperative Decision Making
7.1 Formulation
7.1.1 Decentralized POMDPs
7.1.2 Example Problem
7.1.3 Solution Representations
7.2 Properties
7.2.1 Differences with POMDPs
7.2.2 Dec-POMDP Complexity
7.2.3 Generalized Belief States
7.3 Notable Subclasses
7.3.1 Dec-MDPs
7.3.2 ND-POMDPs
7.3.3 MMDPs
7.4 Exact Solution Methods
7.4.1 Dynamic Programming
7.4.2 Heuristic Search
7.4.3 Policy Iteration
7.5 Approximate Solution Methods
7.5.1 Memory-Bounded Dynamic Programming
7.5.2 Joint Equilibrium Search
7.6 Communication
7.7 Summary
7.8 Further Reading
References
8 Probabilistic Surveillance Video Search
8.1 Attribute-Based Person Search
8.1.1 Applications
8.1.2 Person Detection
8.1.3 Retrieval and Scoring
8.2 Probabilistic Appearance Model
8.2.1 Observed States
8.2.2 Basic Model Structure
8.2.3 Model Extensions
8.3 Learning and Inference Techniques
8.3.1 Parameter Learning
8.3.2 Hidden State Inference
8.3.3 Scoring Algorithm
8.4 Performance
8.4.1 Search Accuracy
8.4.2 Search Timing
8.5 Interactive Search Tool
8.6 Summary
References
9 Dynamic Models for Speech Applications
9.1 Modeling Speech Signals
9.1.1 Feature Extraction
9.1.2 Hidden Markov Models
9.1.3 Gaussian Mixture Models
9.1.4 Expectation-Maximization Algorithm
9.2 Speech Recognition
9.3 Topic Identification
9.4 Language Recognition
9.5 Speaker Identification
9.5.1 Forensic Speaker Recognition
9.6 Machine Translation
9.7 Summary
References
10 Optimized Airborne Collision Avoidance
10.1 Airborne Collision Avoidance Systems
10.1.1 Traffic Alert and Collision Avoidance System
10.1.2 Limitations of Existing System
10.1.3 Unmanned Aircraft Sense and Avoid
10.1.4 Airborne Collision Avoidance System X
10.2 Collision Avoidance Problem Formulation
10.2.1 Resolution Advisories
10.2.2 Dynamic Model
10.2.3 Reward Function
10.2.4 Dynamic Programming
10.3 State Estimation
10.3.1 Sensor Error
10.3.2 Pilot Response
10.3.3 Time to Potential Collision
10.4 Real-Time Execution
10.4.1 Online Costs
10.4.2 Multiple Threats
10.4.3 Traffic Alerts
10.5 Evaluation
10.5.1 Safety Analysis
10.5.2 Operational Suitability and Acceptability
10.5.3 Parameter Tuning
10.5.4 Flight Test
10.6 Summary
References
11 Multiagent Planning for Persistent Surveillance
11.1 Mission Description
11.2 Centralized Problem Formulation
11.2.1 State Space
11.2.2 Action Space
11.2.3 State Transition Model
11.2.4 Reward Function
11.3 Decentralized Approximate Formulations
11.3.1 Factored Decomposition
11.3.2 Group Aggregate Decomposition
11.3.3 Planning
11.4 Model Learning
11.5 Flight Test
11.6 Summary
References
12 Integrating Automation with Humans
12.1 Human Capabilities and Coping
12.1.1 Perceptual and Cognitive Capabilities
12.1.2 Naturalistic Decision Making
12.2 Considering the Human in Design
12.2.1 Trust and Value of Decision Logic Transparency
12.2.2 Designing for Different Levels of Certainty
12.2.3 Supporting Decisions over Long Timescales
12.3 A Systems View of Implementation
12.3.1 Interface, Training, and Procedures
12.3.2 Measuring Decision Support Effectiveness
12.3.3 Organizational Influences on System Effectiveness
12.4 Summary
References
Index
MIT Lincoln Laboratory Series

Series authors include William P. Delaney; Alan J. Fenn and Peter T. Hurst; Mykel J. Kochenderfer; and Chaw-Bing Chang and Keh-Ping Dunn.

MIT Lincoln Laboratory is a federally funded research and development center that applies advanced technology to problems of national security. The books in the MIT Lincoln Laboratory Series cover a broad range of technology areas in which Lincoln Laboratory has made leading contributions. These books and future volumes in the series renew the knowledge-sharing tradition established by the seminal MIT Radiation Laboratory Series published between 1947 and 1953.
To my family.