logo资料库

Multi-agent System.pdf

第1页 / 共532页
第2页 / 共532页
第3页 / 共532页
第4页 / 共532页
第5页 / 共532页
第6页 / 共532页
第7页 / 共532页
第8页 / 共532页
资料共532页,剩余部分请下载后查看
Credits and Acknowledgments
Introduction
Distributed Constraint Satisfaction
Defining distributed constraint satisfaction problems
Domain-pruning algorithms
Heuristic search algorithms
The asynchronous backtracking algorithm
A simple example
An extended example: the four queens problem
Beyond the ABT algorithm
History and references
Distributed Optimization
Distributed dynamic programming for path planning
Asynchronous dynamic programming
Learning real-time A*
Action selection in multiagent MDPs
Negotiation, auctions and optimization
From contract nets to auction-like optimization
The assignment problem and linear programming
The scheduling problem and integer programming
Social laws and conventions
History and references
Introduction to Noncooperative Game Theory: Games in Normal Form
Self-interested agents
Example: friends and enemies
Preferences and utility
Games in normal form
Example: the TCP user's game
Definition of games in normal form
More examples of normal-form games
Strategies in normal-form games
Analyzing games: from optimality to equilibrium
Pareto optimality
Defining best response and Nash equilibrium
Finding Nash equilibria
Nash's theorem: proving the existence of Nash equilibria
Further solution concepts for normal-form games
Maxmin and minmax strategies
Minimax regret
Removal of dominated strategies
Rationalizability
Correlated equilibrium
Trembling-hand perfect equilibrium
-Nash equilibrium
History and references
Computing Solution Concepts of Normal-Form Games
Computing Nash equilibria of two-player, zero-sum games
Computing Nash equilibria of two-player, general-sum games
Complexity of computing a sample Nash equilibrium
An LCP formulation and the Lemke–Howson algorithm
Searching the space of supports
Beyond sample equilibrium computation
Computing Nash equilibria of n-player, general-sum games
Computing maxmin and minmax strategies for two-player, general-sum games
Identifying dominated strategies
Domination by a pure strategy
Domination by a mixed strategy
Iterated dominance
Computing correlated equilibria
History and references
Games with Sequential Actions: Reasoning and Computing with the Extensive Form
Perfect-information extensive-form games
Definition
Strategies and equilibria
Subgame-perfect equilibrium
Computing equilibria: backward induction
Imperfect-information extensive-form games
Definition
Strategies and equilibria
Computing equilibria: the sequence form
Sequential equilibrium
History and references
Richer Representations: Beyond the Normal and Extensive Forms
Repeated games
Finitely repeated games
Infinitely repeated games
``Bounded rationality": repeated games played by automata
Stochastic games
Definition
Strategies and equilibria
Computing equilibria
Bayesian games
Definition
Strategies and equilibria
Computing equilibria
Ex post equilibrium
Congestion games
Definition
Computing equilibria
Potential games
Nonatomic congestion games
Selfish routing and the price of anarchy
Computationally motivated compact representations
The expected utility problem
Graphical games
Action-graph games
Multiagent influence diagrams
GALA
History and references
Learning and Teaching
Why the subject of ``learning'' is complex
The interaction between learning and teaching
What constitutes learning?
If learning is the answer, what is the question?
Fictitious play
Rational learning
Reinforcement learning
Learning in unknown MDPs
Reinforcement learning in zero-sum stochastic games
Beyond zero-sum stochastic games
Belief-based reinforcement learning
No-regret learning and universal consistency
Targeted learning
Evolutionary learning and other large-population models
The replicator dynamic
Evolutionarily stable strategies
Agent-based simulation and emergent conventions
History and references
Communication
``Doing by talking'' I: cheap talk
``Talking by doing'': signaling games
``Doing by talking'' II: speech-act theory
Speech acts
Rules of conversation
A game-theoretic view of speech acts
Applications
History and references
Aggregating Preferences: Social Choice
Introduction
Example: plurality voting
A formal model
Voting
Voting methods
Voting paradoxes
Existence of social functions
Social welfare functions
Social choice functions
Ranking systems
History and references
Protocols for Strategic Agents: Mechanism Design
Introduction
Example: strategic voting
Example: buying a shortest path
Mechanism design with unrestricted preferences
Implementation
The revelation principle
Impossibility of general, dominant-strategy implementation
Quasilinear preferences
Risk attitudes
Mechanism design in the quasilinear setting
Efficient mechanisms
Groves mechanisms
The VCG mechanism
VCG and individual rationality
VCG and weak budget balance
Drawbacks of VCG
Budget balance and efficiency
The AGV mechanism
Beyond efficiency
What else can be implemented in dominant strategies?
Tractable Groves mechanisms
Computational applications of mechanism design
Task scheduling
Bandwidth allocation in computer networks
Multicast cost sharing
Two-sided matching
Constrained mechanism design
Contracts
Bribes
Mediators
History and references
Protocols for Multiagent Resource Allocation: Auctions
Single-good auctions
Canonical auction families
Auctions as Bayesian mechanisms
Second-price, Japanese, and English auctions
First-price and Dutch auctions
Revenue equivalence
Risk attitudes
Auction variations
``Optimal'' (revenue-maximizing) auctions
Collusion
Interdependent values
Multiunit auctions
Canonical auction families
Single-unit demand
Beyond single-unit demand
Unlimited supply: random sampling auctions
Position auctions
Combinatorial auctions
Simple combinatorial auction mechanisms
The winner determination problem
Expressing a bid: bidding languages
Iterative mechanisms
A tractable mechanism
Exchanges
Two-sided auctions
Prediction markets
History and references
Teams of Selfish Agents: An Introduction to Coalitional Game Theory
Coalitional games with transferable utility
Definition
Examples
Classes of coalitional games
Analyzing coalitional games
The Shapley value
The core
Refining the core: -core, least core, and nucleolus
Compact representations of coalitional games
Weighted majority games and weighted voting games
Weighted graph games
Capturing synergies: a representation for superadditive games
A decomposition approach: multi-issue representation
A logical approach: marginal contribution nets
Further directions
Alternative coalitional game models
Advanced solution concepts
History and references
Logics of Knowledge and Belief
The partition model of knowledge
Muddy children and warring generals
Formalizing intuitions about the partition model
A detour to modal logic
Syntax
Semantics
Axiomatics
Modal logics with multiple modal operators
Remarks about first-order modal logic
S5: An axiomatic theory of the partition model
Common knowledge, and an application to distributed systems
Doing time, and an application to robotics
Termination conditions for motion planning
Coordinating robots
From knowledge to belief
Combining knowledge and belief (and revisiting knowledge)
History and references
Beyond Belief: Probability, Dynamics and Intention
Knowledge and probability
Dynamics of knowledge and belief
Belief revision
Beyond AGM: update, arbitration, fusion, and friends
Theories of belief change: a summary
Logic, games, and coalition logic
Towards a logic of ``intention''
Some preformal intuitions
The road to hell: elements of a formal theory of intention
Group intentions
History and references
Appendices: Technical Background
Probability Theory
Probabilistic models
Axioms of probability theory
Marginal probabilities
Conditional probabilities
Linear and Integer Programming
Linear programs
Integer programs
Markov Decision Problems (MDPs)
The model
Solving known MDPs via value iteration
Classical Logic
Propositional calculus
First-order logic
Bibliography
Index
MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations Yoav Shoham Stanford University Kevin Leyton-Brown University of British Columbia Revision 1.1 Multiagent Systems is copyright © Shoham and Leyton-Brown, 2009, 2010. This version is formatted differently than the book—and in particular has different page numbering—and has not been fully copy edited. Please treat the printed book as the definitive version. You are invited to use this electronic copy without restriction for on-screen viewing, but are requested to print it only under one of the following circumstances: • You live in a place that does not offer you access to the physical book; • The cost of the book is prohibitive for you; • You need only one or two chapters. Finally, we ask you not to link directly to the PDF or to distribute it electronically. Instead, we invite you to link to http://www.masfoundations.org. This will allow us to gauge the level of interest in the book and to update the PDF to keep it consistent with reprintings of the book.
i
To my wife Noa and my daughters Maia, Talia and Ella —YS To Jude —KLB
Contents Credits and Acknowledgments xi Introduction xiii 1 Distributed Constraint Satisfaction 1 4 8 Defining distributed constraint satisfaction problems Domain-pruning algorithms Heuristic search algorithms 1.3.1 1.3.2 1.3.3 1.3.4 History and references The asynchronous backtracking algorithm A simple example An extended example: the four queens problem Beyond the ABT algorithm 12 17 18 2 10 2 Distributed Optimization 19 13 19 1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4 2.5 3.1 3.2 19 20 Asynchronous dynamic programming Learning real-time A∗ Distributed dynamic programming for path planning 2.1.1 2.1.2 Action selection in multiagent MDPs Negotiation, auctions and optimization 2.3.1 2.3.2 2.3.3 Social laws and conventions History and references From contract nets to auction-like optimization The assignment problem and linear programming The scheduling problem and integer programming 22 28 46 44 28 30 36 3 Introduction to Noncooperative Game Theory: Games in Normal Form 47 Self-interested agents 3.1.1 3.1.2 Games in normal form 3.2.1 Example: friends and enemies 49 Preferences and utility Example: the TCP user’s game 47 54 48 54
iv Contents 61 Pareto optimality Defining best response and Nash equilibrium Finding Nash equilibria Nash’s theorem: proving the existence of Nash equilibria 63 62 65 Definition of games in normal form More examples of normal-form games Strategies in normal-form games 3.2.2 3.2.3 3.2.4 59 Analyzing games: from optimality to equilibrium 3.3.1 3.3.2 3.3.3 3.3.4 Further solution concepts for normal-form games 3.4.1 73 3.4.2 3.4.3 3.4.4 3.4.5 3.4.6 3.4.7 History and references Maxmin and minmax strategies Minimax regret Removal of dominated strategies Rationalizability 81 Correlated equilibrium Trembling-hand perfect equilibrium ǫ-Nash equilibrium 87 76 83 85 78 55 56 60 73 85 3.3 3.4 3.5 4.1 4.2 4.3 4.4 4.5 4.6 4.7 5.2 4 Computing Solution Concepts of Normal-Form Games 89 91 91 89 Computing Nash equilibria of two-player, zero-sum games Computing Nash equilibria of two-player, general-sum games Complexity of computing a sample Nash equilibrium 4.2.1 An LCP formulation and the Lemke–Howson algorithm 4.2.2 Searching the space of supports 4.2.3 101 Beyond sample equilibrium computation 4.2.4 104 Computing Nash equilibria of n-player, general-sum games Computing maxmin and minmax strategies for two-player, general-sum games Identifying dominated strategies 4.5.1 4.5.2 4.5.3 Computing correlated equilibria History and references 115 108 Domination by a pure strategy Domination by a mixed strategy Iterated dominance 105 109 110 112 113 93 108 5 Games with Sequential Actions: Reasoning and Computing with the Extensive Form 117 5.1 117 118 Perfect-information extensive-form games 5.1.1 5.1.2 5.1.3 5.1.4 Imperfect-information extensive-form games 5.2.1 Definition Strategies and equilibria Subgame-perfect equilibrium Computing equilibria: backward induction Definition 119 121 130 130 124 Uncorrected manuscript of Multiagent Systems, published by Cambridge University Press Revision 1.1 © Shoham & Leyton-Brown, 2009, 2010.
Contents 5.3 5.2.2 5.2.3 5.2.4 History and references Strategies and equilibria Computing equilibria: the sequence form Sequential equilibrium 145 131 142 v 134 6 Richer Representations: Beyond the Normal and Extensive Forms 147 Finitely repeated games Infinitely repeated games “Bounded rationality": repeated games played by automata 149 150 153 7 Learning and Teaching 199 7.1 Why the subject of “learning” is complex 199 The interaction between learning and teaching If learning is the answer, what is the question? 201 7.1.1 7.1.2 What constitutes learning? 7.1.3 Fictitious play Rational learning Reinforcement learning 7.4.1 7.4.2 7.4.3 206 211 215 Learning in unknown MDPs Reinforcement learning in zero-sum stochastic games Beyond zero-sum stochastic games 215 219 216 Free for on-screen use; please do not distribute. You can get another free copy of this PDF or order the book at http://www.masfoundations.org. 164 167 170 160 162 160 159 163 148 Definition Strategies and equilibria Computing equilibria Definition Strategies and equilibria Computing equilibria Ex post equilibrium Repeated games 6.1.1 6.1.2 6.1.3 Stochastic games 6.2.1 6.2.2 6.2.3 Bayesian games 6.3.1 6.3.2 6.3.3 6.3.4 Congestion games 6.4.1 Definition Computing equilibria 6.4.2 Potential games 6.4.3 Nonatomic congestion games 6.4.4 178 Selfish routing and the price of anarchy 6.4.5 Computationally motivated compact representations 6.5.1 6.5.2 6.5.3 6.5.4 6.5.5 History and references The expected utility problem Graphical games Action-graph games Multiagent influence diagrams GALA 174 174 173 195 188 190 175 176 185 192 196 180 185 199 202 6.1 6.2 6.3 6.4 6.5 6.6 7.2 7.3 7.4
vi 7.5 7.6 7.7 7.8 222 220 220 7.4.4 Belief-based reinforcement learning No-regret learning and universal consistency Targeted learning Evolutionary learning and other large-population models 7.7.1 7.7.2 7.7.3 History and references The replicator dynamic 224 Evolutionarily stable strategies Agent-based simulation and emergent conventions 228 233 Contents 224 230 8 Communication 235 239 235 “Doing by talking” I: cheap talk “Talking by doing”: signaling games “Doing by talking” II: speech-act theory 8.3.1 Speech acts 242 8.3.2 Rules of conversation 8.3.3 A game-theoretic view of speech acts 8.3.4 Applications History and references 248 251 243 241 245 9 Aggregating Preferences: Social Choice 8.1 8.2 8.3 8.4 9.1 9.2 9.3 9.4 9.5 9.6 Example: plurality voting 254 253 Introduction 9.1.1 A formal model Voting 256 9.3.1 9.3.2 Existence of social functions 9.4.1 9.4.2 Ranking systems History and references Voting methods Voting paradoxes 267 Social welfare functions Social choice functions 256 258 260 271 253 253 260 263 10 Protocols for Strategic Agents: Mechanism Design 273 10.1 Introduction 273 10.1.1 10.1.2 Example: strategic voting Example: buying a shortest path 273 274 10.2 Mechanism design with unrestricted preferences 275 10.2.1 10.2.2 10.2.3 Implementation 276 The revelation principle Impossibility of general, dominant-strategy implementation 277 280 10.3 Quasilinear preferences 10.3.1 Risk attitudes 10.3.2 Mechanism design in the quasilinear setting 280 281 284 10.4 Efficient mechanisms 288 Uncorrected manuscript of Multiagent Systems, published by Cambridge University Press Revision 1.1 © Shoham & Leyton-Brown, 2009, 2010.
vii 303 Contents 288 The VCG mechanism 10.4.1 Groves mechanisms 10.4.2 292 10.4.3 VCG and individual rationality 10.4.4 VCG and weak budget balance 10.4.5 Drawbacks of VCG 10.4.6 Budget balance and efficiency 10.4.7 302 The AGV mechanism 297 295 296 301 10.5 Beyond efficiency 303 10.5.1 What else can be implemented in dominant strategies? 10.5.2 Tractable Groves mechanisms 305 10.6 Computational applications of mechanism design Task scheduling 10.6.1 10.6.2 Bandwidth allocation in computer networks 10.6.3 Multicast cost sharing 10.6.4 307 312 Two-sided matching 10.7 Constrained mechanism design 316 321 307 309 10.7.1 Contracts 10.7.2 Bribes 10.7.3 Mediators 322 323 324 10.8 History and references 326 11 Protocols for Multiagent Resource Allocation: Auctions 329 11.1 Single-good auctions 329 332 Second-price, Japanese, and English auctions First-price and Dutch auctions 337 11.1.1 Canonical auction families 330 11.1.2 Auctions as Bayesian mechanisms 11.1.3 11.1.4 11.1.5 Revenue equivalence 11.1.6 Risk attitudes 340 11.1.7 Auction variations 11.1.8 11.1.9 Collusion 11.1.10 Interdependent values “Optimal” (revenue-maximizing) auctions 335 341 345 348 11.2 Multiunit auctions 351 Single-unit demand 11.2.1 Canonical auction families 11.2.2 352 11.2.3 Beyond single-unit demand 11.2.4 Unlimited supply: random sampling auctions 11.2.5 Position auctions 351 355 359 11.3 Combinatorial auctions 361 11.3.1 11.3.2 11.3.3 11.3.4 Simple combinatorial auction mechanisms The winner determination problem 364 Expressing a bid: bidding languages Iterative mechanisms 373 368 333 343 357 363 Free for on-screen use; please do not distribute. You can get another free copy of this PDF or order the book at http://www.masfoundations.org.
分享到:
收藏