Multi-agent System.pdf

发布时间：2022-06-09 发布人：admin 分类：说明书资料大小：4.72M 资料格式：pdf 举报版权申诉

syx220-3620566-16359647553680758556.pdf-第1页.png

第1页 / 共532页

syx220-3620566-16359647553680758556.pdf-第2页.png

第2页 / 共532页

syx220-3620566-16359647553680758556.pdf-第3页.png

第3页 / 共532页

syx220-3620566-16359647553680758556.pdf-第4页.png

第4页 / 共532页

syx220-3620566-16359647553680758556.pdf-第5页.png

第5页 / 共532页

syx220-3620566-16359647553680758556.pdf-第6页.png

第6页 / 共532页

syx220-3620566-16359647553680758556.pdf-第7页.png

第7页 / 共532页

syx220-3620566-16359647553680758556.pdf-第8页.png

第8页 / 共532页

Credits and Acknowledgments

Introduction

Distributed Constraint Satisfaction

Defining distributed constraint satisfaction problems

Domain-pruning algorithms

Heuristic search algorithms

The asynchronous backtracking algorithm

A simple example

An extended example: the four queens problem

Beyond the ABT algorithm

History and references

Distributed Optimization

Distributed dynamic programming for path planning

Asynchronous dynamic programming

Learning real-time A*

Action selection in multiagent MDPs

Negotiation, auctions and optimization

From contract nets to auction-like optimization

The assignment problem and linear programming

The scheduling problem and integer programming

Social laws and conventions

History and references

Introduction to Noncooperative Game Theory: Games in Normal Form

Self-interested agents

Example: friends and enemies

Preferences and utility

Games in normal form

Example: the TCP user's game

Definition of games in normal form

More examples of normal-form games

Strategies in normal-form games

Analyzing games: from optimality to equilibrium

Pareto optimality

Defining best response and Nash equilibrium

Finding Nash equilibria

Nash's theorem: proving the existence of Nash equilibria

Further solution concepts for normal-form games

Maxmin and minmax strategies

Minimax regret

Removal of dominated strategies

Rationalizability

Correlated equilibrium

Trembling-hand perfect equilibrium

-Nash equilibrium

History and references

Computing Solution Concepts of Normal-Form Games

Computing Nash equilibria of two-player, zero-sum games

Computing Nash equilibria of two-player, general-sum games

Complexity of computing a sample Nash equilibrium

An LCP formulation and the Lemke–Howson algorithm

Searching the space of supports

Beyond sample equilibrium computation

Computing Nash equilibria of n-player, general-sum games

Computing maxmin and minmax strategies for two-player, general-sum games

Identifying dominated strategies

Domination by a pure strategy

Domination by a mixed strategy

Iterated dominance

Computing correlated equilibria

History and references

Games with Sequential Actions: Reasoning and Computing with the Extensive Form

Perfect-information extensive-form games

Definition

Strategies and equilibria

Subgame-perfect equilibrium

Computing equilibria: backward induction

Imperfect-information extensive-form games

Definition

Strategies and equilibria

Computing equilibria: the sequence form

Sequential equilibrium

History and references

Richer Representations: Beyond the Normal and Extensive Forms

Repeated games

Finitely repeated games

Infinitely repeated games

``Bounded rationality": repeated games played by automata

Stochastic games

Definition

Strategies and equilibria

Computing equilibria

Bayesian games

Definition

Strategies and equilibria

Computing equilibria

Ex post equilibrium

Congestion games

Definition

Computing equilibria

Potential games

Nonatomic congestion games

Selfish routing and the price of anarchy

Computationally motivated compact representations

The expected utility problem

Graphical games

Action-graph games

Multiagent influence diagrams

GALA

History and references

Learning and Teaching

Why the subject of ``learning'' is complex

The interaction between learning and teaching

What constitutes learning?

If learning is the answer, what is the question?

Fictitious play

Rational learning

Reinforcement learning

Learning in unknown MDPs

Reinforcement learning in zero-sum stochastic games

Beyond zero-sum stochastic games

Belief-based reinforcement learning

No-regret learning and universal consistency

Targeted learning

Evolutionary learning and other large-population models

The replicator dynamic

Evolutionarily stable strategies

Agent-based simulation and emergent conventions

History and references

Communication

``Doing by talking'' I: cheap talk

``Talking by doing'': signaling games

``Doing by talking'' II: speech-act theory

Speech acts

Rules of conversation

A game-theoretic view of speech acts

Applications

History and references

Aggregating Preferences: Social Choice

Introduction

Example: plurality voting

A formal model

Voting

Voting methods

Voting paradoxes

Existence of social functions

Social welfare functions

Social choice functions

Ranking systems

History and references

Protocols for Strategic Agents: Mechanism Design

Introduction

Example: strategic voting

Example: buying a shortest path

Mechanism design with unrestricted preferences

Implementation

The revelation principle

Impossibility of general, dominant-strategy implementation

Quasilinear preferences

Risk attitudes

Mechanism design in the quasilinear setting

Efficient mechanisms

Groves mechanisms

The VCG mechanism

VCG and individual rationality

VCG and weak budget balance

Drawbacks of VCG

Budget balance and efficiency

The AGV mechanism

Beyond efficiency

What else can be implemented in dominant strategies?

Tractable Groves mechanisms

Computational applications of mechanism design

Task scheduling

Bandwidth allocation in computer networks

Multicast cost sharing

Two-sided matching

Constrained mechanism design

Contracts

Bribes

Mediators

History and references

Protocols for Multiagent Resource Allocation: Auctions

Single-good auctions

Canonical auction families

Auctions as Bayesian mechanisms

Second-price, Japanese, and English auctions

First-price and Dutch auctions

Revenue equivalence

Risk attitudes

Auction variations

``Optimal'' (revenue-maximizing) auctions

Collusion

Interdependent values

Multiunit auctions

Canonical auction families

Single-unit demand

Beyond single-unit demand

Unlimited supply: random sampling auctions

Position auctions

Combinatorial auctions

Simple combinatorial auction mechanisms

The winner determination problem

Expressing a bid: bidding languages

Iterative mechanisms

A tractable mechanism

Exchanges

Two-sided auctions

Prediction markets

History and references

Teams of Selfish Agents: An Introduction to Coalitional Game Theory

Coalitional games with transferable utility

Definition

Examples

Classes of coalitional games

Analyzing coalitional games

The Shapley value

The core

Refining the core: -core, least core, and nucleolus

Compact representations of coalitional games

Weighted majority games and weighted voting games

Weighted graph games

Capturing synergies: a representation for superadditive games

A decomposition approach: multi-issue representation

A logical approach: marginal contribution nets

Further directions

Alternative coalitional game models

Advanced solution concepts

History and references

Logics of Knowledge and Belief

The partition model of knowledge

Muddy children and warring generals

Formalizing intuitions about the partition model

A detour to modal logic

Syntax

Semantics

Axiomatics

Modal logics with multiple modal operators

Remarks about first-order modal logic

S5: An axiomatic theory of the partition model

Common knowledge, and an application to distributed systems

Doing time, and an application to robotics

Termination conditions for motion planning

Coordinating robots

From knowledge to belief

Combining knowledge and belief (and revisiting knowledge)

History and references

Beyond Belief: Probability, Dynamics and Intention

Knowledge and probability

Dynamics of knowledge and belief

Belief revision

Beyond AGM: update, arbitration, fusion, and friends

Theories of belief change: a summary

Logic, games, and coalition logic

Towards a logic of ``intention''

Some preformal intuitions

The road to hell: elements of a formal theory of intention

Group intentions

History and references

Appendices: Technical Background

Probability Theory

Probabilistic models

Axioms of probability theory

Marginal probabilities

Conditional probabilities

Linear and Integer Programming

Linear programs

Integer programs

Markov Decision Problems (MDPs)

The model

Solving known MDPs via value iteration

Classical Logic

Propositional calculus

First-order logic

Bibliography

Index

MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations Yoav Shoham Stanford University Kevin Leyton-Brown University of British Columbia Revision 1.1 Multiagent Systems is copyright © Shoham and Leyton-Brown, 2009, 2010. This version is formatted differently than the book—and in particular has different page numbering—and has not been fully copy edited. Please treat the printed book as the deﬁnitive version. You are invited to use this electronic copy without restriction for on-screen viewing, but are requested to print it only under one of the following circumstances: • You live in a place that does not offer you access to the physical book; • The cost of the book is prohibitive for you; • You need only one or two chapters. Finally, we ask you not to link directly to the PDF or to distribute it electronically. Instead, we invite you to link to http://www.masfoundations.org. This will allow us to gauge the level of interest in the book and to update the PDF to keep it consistent with reprintings of the book.

To my wife Noa and my daughters Maia, Talia and Ella —YS To Jude —KLB

Contents Credits and Acknowledgments xi Introduction xiii 1 Distributed Constraint Satisfaction 1 4 8 Deﬁning distributed constraint satisfaction problems Domain-pruning algorithms Heuristic search algorithms 1.3.1 1.3.2 1.3.3 1.3.4 History and references The asynchronous backtracking algorithm A simple example An extended example: the four queens problem Beyond the ABT algorithm 12 17 18 2 10 2 Distributed Optimization 19 13 19 1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4 2.5 3.1 3.2 19 20 Asynchronous dynamic programming Learning real-time A∗ Distributed dynamic programming for path planning 2.1.1 2.1.2 Action selection in multiagent MDPs Negotiation, auctions and optimization 2.3.1 2.3.2 2.3.3 Social laws and conventions History and references From contract nets to auction-like optimization The assignment problem and linear programming The scheduling problem and integer programming 22 28 46 44 28 30 36 3 Introduction to Noncooperative Game Theory: Games in Normal Form 47 Self-interested agents 3.1.1 3.1.2 Games in normal form 3.2.1 Example: friends and enemies 49 Preferences and utility Example: the TCP user’s game 47 54 48 54

iv Contents 61 Pareto optimality Deﬁning best response and Nash equilibrium Finding Nash equilibria Nash’s theorem: proving the existence of Nash equilibria 63 62 65 Deﬁnition of games in normal form More examples of normal-form games Strategies in normal-form games 3.2.2 3.2.3 3.2.4 59 Analyzing games: from optimality to equilibrium 3.3.1 3.3.2 3.3.3 3.3.4 Further solution concepts for normal-form games 3.4.1 73 3.4.2 3.4.3 3.4.4 3.4.5 3.4.6 3.4.7 History and references Maxmin and minmax strategies Minimax regret Removal of dominated strategies Rationalizability 81 Correlated equilibrium Trembling-hand perfect equilibrium ǫ-Nash equilibrium 87 76 83 85 78 55 56 60 73 85 3.3 3.4 3.5 4.1 4.2 4.3 4.4 4.5 4.6 4.7 5.2 4 Computing Solution Concepts of Normal-Form Games 89 91 91 89 Computing Nash equilibria of two-player, zero-sum games Computing Nash equilibria of two-player, general-sum games Complexity of computing a sample Nash equilibrium 4.2.1 An LCP formulation and the Lemke–Howson algorithm 4.2.2 Searching the space of supports 4.2.3 101 Beyond sample equilibrium computation 4.2.4 104 Computing Nash equilibria of n-player, general-sum games Computing maxmin and minmax strategies for two-player, general-sum games Identifying dominated strategies 4.5.1 4.5.2 4.5.3 Computing correlated equilibria History and references 115 108 Domination by a pure strategy Domination by a mixed strategy Iterated dominance 105 109 110 112 113 93 108 5 Games with Sequential Actions: Reasoning and Computing with the Extensive Form 117 5.1 117 118 Perfect-information extensive-form games 5.1.1 5.1.2 5.1.3 5.1.4 Imperfect-information extensive-form games 5.2.1 Deﬁnition Strategies and equilibria Subgame-perfect equilibrium Computing equilibria: backward induction Deﬁnition 119 121 130 130 124 Uncorrected manuscript of Multiagent Systems, published by Cambridge University Press Revision 1.1 © Shoham & Leyton-Brown, 2009, 2010.

Contents 5.3 5.2.2 5.2.3 5.2.4 History and references Strategies and equilibria Computing equilibria: the sequence form Sequential equilibrium 145 131 142 v 134 6 Richer Representations: Beyond the Normal and Extensive Forms 147 Finitely repeated games Inﬁnitely repeated games “Bounded rationality": repeated games played by automata 149 150 153 7 Learning and Teaching 199 7.1 Why the subject of “learning” is complex 199 The interaction between learning and teaching If learning is the answer, what is the question? 201 7.1.1 7.1.2 What constitutes learning? 7.1.3 Fictitious play Rational learning Reinforcement learning 7.4.1 7.4.2 7.4.3 206 211 215 Learning in unknown MDPs Reinforcement learning in zero-sum stochastic games Beyond zero-sum stochastic games 215 219 216 Free for on-screen use; please do not distribute. You can get another free copy of this PDF or order the book at http://www.masfoundations.org. 164 167 170 160 162 160 159 163 148 Deﬁnition Strategies and equilibria Computing equilibria Deﬁnition Strategies and equilibria Computing equilibria Ex post equilibrium Repeated games 6.1.1 6.1.2 6.1.3 Stochastic games 6.2.1 6.2.2 6.2.3 Bayesian games 6.3.1 6.3.2 6.3.3 6.3.4 Congestion games 6.4.1 Deﬁnition Computing equilibria 6.4.2 Potential games 6.4.3 Nonatomic congestion games 6.4.4 178 Selﬁsh routing and the price of anarchy 6.4.5 Computationally motivated compact representations 6.5.1 6.5.2 6.5.3 6.5.4 6.5.5 History and references The expected utility problem Graphical games Action-graph games Multiagent inﬂuence diagrams GALA 174 174 173 195 188 190 175 176 185 192 196 180 185 199 202 6.1 6.2 6.3 6.4 6.5 6.6 7.2 7.3 7.4

vi 7.5 7.6 7.7 7.8 222 220 220 7.4.4 Belief-based reinforcement learning No-regret learning and universal consistency Targeted learning Evolutionary learning and other large-population models 7.7.1 7.7.2 7.7.3 History and references The replicator dynamic 224 Evolutionarily stable strategies Agent-based simulation and emergent conventions 228 233 Contents 224 230 8 Communication 235 239 235 “Doing by talking” I: cheap talk “Talking by doing”: signaling games “Doing by talking” II: speech-act theory 8.3.1 Speech acts 242 8.3.2 Rules of conversation 8.3.3 A game-theoretic view of speech acts 8.3.4 Applications History and references 248 251 243 241 245 9 Aggregating Preferences: Social Choice 8.1 8.2 8.3 8.4 9.1 9.2 9.3 9.4 9.5 9.6 Example: plurality voting 254 253 Introduction 9.1.1 A formal model Voting 256 9.3.1 9.3.2 Existence of social functions 9.4.1 9.4.2 Ranking systems History and references Voting methods Voting paradoxes 267 Social welfare functions Social choice functions 256 258 260 271 253 253 260 263 10 Protocols for Strategic Agents: Mechanism Design 273 10.1 Introduction 273 10.1.1 10.1.2 Example: strategic voting Example: buying a shortest path 273 274 10.2 Mechanism design with unrestricted preferences 275 10.2.1 10.2.2 10.2.3 Implementation 276 The revelation principle Impossibility of general, dominant-strategy implementation 277 280 10.3 Quasilinear preferences 10.3.1 Risk attitudes 10.3.2 Mechanism design in the quasilinear setting 280 281 284 10.4 Efﬁcient mechanisms 288 Uncorrected manuscript of Multiagent Systems, published by Cambridge University Press Revision 1.1 © Shoham & Leyton-Brown, 2009, 2010.

vii 303 Contents 288 The VCG mechanism 10.4.1 Groves mechanisms 10.4.2 292 10.4.3 VCG and individual rationality 10.4.4 VCG and weak budget balance 10.4.5 Drawbacks of VCG 10.4.6 Budget balance and efﬁciency 10.4.7 302 The AGV mechanism 297 295 296 301 10.5 Beyond efﬁciency 303 10.5.1 What else can be implemented in dominant strategies? 10.5.2 Tractable Groves mechanisms 305 10.6 Computational applications of mechanism design Task scheduling 10.6.1 10.6.2 Bandwidth allocation in computer networks 10.6.3 Multicast cost sharing 10.6.4 307 312 Two-sided matching 10.7 Constrained mechanism design 316 321 307 309 10.7.1 Contracts 10.7.2 Bribes 10.7.3 Mediators 322 323 324 10.8 History and references 326 11 Protocols for Multiagent Resource Allocation: Auctions 329 11.1 Single-good auctions 329 332 Second-price, Japanese, and English auctions First-price and Dutch auctions 337 11.1.1 Canonical auction families 330 11.1.2 Auctions as Bayesian mechanisms 11.1.3 11.1.4 11.1.5 Revenue equivalence 11.1.6 Risk attitudes 340 11.1.7 Auction variations 11.1.8 11.1.9 Collusion 11.1.10 Interdependent values “Optimal” (revenue-maximizing) auctions 335 341 345 348 11.2 Multiunit auctions 351 Single-unit demand 11.2.1 Canonical auction families 11.2.2 352 11.2.3 Beyond single-unit demand 11.2.4 Unlimited supply: random sampling auctions 11.2.5 Position auctions 351 355 359 11.3 Combinatorial auctions 361 11.3.1 11.3.2 11.3.3 11.3.4 Simple combinatorial auction mechanisms The winner determination problem 364 Expressing a bid: bidding languages Iterative mechanisms 373 368 333 343 357 363 Free for on-screen use; please do not distribute. You can get another free copy of this PDF or order the book at http://www.masfoundations.org.

分享到：

赞收藏

资料库

Multi-agent System.pdf

相关推荐

课程资源

热门标签

最新资料