1. Introduction
1.1. Introduction
This book is about the theory of learning in games. Most of non-cooperative game
theory has focused on equilibrium in games, especially Nash equilibrium and its
refinements such as perfection. This raises the question of when and why we might expect
that observed play in a game will correspond to one of these equilibria. One traditional
explanation of equilibrium is that it results from analysis and introspection by the players
in a situation where the rules of the game, the rationality of the players, and the players’
payoff functions are all common knowledge. Both conceptually and empirically, this
explanation has many problems.1
This book develops the alternative explanation that equilibrium arises as the
long-run outcome of a process in which less than fully rational players grope for optimality over
time. The models we will discuss serve to provide a foundation for equilibrium theory.
This is not to say that learning models provide foundations for all of the equilibrium
concepts in the literature, nor does it argue for the use of Nash equilibrium in every
situation; indeed, in some cases most learning models do not lead to any equilibrium
concept beyond the very weak notion of rationalizability. Nevertheless, learning models
1 First, a major conceptual problem occurs when there are multiple equilibria, for in the absence of an
explanation of how players come to expect the same equilibrium, their play need not correspond to any
equilibrium at all. While it is possible that players coordinate their expectations using a common selection
procedure such as Harsanyi and Selten’s [1988] tracing procedure, left unexplained is how such a procedure
comes to be common knowledge. Second, we doubt that the hypothesis of exact common knowledge of
payoffs and rationality applies to many games, and relaxing this to an assumption of almost common
knowledge yields much weaker conclusions. (See, for example, Dekel and Fudenberg [1990] and Börgers
[1994].) Third, equilibrium theory does a poor job of explaining play in early rounds of most experiments,
although it does much better in later rounds. This shift from non-equilibrium to equilibrium play is difficult
to reconcile with a purely introspective theory.
can suggest useful ways to evaluate and modify the traditional equilibrium concepts.
Learning models lead to refinements of Nash equilibrium: for example, considerations of
the long-run stochastic properties of the learning process suggest that risk dominant
equilibria will be observed in some games. They also lead to descriptions of long-run
behavior weaker than Nash equilibrium: for example, considerations of the inability of
players in extensive form games to observe how opponents would have responded to
events that did not occur suggest that self-confirming equilibria that are not Nash may be
observed as the long-run behavior in some games.
We should acknowledge that the learning processes we analyze need not converge,
and even when they do converge, the time needed for convergence is in some cases quite
long. One branch of the literature uses these facts to argue that it may be difficult to reach
equilibrium, especially in the short run. We downplay this anti-equilibrium argument for
several reasons. First, our impression is that there are some interesting economic situations
in which most of the participants seem to have a pretty good idea of what to expect from
day to day, perhaps because the social arrangements and social norms that we observe
reflect a process of thousands of years of learning from the experiences of past
generations. Second, although there are interesting periods in which social norms change
so suddenly that they break down, as, for example, during the transition from a controlled
economy to a market one, the dynamic learning models that have been developed so far
seem unlikely to provide much insight about the medium-term behavior that will occur in
these circumstances.2 Third, learning theories often have little to say in the short run,
making predictions that are highly dependent on details of the learning process and prior
beliefs; the long-run predictions are generally more robust to the specification of the
2 However, Boylan and El-Gamal [1993], Crawford [1995], Roth and Er’ev [1995], Er’ev and Roth [1996],
Nagel [1993], and Stahl [1994] use theoretical learning models to try to explain data on short-term and
medium-term play in game theory experiments.