Evolutionary stability in the infinitely repeated prisoners' dilemma played by two-state Moore machines.

AuthorLinster, Bruce G.
  1. Introduction

    A large literature which attempts to explain how cooperative outcomes be supported in the repeated Prisoners' Dilemma (RPD) has emerged in recent years. An interesting branch of this literature has analyzed the infinite RPD as a game in which two "metaplayers" choose a strategy which is implemented by a finite automation, or Moore machine. This line of research has been especially interesting because it allows us to capture the notion of "bounded" rationality in the players. Also, the use of these machines as devices to implement strategies permits us to quantify the notion of strategic complexity. We can then apply the idea that complexity is costly to the analysis, and the results have proven very interesting.

    So far, the work in this area has applied the concepts of Nash equilibrium and evolutionary stability in determining what reasonable outcomes are in these games. However, it has frequently been that pointed out these equilibrium concepts may be either too restrictive or not restrictive enough to be useful in analyzing the infinitely repeated Prisoners' Dilemma. That is, if we use the Nash equilibrium as the appropriate equilibrium concept, the set of equilibrium outcomes is infinitely large. On the other hand, if we require the equilibrium to be evolutionarily stable, the set of such equilibrium outcomes is frequently empty. It has been suggested that the requirement for evolutionary stability may be too stringent a criterion.

    The purpose of this essay is to examine what happens in these games in an evolutionary framework under various conditions. The results of the simulations I performed and report here indicate certain combinations of strategies may exist which, although not evolutionary stable in the sense of Maynard Smith [14], prove to be invasion-proof against permissible mutant strategies. This is akin to saying a set of possible strategies may exist which cannot be invaded, but neither the individual strategies nor the mixed strategy represented by the population is evolutionarily stable. We can imagine the mix strategies changing as various mutants attempt to invade the population but the set of strategies remaining the same as the one with which we started. In fact, it may well be the case that this set of strategies will not last forever, but it may last for a very long time. I formalize this idea later in the essay. In order to test the robustness of the hypothesis, a number of different mutation schemes were simulated to examine what sort of outcome we should expect to find in a population similar to the one I explore.

    I begin by reviewing the applicable literature and discussing the philosophy underlying this exercise. Then I describes the set of automata I consider as well as the various evolutionary model I simulated. As always, the results we obtain depend upon the assumptions of the model. The assumptions in this case include the set of permissible strategy implementing machines and the rules for how these strategic evolve over time. This line of research differs from other attempts to capture the results of an evolutionary story in two important ways. First,since the strategies I consider are all those which can be implemented by an automation with not more than two states, there is no predisposition, either intentional or unintentional, toward a given outcome. The claim can be made that the strategies which were entered in Robert Axelrod's now famous Prisoners' Dilemma tournament [3] were in some sense biased toward the TIT-FOR-TAT (TFT) type strategies because Axelrod identified TFT as the most successful strategy in a preliminary round of the tournament. The simulations I performed are not predisposed toward a particular outcome except through the evolutionary processes I use, and these are made explicit so we can either them or reject them on more reasonable grounds. Second, I implement different schemes for mutation which seem plausible in a sense I will explain later. Using Axelrod's tournament as an example again, we can imagine mutation entering his dynamic ecological process. Such mutant strategies could easily effect the final population in significant ways. However, what a reasonable mutant looks like in Axelrod's framework is not clear. In this essay, I formalize how the strategies are modeled so we can then evaluate the results based on the underlying assumptions.

  2. Cooperation and the Infinite RPD

    The Prisoner's Dilemma is well known bimatrix game which has been used to study such economic problems as oligopolistic collusion, international trade, and public goods provision. The players in the game must simultaneously choose between " Cooperate" (C) and "Defect: (D). The payoffs I will use in this analysis are described in Figure I. Regardless of what one's opponent does, a player does better by defecting. This means (D,D) is the only Nash equilibrium in the one-shot game. The dilemma is that if the player could be induced to cooperate they would both do better than if they receive the equilibrium payoff. In fact, any finitely repeated version of the Prisoners' Dilemma has defection at every stage as the only perfect equilibrium outcome. It is not until we consider the infinitely repeated game that cooperation can be sustained. The "Folk Theorem" of repeated games tells us any individually rational payoff vector can be the outcome of a perfect equilibrium of an infinitely repeated game when the payoffs are calculated as "the limit of the mean."(1)

    John Maynard Smith and G. R. Price [15] are among those credited with the introduction of evolutionary game theory, but perhaps the most well known of the theoretical works on the subject is Maynard Smith's 1082 book Evolution and the Theory of Games. Axelrod's [3] analysis of the success of cooperation in the repeated Prisoners' Dilemma (RPD) is probably the most famous of many studies of the game in an evolutionary framework. Evolutionary game theory has frequently been used to study coordination games, common interest games, and the Prisoners' Dilemma. Since its introduction many people in a broad range of disciplines have applied it to the study of natural and social science problems. Here I provide a brief review of some of those studies, especially those analyses which deal with the Prisoners' Dilemma.

    Robert Axelrod, in his 1984 book The Evolution of Cooperation, suggested cooperation is evolutionary stable in the RPD. However, the definition of evolutionary stability he employed differs from that used by Maynard Smith [4]. Specifically, Axelrod applied a concept he called "collective stability" which requires only that a strategy be a Nash equilibrium when it plays itself. In choosing this definition, Axelrod disallowed the possibility an alternate best reply strategy, which earns an equal payoff against both the indigenous and invading strategies, would be able to successfully infiltrate the native population.

    This idea is embodied in the formal definition of an evolutionary stable strategy (ESS). A strategy [Sigma], which can be either a pure or mixed strategy, is an ESS of a symmetric game if and only if: u([Sigma], [Sigma]) [is greater than or equal to] u([Sigma]', [Sigma]) and [Mathematical Expression Omitted] for all [Sigma]' [is not equal to] [Sigma], where u([[Sigma].sub.1] [Sigma].sub.2]) represents the payoff to a player if he plays strategy [Sigma].sub.1] and his opponent chooses strategy [Sigma].sub.2]. In words, these conditions require an ESS be a best reply to itself, and if there are alternate best replies, the ESS must do strictly better against an alternate best reply than the alternate best reply does against itself. The first requirement above means any ESS is a Nash equilibrium. However, the converse is not true because of the second condition, often called the "stability condition," which Axelrod eliminated as part of his definition of evolutionary stability [3, 217]. Therefore, it is in this weaker sense that TFT is stable. There were a number of possible strategies which met this requirement but did not do well in Axelrod's

    The above definition of an ESS is actually a characterization of a more basic requirement which holds in pairwise random matching models.(2) Assuming von Neumann-Morgenstern utility functions, the following must hold for some sufficiently small proportion of invaders, [Epsilon] > O: [Mathematical Expression Omitted] This requires the expected payoff from playing [Sigma] in a population with proportion of [Epsilon] of [Sigma]' is strictly greater than the expected utility of those playing [Sigma]'. Loosely speaking, then, an ESS is a strategy which cannot be invaded by an arbitrary small proportion of potential invading strategies.

    Robert Boyd and Jeffrey Lorberbaum [7] have shown no pure strategy is evolutionary stable in this more restricted sense in the infinite RPD. Specifically, they show a population of TFT players can be invaded by the appropriate mixture of Suspicious TIT-FOR-TAT, which is the strategy which defects on the first move and then reciprocates the opponent's move, and TIT-FOR-TWO-TATS (TF2T), which reciprocates defection only after two consecutive defections by the opponent. Moreover, they show that if every strategy is possible through mutation, no pure strategy is immune to invasion by some mixture of strategies. However, they suggest every strategy will not be possible through mutation, so cooperation based on reciprocity may indeed flourish. Joseph Farrell and Roger Ware [8] extended the work of Boyd and Lorberbaum to show that for any evolutionary stable mixture of strategies in the infinite RPD, every finite history must occur with positive probability. The implication of this result is no finite mixture of strategies is evolutionary stable in the infinite RPD. They use this result as evidence the ESS concept is too stringent a criterion to be useful.

    More recently, Yong-Gwan Kim [12] showed that an ESS in the infinite...

To continue reading

Request your trial

VLEX uses login cookies to provide you with a better browsing experience. If you click on 'Accept' or continue browsing this site we consider that you accept our cookie policy. ACCEPT