The evolution of fuzzy rules as strategies in two-player games.

Author: West, James E.
  1. Introduction

    How to model players' strategies has been an important issue in the study of repeated games. The problem has been reasonably well addressed for the case in which strategy choices in the stage game are finite because the history at any point in time consists only of a series of choices from a finite set of stage game strategies. For example, Miller (1996) and Linster (1992, 1994) used finite automata to encode strategies in the repeated prisoner's dilemma. These finite automata, or Moore machines, can capture the idea of bounded rationality, but they can only be used when strategy choices are finite in number.

    The purpose of this paper is to illustrate the use of fuzzy strategies in two familiar two-player games with continuous strategy spaces. We chose games for which analytic solutions are available so as to enable comparisons between equilibrium outcomes of our fuzzy models and Cournot-Nash, competitive, and collusive outcomes. These simulations based on fuzzy strategies can be used in models for which analytic solutions do not exist or are computationally very difficult to obtain.

    A rule described by a small number of parameters can express only limited rationality within the context of a complex model, but these rules might be "good enough" to describe human behavior in a variety of situations. Consider as an example the task of running an automobile's heating and cooling system prior to the advent of climate control. A heater temperature/air conditioner lever position is continuously and minutely variable. No doubt a complicated optimal nonlinear response function could be specified in which temperature readings are taken and minute adjustments are made to the lever position. Instead, we hypothesize that as few as three simple rules might be enough to adequately control the automobile's temperature. If the car is hot, turn on the air conditioning to maximum cooling. If the car is cold, turn on the heat to maximum heating. If the car temperature is "just right," do nothing. Interpolation over these three rules can specify a proper response to any automobile temperature. We propose that much as these three simple rules should eventually arrive at just the right temperature, strategies involving fuzzy rules can be used to find the equilibrium outcome in a variety of two-player games. The equilibrium will not be found with the precision of analytically derived optimal-response functions, but we believe our results illustrate the possibility for boundedly rational agents to achieve a near-equilibrium outcome. Although we consider exclusively two-player games, the techniques we illustrate can easily be extended to games with more players.
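    The temperature example above can be sketched in code. The following is a minimal illustration, not part of the original paper: each of the three rules is given a triangular membership function over cabin temperature, and the response is a membership-weighted average of the rules' recommended lever positions. All function names, temperature breakpoints, and the [-1, 1] lever scale are illustrative assumptions.

```python
def triangular(x, left, peak, right):
    """Degree of membership in a triangular fuzzy set (0 outside [left, right])."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def lever_position(temp_f):
    """Map cabin temperature (deg F) to a lever position in [-1, 1],
    where -1 is maximum heat and +1 is maximum cooling.
    Breakpoints below are assumed values for illustration."""
    # Rule 1: if the car is cold, turn on maximum heat (-1).
    cold = triangular(temp_f, 0.0, 30.0, 70.0)
    # Rule 2: if the temperature is "just right," do nothing (0).
    just_right = triangular(temp_f, 60.0, 70.0, 80.0)
    # Rule 3: if the car is hot, turn on maximum cooling (+1).
    hot = triangular(temp_f, 70.0, 110.0, 140.0)
    weights = [cold, just_right, hot]
    actions = [-1.0, 0.0, 1.0]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no rule fires; leave the lever alone
    # Interpolation: weighted average of the rules' recommended actions.
    return sum(w * a for w, a in zip(weights, actions)) / total
```

    At 70 degrees only the "just right" rule fires and the lever stays at 0; at intermediate temperatures the overlapping memberships blend the extreme actions into a graded response.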

    This research has both theoretical and empirical components. From a theoretical perspective, this model captures some elements of boundedly rational behavior and adaptive learning at a very elementary level. The strategies are able to learn and adapt as the generations pass. This model has the advantages of enabling us to tractably model continuous strategies in discrete ways and allowing us to evaluate how these strategies evolve over time relative to predetermined benchmarks like Cournot-Nash equilibrium behavior, collusive behavior, and perfectly competitive behavior. At an empirical level, this paper introduces new methods for performing economic simulations that can be modified for other situations.

    Consider the following thought experiment. Suppose two players are about to engage in a repeated version of a game in which the strategic choice comes from a continuous set. The players are not fully rational. Although they play the game repeatedly with the same opponent, they always predict their opponent's future behavior using a simple rule. The predictions may be incorrect, but these naive players never alter the way they make their predictions. Each player is required before the game to submit an initial choice, as well as rules for making a choice in the next period as a function of past play. Initially, the players have minimal information about payoffs, so the rules are randomly chosen. After each generation, the players' payoffs and the rules they used become common knowledge. The players may submit new rules, and the game is played again. What sort of rules should prosper in an environment like this?

    The current state of the art in the study of repeated games offers little hope of revealing what outcomes will emerge. The strategy choices available for an individual player in this environment are functions like q_i: [0, N] → [0, N]. That is, player i uses a rule q_i to assign an output level in response to any possible level of output by his or her opponent. In order to model these functions, we employ fuzzy rules as strategies. The fuzzy rules are coded so that they can evolve according to a genetic algorithm.

    We assume that a player's strategy can be represented by a set of rules like, "If I think my opponent will choose action x, I will choose action y." We operationalize the idea of selecting strategies based on how well some players are doing with a genetic algorithm. Here, strategic choices are explored in the context of a simple duopoly game and a politically contestable rent game using fuzzy rules to define the strategy and a genetic algorithm to capture the notion of evolving strategies.
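    One way to make the preceding two paragraphs concrete is a sketch along the following lines. This is not the paper's actual encoding: the rule representation (a list of (x, y) pairs meaning "if my opponent plays near x, I play y"), the distance-based interpolation, and all parameter values are assumptions for illustration. The genetic algorithm uses standard fitness-proportional selection, single-point crossover, and point mutation.

```python
import random

N = 10.0  # assumed upper bound of the continuous action space [0, N]

def respond(rules, opponent_action):
    """Interpolate over fuzzy rules (x, y): rules whose antecedent x is
    closer to the opponent's action get proportionally more weight."""
    weights = [1.0 / (1e-9 + abs(opponent_action - x)) for x, _ in rules]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, rules)) / total

def random_strategy(n_rules=3):
    """Initial generation: rules are randomly chosen, as in the thought
    experiment above."""
    return [(random.uniform(0, N), random.uniform(0, N)) for _ in range(n_rules)]

def next_generation(population, fitnesses, mutation_rate=0.1):
    """One generational step: fitness-proportional selection (fitnesses
    must be positive), single-point crossover, and point mutation."""
    new_pop = []
    for _ in range(len(population)):
        mom, dad = random.choices(population, weights=fitnesses, k=2)
        cut = random.randrange(1, len(mom))
        child = mom[:cut] + dad[cut:]
        if random.random() < mutation_rate:
            i = random.randrange(len(child))
            child[i] = (random.uniform(0, N), random.uniform(0, N))
        new_pop.append(child)
    return new_pop
```

    Under this encoding, strategies that happen to respond profitably to prevailing opponent play earn higher fitness and so contribute more offspring to the next generation, which is the sense in which the rules "evolve."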

  2. Literature Review

    In considering how to model a game with an infinite strategy space, our first task was to determine how strategies should use information from opponents' past play. Axelrod (1984) published results of a computer tournament in which game theorists were invited to submit strategies to play in an iterated prisoner's dilemma game with the following properties:

  1. The interactions were between pairs of players.

  2. Each player had two available choices on each move: cooperate or defect. Choices were made simultaneously.

  3. The payoffs (Axelrod and Dion 1988) were fixed before play and announced to all players.

  4. At each move in the game, each player had access to the history of the game up to that move--in short, there was no noise in the transmission of strategy choices between players.

    The winning strategy for this tournament was the tit-for-tat strategy--to cooperate in the first round and mimic one's opponent's most recent behavior in subsequent rounds. Although an iterated prisoner's dilemma game has a finite set of choices (cooperate or defect) as opposed to the infinite set in the games we consider, we sought to emulate the properties of a tit-for-tat strategy in what we call a Cournot adaptive scheme. In a Cournot adaptive scheme, a strategy responds only to the immediate past action of its opponent. To experiment with the effect of additional information on the rate of convergence, we ran an additional set of simulations in which the strategy responds to the unweighted average or mean value of all past actions of its opponent. We call this the fictitious play adaptive scheme.

    In their survey of the evolution of cooperation, Axelrod and Dion (1988, p. 1388) address limitations of previous tournaments in the generation of new strategies:

    One method of overcoming this limitation is to use a genetic approach to develop new strategies for playing the IPD (iterated prisoner's dilemma). A good method of implementing this on a computer is Holland's (1975) genetic algorithm. A strategy can...
