# An Extension

In some ways the probabilistic repeated play model seems very plausible. After all, retaliation is a matter of common experience. However, it allows very little for changing circumstances outside the control of the agents in the game. Suppose, for example, that two firms play according to the tit-for-tat rule for a number of years, and then it becomes known that one of them is financially impaired and may go bankrupt. As a result, it seems far less probable that there will be further rounds of “play,” and the cooperative agreement breaks down. The repeated play model, with its constant discount factor, does not seem to allow for this sort of possibility. This section will sketch a modest extension of the model that will “realistically” allow for such changes of circumstances to affect the continuation of cooperation.

A key tool for this purpose is the state-transition matrix. We suppose, for example, that there are just three possible states of the world: state 1, in which both firms are financially sound and the “game” of price competition takes place; state 2, in which the “game” takes place but one firm is financially impaired; and state 3, in which there is no play, perhaps because one firm has gone bankrupt. There are two players. In states 1 and 2 they play Game 6.6. In state 3 they do not interact at all, and payoffs for both players are zero.

Given that the world is in state *i* in period t, the probabilities^{9} that the world will be in state *j* in period *t +* 1 are known constants summarized in the state transition matrix. Suppose the probabilities are as shown in Table 6.2.

*Table 6.2 Transition matrix 1*

| Transition from \ to | 1 | 2 | 3 |
|---|---|---|---|
| **1** | 0.8 | 0.2 | 0 |
| **2** | 0.6 | 0.2 | 0.2 |
| **3** | 0.1 | 0.1 | 0.8 |

The number in a given cell tells us the probability that the state represented by the row will be succeeded by the state represented by the column. Thus, for example, this transition matrix tells us that state 1 will be followed by state 1 80 percent of the time and by state 2 20 percent of the time, but will never be directly followed by state 3. Nevertheless, we might see the system in state 1 in the first period, in state 2 in the second period (with 20 percent probability), and in state 3 in the third period. The probability that the system would transit from state 1 to state 3 so quickly is the compound probability 0.2 * 0.2 = 0.04, a small probability. Given more time, however, the probability could be greater, since there are many more ways that the transition could occur.

Using compound probabilities, we can compute the probability that any one of the states will occur in any future period, starting out from state 1 (or indeed any other state). For example, the probability that we will observe state 1 steadily approaches a stable value of 0.64, while the probabilities of the other two states approach the constants 0.18 and 0.18. In fact, many such models have equilibria of this kind, and the equilibria can be found by a fairly simple exercise in linear algebra: solving a system of three equations with the three constant probabilities as the three unknowns. We shall skip the details. We can also compute the probability of yet another round of play, that is, the probability that either state 1 or state 2 will occur in period *n* if play took place in period *n* - 1. (This reflects both the probability that state 1 or 2 will occur in period *n* - 1 and the probabilities in the state transition matrix.) This approaches a constant value of 0.78.
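The iteration described above can be sketched in a few lines of code. This is a minimal illustration, not from the text: it repeatedly multiplies the state-probability distribution by the Table 6.2 matrix until it settles, then computes the long-run probability of yet another round of play as the joint probability of play in two successive periods.

```python
def step(dist, matrix):
    """One period: multiply the current state distribution by the transition matrix."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# Transition matrix from Table 6.2 (rows: current state; columns: next state).
P = [[0.8, 0.2, 0.0],   # from state 1
     [0.6, 0.2, 0.2],   # from state 2
     [0.1, 0.1, 0.8]]   # from state 3

dist = [1.0, 0.0, 0.0]  # start in state 1
for t in range(200):    # iterate until the probabilities stabilize
    dist = step(dist, P)

print([round(p, 2) for p in dist])   # -> [0.64, 0.18, 0.18]

# Long-run probability of "yet another round of play": the probability that
# play occurs in period n - 1 (state 1 or 2) times the probability that the
# next state is again 1 or 2.
another_round = dist[0] * (P[0][0] + P[0][1]) + dist[1] * (P[1][0] + P[1][1])
print(round(another_round, 2))       # -> 0.78
```

The same limits can be found exactly by the linear-algebra route mentioned above: the stationary probabilities are 7/11, 2/11, and 2/11.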

For a case like this, we might simply take the equilibrium probabilities and treat the model as if it had constant probabilities, at least as a first approximation. Let us do that, asking whether tit-for-tat play will deter defection in Game 6.6. We find that it will, provided the probability of yet another round of play is greater than 0.675. If we begin from state 1, the probability of another round of play is greater than 0.675 in *every single period,* so we can be confident that cooperation is feasible based on the tit-for-tat strategy rule.

*Table 6.3 Transition matrix 2*

| Transition from \ to | 1 | 2 | 3 |
|---|---|---|---|
| **1** | 0.8 | 0.2 | 0 |
| **2** | 0.4 | 0.2 | 0.4 |
| **3** | 0 | 0 | 1 |

For our example, an advantage of this approach is that state-transition models can represent irreversible events, such as bankruptcy and death. Consider the transition matrix in Table 6.3. Row 2 tells us that when a firm is financially impaired, it will return to financial health with probability 0.4, go bankrupt with probability 0.4, or continue impaired in the next period with probability 0.2. The third row reminds us that liquidation is irreversible: once you are dead you stay dead, and the probability of coming back from the dead is zero.

If we repeat the enumeration of the probabilities of the three states for future periods, beginning in state 1, we see that the probability that state 3 will be observed approaches 1, while the probabilities of the other states, and of another round of play, approach zero. That is, state 3 is what is called an “absorbing state”: sooner or later we are all dead. As a result, the probability of another round of play keeps dropping and approaches 0 in the limit. It may seem that we can apply backward induction to conclude that there will be no cooperation.
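The absorption can be checked by the same iteration used for Table 6.2. In this illustrative sketch, starting from state 1 and applying the Table 6.3 matrix repeatedly, essentially all of the probability mass ends up in state 3:

```python
def step(dist, matrix):
    """One period: multiply the current state distribution by the transition matrix."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# Transition matrix from Table 6.3; the last row makes state 3 absorbing.
P2 = [[0.8, 0.2, 0.0],
      [0.4, 0.2, 0.4],
      [0.0, 0.0, 1.0]]

dist = [1.0, 0.0, 0.0]  # start in state 1
for t in range(500):
    dist = step(dist, P2)

print([round(p, 3) for p in dist])   # -> [0.0, 0.0, 1.0]
```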

However, this is a mistake, or at least hasty. Assume that the agents can observe the state of the world. At the very least, agents will be able to tell whether anyone is bankrupt or not. We will assume that they can also observe whether they are in state 1 or state 2. Therefore, they can make their strategies contingent on the state. Thus, in place of tit-for-tat, suppose both parties play according to Rule 1:

*Rule 1. Cooperate IF the state is 1 AND (it is the first round of play OR the state in the previous period was other than state 1 OR the other agent played cooperate on the previous round) ELSE defect.*

Now suppose we are at state 1 and one player defects on the current round, planning on returning to “cooperate” thereafter. His expected payoff is 10 + 3*0.8 + 4*0.2 = 13.2. On the other hand, if he cooperates, the expected payoff is 7 + 7*0.8 + 4*0.2 = 13.4. Cooperation pays better, and defection is deterred. The term 4*0.2 is the payoff of the mutual defection that is sure to occur if state 2 is realized in the next period, times the probability that this will happen. (The example does not allow for time discounting; with time discounting the result might be different.) In this case the probabilities 0.8 and 0.2 are always applicable because they are conditional probabilities, conditional on the observation that state 1 has occurred.

Suppose instead that state 2 has been realized. Then the conditional probabilities of state 1 and state 2 in the following period are 0.4 and 0.2. Suppose the player defects once while the other player plays Rule 1. Then the defector’s expectation is 10 + 7*0.4 + 4*0.2 + 0*0.4 = 13.6. If he plays cooperate, it will be 7 + 7*0.4 + 4*0.2 + 0*0.4 = 10.6. (Since there is no play in state 3, we assign payoffs of zero.) Cooperation does not pay.
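The two deviation calculations can be laid out explicitly. The payoff numbers below are read off the text (10 to a lone defector, 7 for mutual cooperation, 4 for mutual defection, 3 to a lone cooperator, 0 in state 3); the variable names are illustrative.

```python
# From state 1: next-period probabilities are 0.8 (state 1) and 0.2 (state 2).
defect_state1    = 10 + 3 * 0.8 + 4 * 0.2   # defect now, face retaliation if state 1 recurs
cooperate_state1 = 7 + 7 * 0.8 + 4 * 0.2    # cooperate now, cooperation continues in state 1
print(round(defect_state1, 1), round(cooperate_state1, 1))   # -> 13.2 13.4: defection deterred

# From state 2: next-period probabilities are 0.4 (state 1), 0.2 (state 2), 0.4 (state 3).
defect_state2    = 10 + 7 * 0.4 + 4 * 0.2 + 0 * 0.4
cooperate_state2 = 7 + 7 * 0.4 + 4 * 0.2 + 0 * 0.4
print(round(defect_state2, 1), round(cooperate_state2, 1))   # -> 13.6 10.6: cooperation does not pay
```

Note that the state-1 comparison is the one that sustains Rule 1: the single-period gain from defecting (10 rather than 7) is outweighed by the retaliation payoff (3 rather than 7) expected with probability 0.8.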

Thus, whatever state occurs, there is no incentive to deviate from Rule 1: Rule 1 is subgame perfect. (Here, again, we are assuming the rate of time discount is sufficiently small.) But notice what it means. We start from state 1, with cooperation. Over the next few rounds, the probability (as seen from period 1) that we will remain in state 1 declines. We can foresee that within several rounds, with high probability, the system will transit to state 2, and at that point cooperation will break down. If the troubled firm manages to return to financial health (the system transits back to state 1), cooperation will be resumed. On the other hand, if one firm is liquidated, there will be no more opportunities for cooperation; and since this will occur sooner or later, cooperation will be repeated only a finite number of times.

There could be a range of other applications and contingent rules. For example, the agents might be playing different games in different states, with play in one game contingent on the other’s strategies either in the last play of the game now being played, or in the last play of the other game, or both.

# Interim Summary

We see that repeated play can be a link from Nash equilibrium to cooperative play. If agents are involved in interactions that are likely to be repeated, and the agents are patient enough and have some foresight, then cooperative play may emerge as one of the equilibria in a supergame, that is, a game repeated an indefinite number of times.