ANARCHY AND GAME THEORY

Doug Newdick.

1. Introduction.

In any discussion of anarchism, or of the conditions for a stateless society, sooner or later a claim like this surfaces: "people are too selfish for that to work". Such claims, I believe, are based upon an assumption (or theory) about human nature that is taken to be evidently true rather than argued for. Often I hear a version of "I'm sorry, but I just have a more pessimistic view of people than you." The purpose of this essay is to show that even if we grant the assumptions of selfish rationality, cooperation without the state is still a possibility.

2. The anti-anarchist/Hobbesian argument.

2.1. The intuitive argument.

With these sorts of objections to anarchism ("people are too selfish to cooperate without laws" etc) I think people are tacitly appealing to an argument of the form:

1. People are selfish (rational egoists).
2. Selfish people won't cooperate if they aren't forced to.
3. Anarchism involves the absence of force.
4. Therefore people won't cooperate in an anarchy.

The opponent of anarchism can then say either that, since anarchy also requires cooperation, it involves a contradiction, or that a society without cooperation would be awful, and therefore an anarchy would be awful.

2.2. Taylor's (1987) version.

If we call the two options (strategies) available to the individual Cooperation (C) and Defection (D) (non-cooperation), then we can see the similarities between the intuitive argument and Taylor's (1987) interpretation of Hobbes's (1968) argument for the necessity for, or justification of, the state:

"(a) in the absence of any coercion, it is in each individual's interest to choose strategy D; the outcome of the game is therefore mutual Defection; but every individual prefers the mutual Cooperation outcome; (b) the only way to ensure that the preferred outcome is obtained is to establish a government with sufficient power to ensure that it is in every man's interest to choose C." (Taylor 1987: 17)

We can see from this that the argument appears to be formalisable in terms of game theory, specifically in the form of a prisoners' dilemma game.

3. The prisoners' dilemma.

3.1 The prisoners' dilemma.[1]

To say an individual is rational, in this context, is to say that she maximises her payoffs. If an individual is egoistic (ie selfish) then her payoff is solely in terms of her own utility. Thus the rational egoist will choose those outcomes which have the highest utility for herself. In the traditional illustration of the prisoners' dilemma, two criminals have committed a heinous crime and have been captured by the police. The police know that the two individuals have committed this crime, but do not have enough evidence to convict them. However, the police do have enough evidence to convict them of a lesser offence. The police (and perhaps a clever prosecuting attorney) separate the two thugs and offer them each a deal. The criminals each have two options: to remain quiet or to squeal on their partner in crime. If one squeals and her companion remains quiet, she gets off; if both squeal, they each receive a medium sentence; if she remains quiet and her companion squeals, she receives the heaviest sentence; and if neither squeals, each receives a light sentence. The two are unable to communicate with each other, and must make their decisions in ignorance of the other's choice.
There are four possible outcomes for each player in this game: getting off scot free, which we will say has a utility of 4; getting a light sentence, which has a utility of 3; getting a medium sentence, which has a utility of 2; and getting a heavy sentence, which has a utility of 1. If we label the strategy of staying quiet "C" (for Cooperation), and the strategy of squealing "D" (for Defection), then we get the following payoff matrix:

                          Player 2
                          C         D
    Player 1     C        3, 3      1, 4
                 D        4, 1      2, 2

(where each pair of payoffs is ordered: Player 1, Player 2)

It is obvious from this that no matter which strategy the other player chooses, each player is better off Defecting; therefore the rational choice is to Defect (in game-theory-speak, Defection is the dominant strategy). As this is the case for both players, the outcome of the game will be mutual Defection. However there is an outcome, mutual Cooperation, which both players prefer, but because they are rational egoists they cannot obtain it. This is the prisoners' dilemma. More generally, a prisoners' dilemma is a game with a payoff matrix of the form:

          C         D
    C     x, x      z, y
    D     y, z      w, w

where y > x > w > z. (The convention is that the rows are chosen by player 1, the columns by player 2, and the payoffs are ordered "player 1, player 2".) (Taylor 1987: 14)

Any situation where the players' preferences can be modelled by this matrix is a prisoners' dilemma.

3.2 Ramifications of the prisoners' dilemma.

Many people have proposed that the prisoners' dilemma is a good analysis of the provision of public goods and/or of collective action problems in general; that is, they have taken the preferences of individuals in cooperative enterprises to be modelled by a prisoners' dilemma. Firstly, the prisoners' dilemma gives an interesting look at so-called "free rider" problems in the provision of public goods. In public goods interactions, free rider problems emerge when a good is produced by a collectivity, and members of the collectivity cannot be prevented from consuming that good (in Taylor's terminology the good is non-excludable).[2] In this case a rational individual would prefer to reap the benefits of the good and not contribute to its provision (ie to Defect): if others Cooperate then the individual should Defect, and if everyone else Defects then the individual should also Defect.[3] Secondly, the prisoners' dilemma is taken to be a good model of the preferences of individuals in their daily interactions with other individuals, such as fulfilling (or not fulfilling) contractual obligations, repaying debts, and other reciprocal interactions.

3.3 My version of the anti-anarchist argument.

Given a game-theoretic interpretation of the claim in 1, and consequently a game-theoretic interpretation of the intuitive and Hobbesian arguments for the necessity of the state, we can reformulate them as the following argument:

1. People are egoistic rational agents.
2. If people are egoistic rational agents then the provision of public goods is a Prisoners' Dilemma (PD).
3. If the provision of public goods is a PD then, in the absence of coercion, public goods won't be provided.
4. Such coercion can only be provided by the state, not by an anarchy.
5. Therefore public goods won't be provided in an anarchy.
6. Therefore the state is necessary for the provision of public goods.
7. The provision of public goods is necessary for a "good" society.
8. Therefore an anarchy won't be a "good" society.
9. Therefore the state is necessary for a "good" society.
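Before turning to my criticisms of this argument, the one-shot analysis of 3.1 can be made concrete. The following sketch is only an illustration of mine (using the payoffs 4 > 3 > 2 > 1 given above, with hypothetical names): it checks that Defection is the dominant strategy, and that the resulting mutual Defection is nevertheless worse for both players than mutual Cooperation.

```python
# One-shot prisoners' dilemma from section 3.1.
# PD[(my_move, other_move)] = my utility, with the payoffs 4 > 3 > 2 > 1 used above.
PD = {
    ("C", "C"): 3,  # mutual Cooperation: light sentence
    ("C", "D"): 1,  # I stay quiet, she squeals: heaviest sentence
    ("D", "C"): 4,  # I squeal, she stays quiet: off scot free
    ("D", "D"): 2,  # mutual Defection: medium sentence
}

def dominant_strategy(payoff):
    """Return a move that is at least as good whatever the other player does, if one exists."""
    for mine in ("C", "D"):
        other_option = "D" if mine == "C" else "C"
        if all(payoff[(mine, theirs)] >= payoff[(other_option, theirs)]
               for theirs in ("C", "D")):
            return mine
    return None

print(dominant_strategy(PD))             # "D": Defection dominates for each player
print(PD[("D", "D")] < PD[("C", "C")])   # True: yet both prefer mutual Cooperation
```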
4. Overview of my criticisms/position.

I think the game-theoretic model is the best (and most plausible) way of interpreting these sorts of arguments. However, I think that its premises 1 to 4 are false. Against premise 2, following Taylor (1987: ch 2), I argue that the prisoners' dilemma is not the only plausible preference ordering for collective action, and that in some of these different games Cooperation is more likely than in the prisoners' dilemma. The static model of the prisoners' dilemma game is unrealistic in that most social interactions recur, so a more realistic model is that of an iterated prisoners' dilemma, in which cooperation (under certain circumstances) is in fact the optimal strategy (following Taylor 1987 and Axelrod 1984); thus I argue that premise 3 is false. Finally I argue that premise 1 is false: indeed we do, and should, expect people to be (somewhat limited) altruists.[4]

5. Provision of public goods isn't always a prisoners' dilemma.

For a game to be a prisoners' dilemma, it must fulfil certain conditions:

"each player must (a) prefer non-Cooperation if the other player does not Cooperate, (b) prefer non-Cooperation if the other player does Cooperate. In other words: (a') neither individual finds it profitable to provide any of the public good by himself; and (b') the value to a player of the amount of the public good provided by the other player alone (ie, the value of being a free rider) exceeds the value to him of the total amount of the public good provided by joint Cooperation less his costs of Cooperation." (Taylor 1987: 35)

For many public good situations either (a'), (b') or both fail to obtain.

5.1 Chicken games.

If condition (a') fails we can get what Taylor calls a Chicken game; ie, if we get a situation where it pays a player to provide the public good even if the other player Defects, but both players would prefer to let the other player provide the good, we get this payoff matrix:

          C         D
    C     3, 3      2, 4
    D     4, 2      1, 1

Taylor (1987: 36) gives an example of two neighbouring farms maintaining an irrigation system, where the result of mutual Defection is so disastrous that either individual would prefer to maintain the system herself. Thus this game will model certain kinds of reciprocal arrangements that are not appropriately modelled by a prisoners' dilemma game.

5.2 Assurance games.

If condition (b') fails to obtain we can get what Taylor (1987: 38) calls an Assurance game, that is, a situation where neither player can provide a sufficient amount of the good by contributing alone; thus for each player, if the other Defects then she should also Defect, but if the other Cooperates then she would prefer to Cooperate as well. The payoff matrix looks like this:

          C         D
    C     4, 4      1, 2
    D     2, 1      3, 3

5.3 Cooperation in a Chicken or Assurance game.

There should be no problem with mutual Cooperation in an Assurance game (Taylor 1987: 39), because the preferred outcome for both players is that of mutual Cooperation. In the one-off Chicken game mutual Cooperation is not assured; however, it is more likely than in a one-off prisoners' dilemma.[5]
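What distinguishes these three games is simply how a player ranks the four possible outcomes, ie whether Taylor's conditions (a') and (b') hold. As a rough illustration of mine (not Taylor's own presentation), the following sketch classifies a symmetric two-player game from one player's payoffs:

```python
# Classify a symmetric 2x2 game from one player's payoffs for the outcomes
# (C,C), (C,D), (D,C) and (D,D), following the conditions quoted from Taylor above.

def classify(cc, cd, dc, dd):
    defect_vs_c = dc > cc   # better to Defect when the other Cooperates: condition (b') holds
    defect_vs_d = dd > cd   # better to Defect when the other Defects: condition (a') holds
    if defect_vs_c and defect_vs_d and cc > dd:
        return "prisoners' dilemma"
    if defect_vs_c and not defect_vs_d:
        return "Chicken"     # (a') fails: worth providing the good even alone
    if not defect_vs_c and defect_vs_d:
        return "Assurance"   # (b') fails: free riding is not worth it
    return "other"

print(classify(3, 1, 4, 2))  # prisoners' dilemma (section 3.1)
print(classify(3, 2, 4, 1))  # Chicken (section 5.1)
print(classify(4, 1, 2, 3))  # Assurance (section 5.2)
```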
6. Cooperation is rational in an iterated prisoners' dilemma.

6.1 Why iteration.

Unequivocally, there is no chance of mutual Cooperation in a one-off prisoners' dilemma; but, as has been pointed out, the one-off game is not a very realistic model of social interactions, especially public good interactions (Taylor 1987: 60). Most social interactions involve repeated interactions, sometimes as a group (an N-person game), sometimes between specific individuals (which might be modelled as a game between two players). The question then becomes: is mutual Cooperation more likely in iterated games, specifically the iterated prisoners' dilemma? As one would expect, the fact that the games are repeated (with the same players) opens up the possibility of conditional Cooperation, ie Cooperation dependent upon the past performance of the other player.

6.2 Iterated prisoners' dilemma.

There are two important assumptions to be made about iterated games. Firstly, it is assumed (very plausibly) that the value of future games to a player is less than the value of the current game. The amount by which the value of future games is discounted is called the discount value; the higher the discount value, the less future games are worth (Taylor 1987: 61). Secondly, it is assumed that the number of games to be played is indefinite. If the number of games is known to the players then the rational strategy will be to Defect on the last game, because the player cannot be punished for this by the other; once this is assumed by both players, the second to last game becomes in effect the last game, and so on (Taylor 1987: 62).

Axelrod (1984) used an ingenious method to test what would be the best strategy for an iterated prisoners' dilemma: he held two round-robin computer tournaments in which each strategy (computer program) competed against each of its rivals a number of times. Surprisingly, the simplest program, one called TIT FOR TAT, won both tournaments, as well as all but one of a number of hypothetical tournaments. Axelrod's results confirmed what Taylor had proven in 1976.[6] TIT FOR TAT is the strategy of choosing C in the first game and thereafter choosing whatever the other player chose in the last game (hereafter TIT FOR TAT will be designated strategy B, following Taylor (1987)).

An equilibrium in an iterated game is defined as "a strategy vector such that no player can obtain a larger payoff using a different strategy while the other players' strategies remain the same. An equilibrium, then, is such that, if each player expects it to be the outcome, he has no incentive to use a different strategy" (Taylor 1987: 63). Put informally, an equilibrium is a combination of strategies such that no player can improve her payoff by unilaterally switching to a different strategy. Mutual Cooperation will arise if (B, B) is an equilibrium, because then no strategy does better than B when playing against B.[7]

The payoff for a strategy that receives x per game in an (indefinitely) iterated prisoners' dilemma is equal to the sum of an infinite series:

    x + xw + xw^2 + ... = x/(1 - w)

where x is the payoff per game and w is the discount parameter (1 - the discount value). UD (unconditional Defection) playing with UD gets a payoff of 2 per game for mutual Defection, so if we set w = 0.9 then UD's payoff is 2/(1 - 0.9) = 20. B playing with B gets a payoff of 3 per game for mutual Cooperation, so with w = 0.9 B gets 3/(1 - 0.9) = 30.

(B, B) is an equilibrium when the payoff to B from (B, B) is at least as high as the payoff to UD from (UD, B):

    B's payoff against B is 3/(1 - w)
    UD's payoff against B is 4 + 2w/(1 - w)

Therefore UD cannot do better than B when 3/(1 - w) > 4 + 2w/(1 - w), ie when w > (4 - 3)/(4 - 2) = 0.5 (Axelrod 1984: 208)[8][9]
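The algebra can be checked numerically. This is a small sketch of mine, using the payoffs 4 > 3 > 2 > 1 from section 3.1: it computes the supergame payoffs of B against B and of UD against B for several values of the discount parameter, confirming that switching from B to UD only pays when w is below 0.5.

```python
# Numerical check of the condition under which (B, B) is an equilibrium,
# using the payoffs 4 (temptation), 3 (mutual C), 2 (mutual D), 1 (sucker).

def payoff_b_vs_b(w):
    """B (TIT FOR TAT) against B: mutual Cooperation every game, worth 3 per game."""
    return 3 / (1 - w)

def payoff_ud_vs_b(w):
    """Unconditional Defection against B: 4 in the first game (B Cooperates),
    then 2 per game forever, since B retaliates in every later game."""
    return 4 + 2 * w / (1 - w)

for w in (0.3, 0.5, 0.7, 0.9):
    b, ud = payoff_b_vs_b(w), payoff_ud_vs_b(w)
    verdict = "does not pay" if b >= ud else "pays"
    print(f"w = {w}: B vs B = {b:.2f}, UD vs B = {ud:.2f}, switching to UD {verdict}")

# With these payoffs the threshold is w = (4 - 3) / (4 - 2) = 0.5: above it, the
# one-off gain from Defecting no longer outweighs the lost stream of Cooperation.
```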
Can any other strategy fare better against B than B itself? Informally we can see that this is not possible (assuming future interactions are not too heavily discounted). For any strategy to do better than B, it must at some point Defect. But if the strategy Defects, then B will punish this Defection with a Defection of its own, which must result in the new strategy doing worse than it would have done had it Cooperated. Thus no strategy can do better playing with B than B itself does. Now if (B, B) is an equilibrium, then the payoff matrix for the iterated game is:

          B         UD
    B     4, 4      1, 3
    UD    3, 1      2, 2

which is an Assurance game. Thus if (B, B) is an equilibrium then we should expect mutual Cooperation (Taylor 1987: 67). If, however, (B, B) isn't an equilibrium (ie the discount value is too high) then the payoffs resemble a prisoners' dilemma, and mutual Defection will be the result (Taylor 1987: 67).

6.3 Iterated N-person prisoners' dilemma.

A more realistic model of (some) social interactions, especially public goods interactions, is that of an iterated N-person prisoners' dilemma, that is, an iterated prisoners' dilemma with more than two players (an indefinite number for the purposes of analysis). The analysis is too complex to reproduce here[10] but the results of the analysis of the 2-person iterated prisoners' dilemma can be applied more or less straightforwardly to the N-person case. If Cooperation is to arise, at least some of the players must be conditional Cooperators (ie utilising something like B), and "it has been shown that under certain conditions the Cooperation of some or all of the players could emerge in the supergame no matter how many players there are." (Taylor 1987: 104)

6.4 Conditions for conditional cooperation.

For mutual Cooperation to arise, a strategy similar to B needs to be used by individuals, and (B, B) needs to be an equilibrium. For the latter to be the case, the discount parameter needs to be sufficiently high. For the former, individuals need to be able to tell whether other individuals are Cooperating or Defecting. The discount parameter depends upon the chance of a player having further interactions with the same player, and upon the frequency of those interactions: the greater the probable time between interactions, and the smaller the number of probable interactions, the lower the discount parameter and the lower the chance of getting mutual Cooperation. There are a number of ways in which the discount parameter can be increased (Axelrod 1984: 129-132): increasing territoriality (reducing population mobility); increasing specialisation; concentrating interactions, so that an individual has more interactions with a smaller number of individuals; and decomposing interactions into more, smaller interactions. If people are to employ a strategy such as B, they need to be able to monitor the behaviour of other players. Thus it seems that mutual Cooperation will be more likely in smaller societies than in larger ones. If the relations between individuals are direct and many-sided (ie, they interact with others without any mediation, and they interact with them in a number of different ways) then monitoring behaviour is much easier, and this would translate into a less stringent size requirement. Such properties are to be found in societies that have the property of "community" (Taylor 1987: 105, 1982).
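To tie sections 6.2 to 6.4 together, here is a toy round-robin tournament in the spirit of Axelrod's (1984) experiment. It is only a sketch of mine, not a reconstruction of his tournament: a handful of simple strategies, a fixed number of rounds standing in for the indefinite supergame, and the payoffs used throughout this essay.

```python
# A toy round-robin iterated prisoners' dilemma in the spirit of Axelrod (1984).
# A strategy is a function from the opponent's past moves to "C" or "D".

PAYOFF = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 4, ("D", "D"): 2}

def tit_for_tat(opp):      # strategy B
    return "C" if not opp else opp[-1]

def always_defect(opp):    # UD
    return "D"

def always_cooperate(opp):
    return "C"

def grudger(opp):          # Cooperates until the other Defects once, then Defects forever
    return "D" if "D" in opp else "C"

def play(s1, s2, rounds=200):
    h1, h2, t1, t2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)
        t1 += PAYOFF[(m1, m2)]
        t2 += PAYOFF[(m2, m1)]
        h1.append(m1)
        h2.append(m2)
    return t1, t2

entrants = {"TIT FOR TAT": tit_for_tat, "ALWAYS DEFECT": always_defect,
            "ALWAYS COOPERATE": always_cooperate, "GRUDGER": grudger}

totals = {name: 0 for name in entrants}
names = list(entrants)
for i, a in enumerate(names):
    for b in names[i:]:                      # each pairing once, including each twin
        ta, tb = play(entrants[a], entrants[b])
        totals[a] += ta
        if b != a:
            totals[b] += tb

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, total)
# Which strategy tops the table depends on who enters: with several reciprocating
# strategies in the pool, TIT FOR TAT's totals are very hard to beat.
```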
6.5 The evolution of TIT FOR TAT.

As TIT FOR TAT is the best strategy under certain conditions, we would expect that organisms that evolved under these conditions might well use this strategy as an adaptation.[footnote: With all of the usual riders, such as: the variation might not have arisen; constraints of other structures might prevent this; etc.] This expectation is supported by a number of apparent examples of TIT FOR TAT behaviour amongst certain organisms that do live under iterated prisoners' dilemma conditions (Dawkins 1989: 229-233). If much human social interaction does take the form of a prisoners' dilemma (and we have seen that, if this is the case, these interactions will mostly be iterated), and if we assume that much of the evolutionary history of humans and their ancestors was spent in small groups (as the evidence suggests), then we might expect humans to have evolved such a behavioural strategy. One must be wary of drawing too strong a conclusion about humans and human behaviour from evolutionary arguments: human behaviour is notoriously complex and very plastic, unlike much animal behaviour. However, I do think that this argument gives an additional reason for being optimistic about the possibility of mutual Cooperation.

7. Altruism.

7.1 Altruism is not a rare phenomenon.

The purpose of the preceding section was to show that even if we grant the anti-anarchist her most pessimistic assumptions about humans (that they are rational egoists) and about social interactions (that they have the preference structure of a prisoners' dilemma), mutual Cooperation can still be achieved. I have already criticised the latter assumption in S5, but the former assumption, too, is obviously flawed[footnote: This assumption is acceptable as an idealisation when we have a specific explanatory or predictive purpose in mind (presuming it does not give us bad results), but in this justificatory role its inadequacies are central to the question at hand.]. People are not egoistic. If we think for more than a few moments we should be able to come up with a number of examples of pure altruism, examples where no benefit whatsoever accrues to the performer of the action, not to mention examples of impure altruism. Donating blood is a good example of pure altruism: no (measurable) benefit accrues to someone who donates blood (without publicising it), yet the benefit to others could be great, and there is a cost (even if it is not substantial). Then there are examples such as child-rearing. The cost of rearing a child is substantial, in terms of both money and other resources (eg time, missed opportunities), yet the benefit mainly accrues to the child, not the parent.

7.2 Kin Selection.

An explanation for certain kinds of apparent altruism, and possibly for a greater than expected degree of reciprocal Cooperation, can be found in the theory of kin selection. Taking the gene's-eye view proposed by Dawkins (1989)[11], imagine a gene for green beards. If this gene, besides causing green beards, causes its carrier to help other individuals with green beards, it has a greater than usual chance of spreading through a population. In a normal population an organism is more likely to share genes with its relations than with another member of the population: for any gene that is in your body, there is a 50% chance that it is in the body of your sibling, and a 12.5% chance, for each of your first cousins, that it is in theirs. Thus, from the gene's perspective, if you sacrifice yourself to save the lives of three of your siblings, then the gene has in fact gained (because, on average, more copies of it were preserved than perished). This is the mechanism of kin selection. The more closely you are related to someone, the more it benefits the unit of selection (the entity which benefits from natural selection), in this case the gene, if you aid them, with the amount of aid directly proportional to the index of relatedness (Dawkins 1989: ch 6). In game-theoretic terms, in any game the payoff to the gene is equal to the utility to the individual it is in, plus the utility to the other individual multiplied by their index of relatedness. The payoff in games between kin for player 1 is:

    z + xy

where z = player 1's utility, y = player 2's utility, and x = the index of relatedness, ie the chance that a gene in X is present in X's relation Y. For example, if the two players are siblings the value of x is 0.5, and the transformed prisoners' dilemma will look like:

          C           D
    C     4.5, 4.5    3, 4.5
    D     4.5, 3      3, 3
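To make the transformation concrete, here is a small sketch of mine (not Dawkins's) that applies the formula z + xy to the prisoners' dilemma of section 3.1 for two siblings:

```python
# Kin-selection payoff transformation: from the gene's point of view, my payoff
# is my own utility plus the other player's utility weighted by our relatedness.

def kin_payoff(mine, theirs, relatedness):
    return mine + relatedness * theirs

def transform(game, relatedness):
    """Transform a table {(p1_move, p2_move): (p1_utility, p2_utility)}."""
    return {moves: (kin_payoff(u1, u2, relatedness), kin_payoff(u2, u1, relatedness))
            for moves, (u1, u2) in game.items()}

# The prisoners' dilemma of section 3.1, payoffs ordered (player 1, player 2).
PD = {("C", "C"): (3, 3), ("C", "D"): (1, 4), ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

for moves, payoffs in transform(PD, 0.5).items():   # siblings: relatedness 0.5
    print(moves, payoffs)
# (C,C) -> (4.5, 4.5), (C,D) -> (3.0, 4.5), (D,C) -> (4.5, 3.0), (D,D) -> (3.0, 3.0).
# Defection no longer strictly dominates, and mutual Cooperation is now an equilibrium.
```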
In this case we should expect mutual Cooperation to be the outcome, because it is an equilibrium and is preferred by both players. As we know from S6, the value of the discount parameter required for (B, B) to be an equilibrium decreases as the difference between the payoff for Defecting whilst the other player Cooperates and the payoff for mutual Cooperation decreases. Thus mutual Cooperation is easier to achieve when the mechanism of kin selection is operating. It is also possible that such a mechanism might overgeneralise, that is, identify too many people as being related enough to alter behaviour in prisoners' dilemma-type situations. When you consider that for much of our recent evolutionary history humans have lived in small bands where the average index of relatedness was fairly high (especially compared to today), such overgeneralisation would not have generated many false positives.[12]

The mutual Cooperation engendered by kin selection can also help the spread of reciprocal Cooperation: it can create a large enough cluster of conditional Cooperators to make conditional Cooperation the best strategy in the population. If a cluster of conditional Cooperators invades a population of Unconditional Defectors, then once the number of conditional Cooperators reaches a certain level (dependent upon the discount parameter), the conditional Cooperators earn more than the Unconditional Defectors in virtue of their interactions with each other (Axelrod 1984: ch 3).

8. Summary.

I have shown that premises 2 and 3 of the intuitive/Hobbesian argument are false. Therefore the conclusions that anarchies are non-viable, and that the state is in a sense necessary, do not follow. The analysis of the iterated prisoners' dilemma shows that even if we grant the opponent of anarchy her best case, her conclusion just does not follow. Game theory shows us that, given certain conditions, even egoistic individuals will cooperate without coercion or coordination, and those conditions are practically possible. When added to Taylor's (1982) thesis that coercion can be utilised by an anarchic community to encourage Cooperation, the plausibility of an anarchy increases. I think that the analysis from game theory and kin selection should leave us optimistic about the possibility of Cooperation without coercion, even under adverse circumstances, and thus the changes in human nature required for a viable anarchy are much smaller than the opponents of anarchy believe.

Bibliography.

Axelrod, 1984, The Evolution of Cooperation, Basic Books.
Dawkins, 1989, The Selfish Gene, Oxford University Press, Oxford.
Hardin, 1982, Collective Action, Johns Hopkins University Press, Baltimore.
Hobbes, 1968, Leviathan, ed. C.B. MacPherson, Pelican Classics.
Lewontin et al, 1984, Not In Our Genes, Pantheon, New York.
Lukes, 1974, Power: A Radical View, Macmillan Press.
Mansbridge (ed), 1990, Beyond Self-Interest, University of Chicago Press, Chicago.
Palfrey & Rosenthal, 1992, "Repeated Play, Cooperation and Coordination: An Experimental Study", Social Science Working Paper 785, California Institute of Technology, Pasadena.
Taylor, 1982, Community, Anarchy & Liberty, Cambridge University Press, Cambridge.
----, 1987, The Possibility of Cooperation, Cambridge University Press, Cambridge.
---- (ed), 1988, Rationality and Revolution, Cambridge University Press, Cambridge.
Wright et al, 1992, Reconstructing Marxism, Verso, London.

Footnotes

1: Much of this section is drawn from Taylor 1987 and Axelrod 1984.
2: Taylor (1987: 6) says that free rider problems arise only when the collective good is non-excludable but not indivisible (that is, when consumption of the good by an individual results in less of the good being available to others). I don't believe that this is the case: we are surely able to free ride on the public good of parklands etc by not paying our taxes.
3: This is really an example of an N-person prisoners' dilemma, rather than a normal prisoners' dilemma. See Taylor 1987: ch 4.
4: Taylor 1982 can be taken as an argument against premise 4; I concur, but will not go into that argument here.
5: For a full presentation of the mathematical argument for this conclusion see Taylor 1987: 39-59.
6: In his book "Anarchy and Cooperation". Taylor 1987 is a substantial revision of this book. Taylor (1987: 70) points out that he had already proven what Axelrod proved with his tournaments; however, Axelrod's method was more interesting.
7: Note that unconditional Defection (UD) is also an equilibrium: any strategy that Cooperates at any point with UD will score less than UD does in that game.
8: B also has to do better than a strategy that alternates Cooperation with Defection, which is also the case when w > 0.5.
9: Strictly speaking, whether (B, B) is an equilibrium is a function of the relation between w and the values of the payoffs: (B, B) is an equilibrium when w > (y - x)/(y - w) and w > (y - x)/(x - z), where w on the left is the discount parameter and x, y, z, w on the right are the payoffs of the general matrix in 3.1. For the payoffs I am using, this is the case if w > 0.5.
10: See Taylor 1987: ch 4 for a detailed analysis of the iterated N-person prisoners' dilemma.
11: This is bad philosophy of biology, but it gets the point across easily.
12: Yet again, this argument should not be taken too seriously; it merely adds additional reasons to be optimistic that humans are more inclined towards mutual Cooperation than the purely egoistic model predicts.