Core Wars Genetics: The Evolution of Predation John Perry UCLA Computer Science Department jperry@cs.ucla.edu (adapted from a paper written in January, 1991) Outline * Abstract * Introduction * Description of the Core Life Model * Methods * Results * Conclusions/Discussion * Bibliography * Definitions Abstract This paper documents a research effort in the field of artificial life. I demonstrate the evolution of predatory behavior in computer programs. The game Core Wars provides an environment for fitness testing, with evolution implemented by a genetic algorithm. I refer to the union of GAs and Core Wars as the Core Life environment. Results presented here show how random code can evolve into successful Core War programs in only a few generations. _________________________________________________________________ Introduction A primary goal of the field of Artificial Life (or A-life) is to provide alternative kinds of life to study, in order to better understand life in general. To quote Chris Langton, "We would like to build models that are so lifelike that they would cease to be models of life and become examples of life themselves." Yet in most of the computer simulations done in this field, a primary technique is to model some feature of a biological life form, like an ant or a crayfish Some of these experiments generate useful results, like being able to evolve trail following behavior (the ants) or confirming theories of pre-adaptation (the crayfish), but they skirt the goal of providing alternative types of life. The problem is that ants and crayfish live in a world (the biosphere) that is very different from the world in which the simulations are conducted. For example, ants follow trails of pheromones, though computers and programs have no sense of smell. Likewise, an experiment that confirms our beliefs about neural patterns involved in the tail-flip of a crayfish is interesting, though not entirely useful, since computers have no tail to flip. If we truly wish to study models of computer life, it would behoove us to limit our studies to phenomena which actually occur within a computing environment. The domain of binary biology (henceforth, binology) has enough interesting properties that there is no real need to go outside of its bounds to provide thought provoking A-life experiments. Processes are born, move, reproduce, communicate, die, and do many other tasks normally associated with biological life forms, so we needn't impose features on them that they don't really possess. Computer viruses and worms demonstrate the abilty of programmed processes to assume characteristics of biological life. In fact, these programs are sufficiently lifelike that most computer users take precautions against them, just as they would against a biological parasite. This raises an important issue in the field of binology. If we are to study processes which have the potential to run amok (and unexpected behavior is something we often hope for), we must be careful not to affect other processes running on the same machines as binological experiments. Regardless of his intent, Robert T. Morris, Jr. suffered the consequences of allowing an experiment (the infamous Internet Worm) to escape his control. A reasonable solution is to run binological experiments within a controlled virtual environment which is representative of real computational environments. Core Life is one such environment. _________________________________________________________________ Description of the Core Life Model Core Life is a marriage of two well-founded paradigms, Core War and Genetic Algorithms. Core War, which was introduced by A.K. Dewdney in his Computer Recreations column, is at heart a game for computer programmers. It consists of two distinct levels of software. One level is an emulator of the hypothetical Core War computer. The language of this faux machine is a basic assembly language called Redcode (see Figure 1). The software which emulates this computer is referred to as a MARS, for Memory Array Redcode Simulator. Figure 1: The Redcode instruction set Instructions DAT A, B - a data value (non-executable) MOV A, B - move data from A to B ADD A, B - add A to B and store in the B location SUB A, B - subtract A from B and store in the B location CMP A, B - if A is equal to B, then skip the next instruction SLT A, B - if A is less than B, then skip the next instruction JMP A, B - unconditional jump to A JMZ A, B - jump to A if B is equal to zero JMN A, B - jump to A if B is not equal to zero SPL A, B - split into two processes (at A and the next instruction) DJN A, B - decrement B, then jump to A if B is not equal to 0 Addressing Modes # - immediate $ - direct @ - indirect < - indirect with predecrement The other level of software consists of Redcode programs written by the player-programmers. Programs at this level are generally called warriors, but this term is more appropriately applied to the processes caused by the execution of these programs. The game consists of placing the code for two (or possibly more) warriors in a MARS, and executing them simultaneously. Warrior performance is measured in terms of computational longevity, so the dual goal of a warrior in a game of Core War is to continue to execute while at the same time causing opponent warriors to cease execution. Thus in Core War, player-programmers vicariously test their skills against each other through the behavior of their programs. The behavior of warriors can actually be quite complex, and their strategies quite varied. Warriors can see the entire MARS (since the range of the data fields of the instruction set is generally larger than the size of the MARS), so it is possible to look for an opponent rather than just bomb blindly. Warriors can make copies of their code and jump to the copy (thus affecting movement), or split to the copy (thus achieving something like cell division). By splitting one or more times, a warrior can divide its processing into cooperative tasks. The best warriors in Core Wars tournaments have displayed such complex behaviors as self-repair, mimicry, and trap-setting, all of which are predatory strategies used by biological animals. The analogy between warriors and biological life forms is fairly transparent. The code for a warrior is it's genotype, and the process created by executing that code in a MARS is its phenotype. Biological life forms survive by successfully competing for food and other resources, much as warriors survive by successfully competing for computational resources. This analogy isn't hampered by the inclusion of properties which aren't computationally feasible. Redcode is a basic assembly language, consisting of a small set of instructions and memory access modes. All data references are to memory locations, and the total size of the memory is quite restricted. The hypothetical Core War computer is thus a simple machine which is not unlike existing computers. It is reasonable to assume that when binological life forms evolve naturally (i.e. without intentional human intervention), their evolution will take place in an environment quite like that emulated by a MARS. While Core Wars provides the environment in which warriors compete, a genetic algorithm is used to reproduce successful warriors and keep tabs on how the population as a whole is progressing. The goal of my experiments is to start with random code and evolve complex behavior. In this way I can study both the nature of evolutionary change and the form of binological behavior which is likely to evolve. A major distinction between my use of Core Wars and that of other A-Life researchers is in this view of the MARS as only a part of the binological environment. Other experiments concentrate on Core Wars as the complete environment, and more closely resemble primordial soup experiments than population genetics experiments Due to the diverse behavior that warriors are capable of, there were many possible complex behaviors for us to choose to attempt to evolve. I chose predation because it is perhaps the most complex behavior which warriors are capable of. In fact, good warriors usually incorporate some other tasks like survival or movement as part of their predatory scheme. Another reason for the choice of predation is the fact that it is easy to fitness test for. The concept of a battle between two warriors is well defined, and easy to score for fitness testing. In tournament play, warriors are scored three points for a win, 1 point for a tie, and no points for a loss. I used the same scoring for fitness testing in my experiments. Testing for movement would be slightly harder since the evolutionary steps towards that behavior are probably subtler. Another benefit of evolving predation is that the evolved warriors can be entered into the international tournament, thus providing an ultimate test of their fitness. _________________________________________________________________ Methods I actually used two methods to fitness test warriors. In the initial experiments, all of the warriors of a population battled against common opponents. These opponents were warriors (written by humans) which had been entered in previous Core Wars tournaments. Both successful and unsuccessful warriors were chosen so that progress might be seen in the form of the genetic warriors defeating better and better opponents. Each warrior in a generation fought the same number of battles against the same opponents, but with random starting locations for each battle. In this way average warriors could still score highly if they happened to get advantageous starting positions, and thus allowed to reproduce. (I felt that this normalization factor emulates the real world very well, since the very strong and the very smart aren't always the most successful organisms) This method of fitness testing turned out to be very successful, but perhaps unrealistic since the opponents provided a sort of evolutionary goal. This is not to say that biological life forms never deal with man-made antagonists, but rather that the ideal competition for evolving life forms are its sibling life forms. Thus I derived the other mode of fitness testing, whereby each warrior battled against its neighboring warriors. This eliminated the artificial goal, although there were still the inherent goals of survival and reproduction. Since success in the MARS was tied to reproduction, I predicted that I would still see evolutionary progress toward predatory behavior, but at a slower rate, since the motivating factors weren't quite as overt. For clarity, I will refer to these sets of experiments as phase~1 (human-generated opponents) and phase~2 (sibling opponents). Most of the methods were common between the two phases. In my experiments, genetic manipulation was done at two levels: crossover at the instruction level, and mutation at the bit level. Mutation was done at the bit level to assure that the entire space of possible warrior programs was reachable. Crossover is a much more frequent operation, however, and Redcode assembly language isn't particularly robust (for example it has a 4 bit opcode field, but only 10 legal instructions). I felt that if crossover was done at the bit level, there would be too high a percentage of non-functional warriors in each generation. By performing crossover at the instruction level, it was still possible to produce non-functional warriors with this genetic operator, but not invalid instructions. In a way, instructions make more sense as genes than bits, since they carry more information. In each generation, the entire population was replaced by reproducing with a genetic algorithm. A biased roulette wheel method was used to select parents from the champions of the previous generation, with each warrior given a percent chance of reproduction corresponding to its total fitness score. In phase~1, each mating of two warriors produced two progeny in the following manner: 1. First, instructions are copied one-by-one from one of the parents to one of the children. As each instruction is copied, there is a random chance (.05\%) that mutation is performed. Mutation consists of flipping one of the 40 bits in the instruction (randomly selected). After each instruction is copied there is a random chance (25\%) of a crossover. 2. After the first crossover occurs, instructions are copied from the other parent to the other child. Again, each instruction is tested for mutation and after each instruction is copied there is a random chance of crossover. 3. After the next crossover, instructions are copied from the second parent to the first child, one-by-one, with mutation and crossover tested for as before. 4. Next, instructions are copied from the first parent to the second child, with mutation and crossover tested for. When crossover is chosen at this point, it goes back to the first parent being copied to the first child. 5. This process continues until all of the instructions from both of the parents are copied to the children. Finally, the lengths of the two children are determined and a random starting address is selected for each. This method allowed multiple crossovers at random locations in the code. Since the maximum warrior length I allowed was 64 instructions, I would expect an everage of about 64 x .25 = 16 crossovers per mating in the longest warriors. The reason I used such a high crossover rate is strictly computational. Due to space limitations, my population sizes were 1000 warriors per generation for phase 1, and 900 per generation for phase 2. Since the maximum warrior size for these experiments was 64 instructions, and Redcode machine instructions are 40 bits long, the maximum number of bits per warrior was 2560. This means the size of the space being searched is 2^2560 (about 4.33 x 10^769), which makes the population size seem pitifully small. I wanted something to stir up the evolutionary process, and I felt it was worth sacrificing genetic stability if it enabled faster evolutionary progress. Theoretically, I could reduce the rate and run the experiments for much longer and get similar results. For a more interesting experiment, I could effect a sort of `evolutionary annealing' by starting with high crossover and mutation rates and reducing them over time. On a similar note, my method of crossover is very messy, since positions of crossover and the lengths of the child processes can vary widely. From a probabilistic sense however, this is no great drawback, since it would predict that the warriors would stay in a moderate range of length and would each get some genes from both parents. In fact, it may have some benefit, since it tends to distribute the instructions that are present throughout their possible positions in the warriors. A caveat is that there is very little genetic stability, since even identical parents could produce very different children. Populations for phase 2 experiments consisted of 900 warriors, arranged in a 30 by 30 array. The neighborhood for a warrior was defined as the 8 warriors immediately surrounding it in the array. For example, warrior (i,j)'s neighborhood looks like this: ------------------------------------------------- | | | | | (i-1,j-1) | (i-1,j) | (i-1,j+1) | | | | | ------------------------------------------------- | | | | | (i,j-1) | (i,j) | (i,j+1) | | | | | ------------------------------------------------- | | | | | (i+1,j-1) | (i+1,j) | (i+1,j+1) | | | | | ------------------------------------------------- The edges of the space wrap around in a torus, so warrior (1,15) has warriors (30,14), (30,15), and (30,16) as three of its neighbors. Reproduction for phase 2 experiments uses the same basic algorithm, but only one child is produced from each pairing, and the code that would have been the other child is thrown away. The first parent to be copied from is selected randomly. Prospective parents for child (i,j) are chosen from the warriors in the neighborhood of (i,j) from the previous generation, rather than from the whole population. This means that there is no global homunculus, but rather all interaction is local. _________________________________________________________________ Results In each phase, I conducted several experiments of 10 to 15 generations. Results of a typical run of a phase 1 experiment are presented in table 1. These experiments indicate that it is possible to evolve predatory strategies, even in small populations and in a small number of generations. The data shows that the number of wins and ties by a whole population of warriors generally increase with each generation. The ratio of wins-per-battle-fought is proof that the predatory skills of each generation is better than the last. In some cases, the number of wins increased at the cost of the number of ties. This would seem to indicate that survival and predation are competing strategies in Core Life, which is realistic since any sort of attacking behavior has a chance of backfiring on the attacker, be it biological or binological. A related issue in biology is the fight or flight response. Table 1: Results from one phase I experiment. (WANG1 and QUARTER2 were two of the human coded programs used as opponents for fitness testing. The other 3 columns refer to battles fought against all opponents by a particular generation) wins vs. wins vs. wins per ties per multi-battle Generation QUARTER2 WANG1 battle fought battle fought winners ---------- -------- -------- ------------- ------------- ------------ 0 2 3 0.2 % 4.65 % 1 1 19 10 1.83 % 5.1 % 9 2 59 5 4.86 % 8.65 % 41 3 179 16 6.92 % 6.02 % 76 4 258 19 11.2 % 8.68 % 170 5 332 28 12.3 % 5.2 % 195 6 401 28 17.6 % 4.92 % 292 7 420 22 8.5 % 12.3 % 148 8 480 23 20.96 % 5.38 % 363 9 451 16 18.66 % 9.72 % 300 10 549 24 21.1 % 4.88 % 385 The fitness testing for phase 1 experiments was done using a commercial MARS which graphically displayed the battles during fitness testing. From observing these battles, it was apparent that a sort of speciation occurred. Warriors evolved a variety of strategies for survival, including bombing through memory with DAT (or some other) instructions, infinitely looping on a single instruction (this was more a survival strategy than a predatory strategy), and splitting to arbitrary locations in the arena in an attempt to branch into the opponent warrior's code (a form of mimicry). One complex strategy which evolved was a species which would bomb with imps (an imp consists of the single instruction `MOV $0, $1'. This instruction, if executed by a process, will copy itself to the next location in memory. Then on the next clock cycle, the process will execute that copy and move itself ahead again, and so on.}, and then kill the imps when they reached its code. The fact that a cooperative process like this could evolve in so few generations is fascinating. Unfortunately, they were also extinct in a few generations, either because of the strong evolutionary force of the high crossover rate, or because their strategy, although elegant, wasn't effective. Two second generation warriors of a pilot version of this project were entered in the 1989 International Core Wars Tournament, and they came in second to last and third to last (the last place finisher had a bug in his program). Two of the best warriors (one sixth generation and one tenth generation) from an early run of phase 1 were entered in the 1990 ICWT and did considerably better, finishing 12th and 16th out of 26 finalists. The fact that these warriors were able to beat human- coded opponents is impressive, the additional fact that they did so in only a few generations is suggestive. A properly controlled experiment which ran for hundreds, or even thousands of generations, should stand a good chance of winning the tournament. Granted, this is not nearly on a par with beating Karpov at chess, but still it would be an impressive notch on the genetic algorithm's belt. The first runs of phase 2 experiments are currently being generated and analyzed. Since they weren't fitness tested against human-coded warriors as in phase 1, the fitness scores aren't as valid an indicator of their progress. As an initial analysis technique, I've taken some of the champion warriors from each generation and run them against some of the same warriors that were used in fitness testing in phase 1. Unfortunately, I didn't get this data ready in time for presentation in this initial report, but I can comment on observed results. The progress isn't nearly as steady as in the phase 1 experiments. Two factors could account for this. First, all phase 2 warriors which get any fitness score other than 0 have a chance to reproduce, unlike phase~1 where only the top 10% of the warriors were allowed to reproduce. Second, the data from phase 1 accounts for all of the warriors in each generation, whereas the scores from phase 2 only account for a few warriors which happened to get the highest fitness scores. Thus, in phase 1, there were more chances for mediocre warriors to win or tie through advantageous starting locations. It is important to note that despite these handicaps, some survival and predation strategies still evolved. I feel that this supports the validity of my conclusion that binological predation is an evolvable phenomenon. I am hoping that warriors of future generations of phase 2 experiments will reach (and hopefully surpass) the complexity level of phase 1 warriors. Two pieces of data which I am collecting in phase 2 experiments which I ignored in phase 1 experiments are the fitness scores for each warrior and the average battle length for each warrior. I feel that this data may be useful in analyzing the dynamics of evolutionary change over the whole population of warriors. For example, GAs have a tendency to find stable points, where much of the population becomes fairly homogeneous. At this point, I would expect a lot of ties, so the fitness scores should be lower, and more evenly spread out. When one warrior gains some new ability, through either mutation or crossover, I would expect its score to be much higher for a few generations until it could spread this feature through reproduction. Next, I would expect this new feature to slowly ripple throughout the space until it was fully distributed. The interesting point to look for is initial spike in the fitness landscape, as this indicates where the productive mutation came about. _________________________________________________________________ Conclusions/Discussion There are a wide range of possibilities for further experiments on this subject. Currently, I am working on the suggested fitness landscape maps, and also planning to do several other experiments. The obvious ones that come to mind are variations in the basic parameters of the system like mutation and crossover rates, size of population, number of generations, neighborhood structure, and methods of generating the initial random warriors. A more ambitious project would be to graph family trees of warriors, thus tracing their genetic progress. A slightly further deviation from this project would be to evolve programs for other tasks and in other instruction sets, and in fact this has been done (more or less), but not on tasks that are distinctly binological in nature. _________________________________________________________________ Bibliography William R. Buckley A Core War Model of Computational Evolution, International Society for the Systems Sciences, Proceedings of the 34th Annual Meeting, Portland, Oregon, July 8-13, 1990. Collins, Robert J. Tracker. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990. Dawkins, Richard The Selfish Gene. Oxford University Press, Oxford, 1976. David J. Depew & Bruce H Weber (eds.) Evolution at a Crossroads. The MIT Press, Cambridge, 1985. Dewdney, A.K. In the Game Called Core War Hostile Programs Engage in a Battle of Bits. From Scientific American, May, 1984. Goldberg, David Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, New York, 1989. Holland, John H. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, 1975. Langton, Chris Studying Artificial Life with Cellular Automata. From Physica D, pp 120-149, 1987. Perry, John R. Predatory Programming. From The Core Wars Newsletter, Fall, 1988. Rasmussen, Steen The Coreworld: Emergence and Evolution of Cooperative Structures in a Computational Chemistry. From Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990. Stork, David G. Non-Optimality via Preadaptation in Simple Neural Systems. From Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990. ??? Communications of the ACM Vol 32, Num 6, June 1989 The Core War Standard of 1988 International Core Wars Society, 1988. _________________________________________________________________ Definitions Binology The study of computer-based complex systems which demonstrate lifelike characteristics. Computer Virus A computer virus is a program which effects its distribution by attaching itself to the executable code of another program, called a vector program.