Let’s back up a step. First of all, what do we want in the user experience?
1. We want an engaging game experience. Draw in the user quickly, and then keep them enthralled. This means things need to evolve rapidly into something interesting.
2. A rich environment. With all we are planning, this should be no problem. I see the main problem being the balance between functionality and complexity. We don’t want to bog the game down to the point that it is slow and uncooperative. I think we can achieve this delicate balance through abstraction, while making it richer at the same time.
3. Evolutionary Computation. We want this to work. We want realism. That’s where the entertainment comes in. We want letters to be floating around in Letter Land, colliding with each other, spawning populations of new letters, mutating, reproducing, overcoming obstacles, eating, living, and fighting for survival as realistically as possible.
Now the details, as I see them. To make all of this work, we need three layers (so far):
1. The Cellular Layer – This deals with the letters on a cellular scale. Binary-bit “alleles” make up the chromosomes (strings) and ultimately the letter genomes (2D arrays). The letter library, the comparator and the schemata all operate at this level, driven by our Genetic Algorithm. Keep in mind, the body cannot live without its cells: this is where the cells operate on a micro level. The user can only see this by clicking to view a detailed history of any given genome (probably an XML or text file). This is where we’ve been doing all our work so far, completing Phase I of the project. So far, so good.
2. The Physics Layer – This deals with the letters on a population scale. The Physics Layer is special because this is where the abstraction exists. The letters take on a new form as physics bodies governed by physical laws and shapes/geometries instead of genomes full of binary chromosomes. The Physics Layer contains the full map of Letter Land. Herein lies the core of the visible evolutionary processes (Genetic Programming, Neural Networks?). This is where we are now (Phase II), and this is where the problems crop up.
3. The Window Layer – This deals with the letters on a body scale. The window is a “chunk” of the globe of Letter Land, where the user zooms in on the action. Essentially, this window or magnifying glass is the user interface to Alphabet Soup. Any user interaction we decide to include (implementing new obstacles or food, for instance) will happen here.
DM = Decision Maker; DGP = Decision-making Genetic Program; FFGP = Fitness Function Genetic Program; GA = Genetic Algorithm; EGP = Environmental Genetic Program
With all that in mind, let’s break down our problems, keeping the “functionality versus complexity versus realism” issue at the forefront.
Problem 1: Decision-making. Two letters, a C and an R, are about to run into each other. What happens? The sensing organs of both letters realize a collision is imminent and call on what we’ve proposed as the DGP (Decision-making Genetic Program). Its options are simple: Breed, Flee, or Fight.
– Whenever Breed is chosen, the GA at the Cellular Level is called and run. This has a direct impact on the appearance (therefore, geometry and physics) of the letter at the Physics/Window Layer.
– Whenever Flee is selected, a given amount of thrust is applied to the letter to get it away from the opposing letter.
– When Fight is selected, the letter will use whatever is at its disposal to try to kill (?) the other letter. Think: war over territories.
The DGP is only used for this about-to-collide decision-making procedure. I figured on using the EGP (discussed next) to make any other health/fitness related decisions at the Physics Layer. The reason I chose GP for this decision scenario is that I was thinking of the ant-finding-food problem, where the ant’s only options were MOVE, RIGHT, and LEFT, a problem that could actually just as easily be solved with a GA. Here’s the thing: this “Breed, Flee, Fight” scenario is even simpler than that, which raises the question: does this step need to involve Genetic Programming at all? It is a simple choice between three options, and it is very localized to the exact moment when the sensing organ calls on it.
So my argument is to do away with the DGP, instead allowing this decision to be completely (wince) random, or something else entirely, such as Fuzzy Logic or Neural Networks, although that would involve a different approach taking into account all of the decisions at one time. The only way I see GP being beneficial here is if it were somehow tied into how the decision affects the population as a whole. But here’s my take on that: we do want complexity, yes, but using a GP here is unnecessary and overly complicated when we could be using the EGP instead to cause the complex, exciting evolution to take place in the Physics Layer.
But I have an even better idea. One way to keep the decision from being random while still not requiring too much processing is to give it a set of circumstances, such as: “if R population is greater than 100 and R population is declining, avoid; else attack.” A few simple lines of code like that will let the about-to-collide letter make fast, simple, but smart decisions. I think this is the ideal way to do it.
To use the if/then approach, the Decision Maker must have access to information about the current stats of the population, yes, but they do not have to be drawn from the EGP. Instead, I propose we have a pool of stats about the entire population of letters through the use of a number of variables, then the various functions (EGP, Decision-maker, and so on) access the data and use it as they see fit. This object oriented approach modularizes the information and makes the most sense to me.
Any of these ideas would keep these simple but constantly occurring collision decisions from using so much processing power unnecessarily, thus solving the “too slow” problem. And realism is maintained: simply through the if/then statements, and more thoroughly through the use of the EGP (Problem 2).
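The if/then Decision Maker reading from a shared stats pool might look something like the sketch below. This is only an illustration: every name in it (PopulationStats, decide, the thresholds) is hypothetical, not part of the project’s actual code.

```python
# A minimal sketch of the proposed if/then Decision Maker, reading from a
# shared pool of population statistics that any function (EGP, DM, etc.) can
# access. All names and thresholds here are hypothetical.

class PopulationStats:
    """Shared pool of stats about the whole letter population."""
    def __init__(self):
        self.counts = {}   # letter -> current population count
        self.trend = {}    # letter -> change in count since last update

    def update(self, letter, count):
        self.trend[letter] = count - self.counts.get(letter, count)
        self.counts[letter] = count

def decide(me, opponent, stats):
    """Fast rule-based choice among BREED, FLEE, FIGHT."""
    opp_count = stats.counts.get(opponent, 0)
    opp_trend = stats.trend.get(opponent, 0)
    # "if R population is greater than 100 and declining, avoid; else attack"
    if opp_count > 100 and opp_trend < 0:
        return "FLEE"
    if stats.counts.get(me, 0) < 50:   # small own population: favor breeding
        return "BREED"
    return "FIGHT"

stats = PopulationStats()
stats.update("R", 120)   # first sighting: trend defaults to 0
stats.update("R", 110)   # now declining (trend = -10)
stats.update("C", 40)
print(decide("C", "R", stats))   # -> FLEE
```

Each collision then costs a couple of dictionary lookups instead of a GP evaluation, which is the whole point of the argument above.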
Problem 2: How do we make the Environmental Genetic Program (EGP) function? Well, I think this will play itself out. As long as we keep the pool of variables available to any function, the key to a good EGP will be to assign it a good fitness function determinable by the variables it accesses. This will likely include the physical attributes of the individuals as well as the population as a whole. I think the EGP will be the core to making the game an enjoyable experience, so we need to do it right and think it through. I think we should keep the other facets of EC relatively simple throughout the program, and focus all of the complexity here. And no, we are not there yet.
Problem 3: How are we going to tie the layers together? I saved this one for last because to me it is the most complicated problem to figure out. We have a vague idea for the FFGP (Fitness Function GP), which will work as follows:
Letter ‘C’ has to get around a flying asteroid, but its friction causes it to slow down to the point where the asteroid is going to collide and destroy it. It needs to evolve to be able to evade, so it grows a wall on its right side, thus becoming an ‘O’. This in itself is a problem: How do its genes, being abstractly separate from the physical body, know that the letter is now an ‘O’ and no longer a ‘C’? We’ve proposed using the FFGP to evolve a fitness function at the GA (Cellular) Layer for the GA to evolve to meet the needs of the Physics Layer. This is all still blurry to me.
I think it would be logical that if the C evolves to an O in the Physics realm, this information should be sent to the Cellular Level, matched up to the schemata, and transformed to an O. To me this is still sufficiently complex and realistic, as the EGP at the Physics Level will still be performing the evolution to the different letter, while requiring the GA to “catch up.” The two layers have to work together somehow; otherwise, what’s the point of having both layers? Or maybe the Cellular Level can have access to the same data as the Physics Level, and read from the EGP to determine what is going on and how to evolve, but then where is the abstraction? And that all seems too process-intensive anyway. This is a point of contention.
The other tie-in between layers is the thread between the Physics Layer and the Window Layer (User Interface). This involves the physics processes constantly going on behind the scenes at the Physics Layer, while using the Window Layer to just focus in on wherever the user wants to look in the population. We will likely not even get to this problem in phase II because we will be starting with a small enough population to watch it play out on a single screen.
These are all good problems, because they show we are making progress. We should start simple at each layer, then add layers of complexity as we get the pieces to work. Next I want to focus on getting a working Physics Layer with Blend and Physics Helper 3. Once we do that, we can deal with the problem of combining it all with the Cellular Layer.
What we have so far:
Having a good definition of fitness is going to be crucial to our success with Alphabet Soup, and any of our games for that matter. So far we have only one fitness factor, a rather rudimentary “meatiness” goal: the more 1s, the better.
Aiming for the maximum fitness of 1, we define an ideally fit chromosome of 16 binary “alleles” as 16/16 = 1, or:
chromosome = 1111111111111111 =
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 = 16/16 = 1 = 100% fit.
Anything less than ideal is given a fitness value such as 13/16 = 0.81, or:
chromosome = 1111110111011101 =
1+1+1+1+1+1+0+1+1+1+0+1+1+1+0+1 = 13/16 = 0.81 = 81% fit.
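The “meatiness” measure is a one-liner to sketch in code, reproducing both examples above:

```python
def meatiness(chromosome):
    """Fitness = fraction of 1s in a binary chromosome string."""
    return chromosome.count("1") / len(chromosome)

print(meatiness("1111111111111111"))   # -> 1.0    (16/16, 100% fit)
print(meatiness("1111110111011101"))   # -> 0.8125 (13/16, ~81% fit)
```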
This is all very simple. Too simple. But it’s a start. The question is, where do we go from here?
The next step will be to include the schemata for the letters. Fitness will be a combined value of the “meatiness” and the “likeness” defined by a comparator function. Here is how it will all work:
First we will have the letter genomes in the Binary Letter Library broken down and defined into the final schemata:
charLine = 0*****0000000000
(and similarly for lines 1 thru 14)
Where there is a “0”, the value is always zero in any letter in our population. Why calculate that if it is never going to change? The remaining values, represented by asterisks, will be either 1s or 0s, depending on the letter. The results of the GA generations will be run through the schemata, with the remaining asterisked positions matched up against the letters in the library. Matching to the nearest letter will be easy. Say there are 125 asterisked positions. After the first generation of the GA takes place, the fittest-so-far (meatiest) individual is passed through the schemata. All zero-only positions are dropped immediately, and all asterisked “either-or” positions are evaluated. If an asterisked position holds a ‘1’, it is matched against all letters with a ‘1’ in that position, and so on. The genome will then take the shape of the letter with the most matching values.
An important technical aspect of this that we have agreed on is to have the comparator only change the way the letters are displayed, while leaving their array intact–aside from the schemata. In other words, the original GA-defined chromosome/genome is maintained, while a copy of itself is stripped to be matched against the schemata. This allows it to maintain its evolutionary integrity with a level of abstraction so more fitness functions can be performed. I will deal more with this later.
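A sketch of how the schemata match might work in code. The schema line, the two library entries, and all function names below are made-up examples for illustration, not our real Binary Letter Library:

```python
# Matching a GA individual to the nearest library letter through a schema.
# '0' in the schema means "always zero"; '*' means "either 1 or 0".

def apply_schema(bits, schema):
    """Force every fixed-'0' schema position to 0; '*' positions pass through."""
    return "".join("0" if s == "0" else b for b, s in zip(bits, schema))

def likeness(bits, letter_bits, schema):
    """Count matching bits, skipping positions the schema fixes at 0."""
    return sum(1 for b, l, s in zip(bits, letter_bits, schema)
               if s == "*" and b == l)

def nearest_letter(bits, schema, library):
    """The letter whose library line has the most matching bits."""
    return max(library, key=lambda name: likeness(bits, library[name], schema))

library = {"A": "0000010001000000",   # hypothetical 16-bit library lines
           "I": "0000001000000000"}
schema = "00*****0******00"
ind = apply_schema("0000101011101110", schema)
print(nearest_letter(ind, schema, library))   # -> I (5 matches vs. 4 for A)
```

Note that, per the agreement above, this only produces a display match; the original GA-defined genome stays intact.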
So how do we define the new fitness value, with both the schemata (likeness) and meatiness in mind? Here is a 5-step process that should do the job:
1. Take the value of a current-generation individual. Let’s use, say, genInd = 0000101011101110
2. Pass through schemata. Let’s use generic letter line 5, which might be 00*****0******00, making genInd now equal 0000101011101100
(the only bit that changes is the second-to-last one, because based on the schema that bit always equals 0).
3. Define “likeness value.” For simplicity’s sake let’s use charAline, which equals 0000010001000000. For each bit of genInd that matches charAline, add 1:
So, comparing 0000101011101100 (genInd) to 0000010001000000 (charAline), and disregarding schemata positions that are always 0, we find only 4 bits that match, giving genInd a “likeness” value of 4 as compared to charAline.
4. Define the “meatiness value.” To incorporate the “meatiness” factor, let’s add an additional +1 for each matching position that is actually a “1” (meat). When we add up all the matching 1s between genInd (0000101011101100) and charAline (0000010001000000), we get only +1. So, for this particular example, the meatiness value = 1.
5. Add “likeness value” and “meatiness value.” To get the total fitness value, add the likeness value (4) to the meatiness value (1) for a total value of 5.
Note: since this schemata line has 5 characters that are always equal to 0 (shown here: 00*****0******00), they are subtracted from the maximum fitness amount. So, the maximum fitness of this line is 16-5 = 11 possible values that match (“likeness” value), plus the possibility that all of those values are 1s (“meatiness” value), making the total maximum fitness of this line equal to:
11 (max “likeness”) + 11 (max “meatiness”) = 22. So in our example,
4 (likeness) + 1 (meatiness) = 5 was not all that great (5/22 = 23%). Other letters in the alphabet will likely have much higher fitness on this line.
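The whole five-step process fits in a short function. This is just a sketch that reproduces the worked example above (schema 00*****0******00, charAline = 0000010001000000); the function names are mine:

```python
def apply_schema(bits, schema):
    """Step 2: force every fixed-'0' schema position to 0."""
    return "".join("0" if s == "0" else b for b, s in zip(bits, schema))

def fitness(gen_ind, char_line, schema):
    gen_ind = apply_schema(gen_ind, schema)
    # Only the asterisked ("either-or") positions count toward fitness.
    pairs = [(g, c) for g, c, s in zip(gen_ind, char_line, schema) if s == "*"]
    like = sum(1 for g, c in pairs if g == c)           # step 3: likeness
    meat = sum(1 for g, c in pairs if g == c == "1")    # step 4: meatiness
    max_fit = 2 * len(pairs)                            # here: 11 + 11 = 22
    return like + meat, max_fit                         # step 5: total

total, max_fit = fitness("0000101011101110", "0000010001000000",
                         "00*****0******00")
print(total, max_fit)   # -> 5 22  (4 likeness + 1 meatiness, out of 22)
```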
I think this will be a good way to determine total fitness–at least in early versions of Alphabet Soup.
All technicalities aside, this is where Alphabet Soup can get really interesting. This will probably be around… version 3.0 or so. Right now, these are just vague ideas.
Fitness based on letter shape was the original goal of Alphabet Soup all along:
“Hark! Here comes a second generation character, ‘i’, who has lost the vestigial tittle (the dot). Most other letters are unchanged or changed in a negative or neutral way. The lower mass of the ‘i’ makes it faster and take in less calories. Alas, it’s lower weight makes it do less damage and is losing ground to the dastardly ‘T’s and their hearty top most shield. Will the character genes be favorable enough to carry on or will they be lost and another random variation will make itself dominant. At the front of ‘i’s’ and ‘T’s’ war, a fork occurs with the ‘w’s. The ‘w’s strike fast coming down but their double point distributes the impacts, lessoning the effect on harder shelled letters like the ‘T’.”
With this, things like food, environment and the ability to reproduce/fight/defend are all important.
Another idea is this: fitness is dynamic, determined by the players. They change the landscape–by adding obstacles, giving new goals (such as winning a race, building a tower, et cetera). Different letters will be more suited to different tasks, so fitness will vary from task to task. This might keep it more interesting than if fitness was just based on one thing (like meatiness or shape). Also, on the note of shape mattering–maybe shape won’t matter as much as far as fitness value is concerned, however shape may be important for movement in Farseer if we can get the physics of it right. So… if the goal is to get over a mountain or something, the shape and physics will be important to the letter’s success, indirectly having an effect on the fitness anyway.
A good comparison is to think of how to keep a Sim alive: by balancing a variety of values that are important to the character’s success at staying alive, such as social value, food, health, activity, recreation, knowledge, and so on. Any of these kinds of ideas could be incorporated into the fitness value to make it more interesting and keep the player more involved in what is happening in the Soup. One important key to this is to allow the Evolutionary Computation to do its thing without interference. If we can hone that delicate balance between the EC and the player-letter interaction, Alphabet Soup could be great.
The future awesomeness of Alphabet Soup depends on our ability to come up with more fitness ideas, and to make them actually work.
We have talked a lot about including sensing organs for letters to be able to interact with their environment on a biological level. Also in discussion are color values which may or may not be related to fitness, sensing ability, or overall health of a letter. We will have to deal more with these ideas and see what we can agree on.
One thing we have reached consensus on is the level of abstraction that will be present in our final product. Overall, we are going to let the GA and GP do their things in a very precise way, keeping the Evolutionary Computation aspect as pure as possible–but this will all be going on behind the scenes. The precision in the background will be modified and “rounded” (such as with the schemata) before reaching the eyes of the player. This will allow the letters to maintain their forms, allow the Physics Engine to operate on fixed shapes, and so on. However there will likely be a more abstract level of EC going on as well, such as with the shapes present in the physics engine and how they react to one another. The more levels of EC operating within the game, the more rich the environment and the more capable it will be of surprising us.
But we don’t want the abstraction to completely cover up what is going on behind the scenes.
One way to unite the abstraction with the player’s awareness of what is going on is through keeping a detailed history of each genome in the population. We should somehow tie the chromosomes to their respective genome so that the genome holds its identity and evolutionary lifespan throughout its own history. Maybe stamp each chromosome with some sort of hidden ID assigning it to its current genome. These generational histories will provide the owner of the letter with information showing how old the letter is (how many generations it has survived) and each detailed transformation it has undergone during its evolution to its current state. This could be useful for farming, trading or even–who knows–combat? Dog-fighting is illegal, but how about pitting our letters against each other? There are all kinds of scenarios and places we could go with Alphabet Soup. And as the creators of Dwarf Fortress believe, the more ideas we integrate, the better the game might become.
(The following is a synopsis of chapters 5 and 6 of John R. Koza’s “Genetic Programming”)
Genetic Programming is derived from Genetic Algorithms, and in many ways involves the same process of evolutionary natural selection. The core difference between Genetic Algorithms and Genetic Programming is the program structure. Genetic Algorithms (GAs) operate on fixed length binary strings representing a biological chromosome, while Genetic Programming (GP) instead operates on dynamic hierarchical programs. While in GA, the size and shape of a binary string are fixed, GP programs vary in size and shape. This makes GP programs more complex and, in many ways, more powerful.
The more complex and dynamic program structure of GP results in a cascade of other differences which separate it from GA. Because of this, some problems may be better suited to one method or the other. Generally the more complex the problem or the larger the search space for a solution, the more likely the problem should be solved by GP.
GP solves problems by breeding computer programs, then using reproduction and crossover in iterations through generations of programs, until a solution to the problem is found or some other set parameter is satisfied (such as number of generations run).
Koza’s GP, Figure 5.1
Koza references Holland’s (1975) list of key features of all adaptive systems. For GP to operate successfully, the following must be defined:
· Initial Structures
· Fitness Measures
· Genetic Processes
· Program Memory
· Termination Criterion
· Designating a result
GP programs are best represented by expression trees, which are hierarchies containing functions and terminals.
The above diagram represents a program string along the lines of this pseudocode: if (x < 4) then t else r.
Any of the functions (if, <, then, else) or terminals (x, 4, t, r) will be subject to change as GP progresses through generations.
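One simple way to represent and evaluate such an expression tree in code is with nested tuples, using plain strings and numbers as terminals. The tree below is only my guess at the expression in the diagram, based on the functions and terminals listed above:

```python
# Evaluate a nested-tuple expression tree: (function, arg, arg, ...),
# with bare strings/numbers as terminals looked up in an environment.

def evaluate(tree, env):
    """Recursively evaluate a nested-tuple expression tree."""
    if not isinstance(tree, tuple):            # terminal: variable or constant
        return env.get(tree, tree)
    op, *args = tree
    if op == "if":
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)
    if op == "<":
        left, right = args
        return evaluate(left, env) < evaluate(right, env)
    raise ValueError(f"unknown function: {op}")

tree = ("if", ("<", "x", 4), "t", "r")         # if (x < 4) then t else r
print(evaluate(tree, {"x": 2, "t": "turn", "r": "run"}))   # -> turn
print(evaluate(tree, {"x": 9, "t": "turn", "r": "run"}))   # -> run
```

Because the tree is just data, crossover and mutation can rearrange it freely, which is exactly what makes GP programs vary in size and shape.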
It is important to select a sufficient set of primitive functions for a problem. For instance, with an ant trying to maneuver through a maze for food, MOVE, RIGHT, and LEFT are sufficient. Anything less is insufficient, while anything more may or may not be useful.
Keep in mind the set of functions and terminals must be sufficient, but extraneous functions and terminals will often degrade performance.
After the sufficient pool of primitive functions has been determined, the initial population (generation 0) is generated with expression trees containing random combinations of these functions and terminals.
A uniform random probability distribution is responsible for arranging the expression trees within the initial population, using only the sufficient function set determined ahead of time.
There are three types of expression trees (pp. 92-93):
· Full (every branch of the tree is of equal length)
· Grow (variably shaped trees bound only by maximum branch length)
· Ramped half-and-half (combination of full and grow)
Koza prefers to use the ramped half-and-half method for all of his GP demonstrations.
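Here is one possible sketch of the three initialization methods. The toy function/terminal sets and the 0.3 early-stop probability for “grow” are my own choices for illustration, not Koza’s:

```python
import random

# FUNCTIONS maps each function name to its arity (number of children).
FUNCTIONS = {"+": 2, "*": 2, "-": 2}
TERMINALS = ["x", "y", 0, 1]

def gen_tree(depth, method, rng):
    """'full': branch all the way to max depth; 'grow': may stop early."""
    if depth == 0 or (method == "grow" and rng.random() < 0.3):
        return rng.choice(TERMINALS)
    fn = rng.choice(list(FUNCTIONS))
    return tuple([fn] + [gen_tree(depth - 1, method, rng)
                         for _ in range(FUNCTIONS[fn])])

def ramped_half_and_half(pop_size, max_depth, seed=0):
    """Half full, half grow, with depths ramped from 2 up to max_depth."""
    rng = random.Random(seed)
    pop = []
    for i in range(pop_size):
        depth = 2 + i % (max_depth - 1)          # ramp depths 2..max_depth
        method = "full" if i % 2 == 0 else "grow"
        pop.append(gen_tree(depth, method, rng))
    return pop

for tree in ramped_half_and_half(4, 3):
    print(tree)
```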
Defining fitness is important because fitness directly affects the primary GP operations of reproduction and crossover. There are many ways to accomplish this.
“The most common approach to measuring fitness is to create an explicit fitness measure for each individual in the population… Each individual in a population is assigned a scalar fitness value by means of some well-defined explicit evaluative procedure” (p. 94).
Koza describes four kinds of explicit fitness measures used in GP programs:
1. Raw Fitness (The “measurement of fitness that is stated in the natural terminology of the problem itself” [p. 95].)
2. Standardized Fitness (The lower the number the better; zero is best. If raw fitness already rewards the lowest number, standardized = raw. Otherwise standardized fitness is derived from raw, e.g. by subtracting raw from its maximum possible value.)
3. Adjusted Fitness (Computed from the standardized fitness, adjusted fitness always lies between 0 and 1. The bigger the adjusted fitness, the more fit the individual.)
4. Normalized Fitness (Computed from adjusted fitness, the sum of normalized fitness values equals 1.)
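A quick sketch of how the four measures chain together, assuming higher raw fitness is better with a known maximum, and using the adjusted-fitness form a = 1/(1+s):

```python
# Raw -> standardized -> adjusted -> normalized fitness, as sketched from
# Koza's definitions. The raw scores below are arbitrary example values.

def standardized(raw, raw_max):
    """Lower is better; 0 is best."""
    return raw_max - raw

def adjusted(std):
    """Always between 0 and 1; bigger means more fit."""
    return 1.0 / (1.0 + std)

def normalized(adjusted_values):
    """Adjusted fitnesses rescaled so they sum to 1."""
    total = sum(adjusted_values)
    return [a / total for a in adjusted_values]

raw_scores = [16, 13, 8]          # e.g. "meatiness" counts out of 16
std = [standardized(r, 16) for r in raw_scores]
adj = [adjusted(s) for s in std]
norm = normalized(adj)
print(std)                         # -> [0, 3, 8]
print(round(sum(norm), 6))         # -> 1.0
```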
The genetic processes in GP consist of two primary operations: Reproduction and Crossover.
Reproduction in GP is asexual (one parent, one offspring).
“The operation of reproduction consists of two steps. First, a single [program] is selected from the population according to some selection method based on fitness. Second, the selected individual is copied, without alteration, from the current population into the new population (i.e., the new generation)” (p. 99).
Three methods of reproduction include:
· Fitness-proportionate selection (The most popular method: uses the fitness measure to determine how many copies of which programs are allowed to reproduce)
· Tournament selection
· Rank selection
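Fitness-proportionate (“roulette-wheel”) selection can be sketched in a few lines; the population and fitness numbers here are arbitrary:

```python
import random

def select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    spin = rng.random() * sum(fitnesses)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin <= running:
            return individual
    return population[-1]        # guard against floating-point round-off

rng = random.Random(42)
pop = ["A", "B", "C"]
fits = [8.0, 1.0, 1.0]           # 'A' holds 80% of the wheel
picks = [select(pop, fits, rng) for _ in range(1000)]
print(picks.count("A") / 1000)   # roughly 0.8
```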
Crossover in GP is sexual recombination (two parents, two offspring). It is, essentially, a swapping of subtrees.
Which parents are chosen for crossover is determined by the fitness-based selection method.
Random crossover points are chosen (again determined by using a uniform probability distribution) from each parent. Everything in the subtree below the crossover point is swapped with the subtree underneath the crossover point in the other parent.
Swapping of simple terminals (instead of entire subtrees) is common in GP, and this is the same as point mutation in GA–so mutation is inherent in GP crossover.
Another Crossover difference between GA and GP: In GP the two crossover points in the two parents are usually different; in GA the two points are always identical because crossover is operating on a fixed-length character string. This leads to better capacity for variety in GP:
“In genetic programming, when an individual incestuously mates with itself (or copies of itself), the two resulting offspring will, in general, be different (except in the relatively infrequent case when the crossover points are the same). As before, the Darwinian reproduction operation creates a tendency toward convergence; however, in genetic programming, the crossover operation exerts a counterbalancing pressure away from convergence. Thus, convergence of the population is unlikely in genetic programming” (p. 104).
Also important for GP crossover is the maximum permissible size, which is measured by the depth of the tree. Regulating the maximum depth keeps crossover from creating oversized trees and saves computing time.
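A minimal sketch of subtree crossover on the nested-tuple trees from earlier. The helper names (nodes, get, replace) are my own, and for brevity there is no maximum-depth check here:

```python
import random

def nodes(tree, path=()):
    """List the path to every node in the tree (root included)."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths += nodes(child, path + (i,))
    return paths

def get(tree, path):
    """Fetch the subtree at a path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the subtree at path swapped out."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def crossover(mom, dad, rng):
    """Swap a random subtree of mom with a random subtree of dad."""
    p1 = rng.choice(nodes(mom))
    p2 = rng.choice(nodes(dad))
    return replace(mom, p1, get(dad, p2)), replace(dad, p2, get(mom, p1))

rng = random.Random(1)
mom = ("+", ("*", "x", 2), "y")
dad = ("-", "x", ("+", "y", 3))
print(crossover(mom, dad, rng))   # two new offspring trees
```

Because the two crossover points are chosen independently, the offspring usually differ in shape from both parents, which is the convergence-resisting property quoted above.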
There are also secondary operations in GP, all of which are optional and are probably best reserved for specific situations:
· Mutation (Asexual changing of an individual terminal, but occurs occasionally through sexual crossover in GP anyway. Useful in moderation in GAs, but really not necessary in most GP applications.)
· Permutation (Asexual inversion–swapping terminals of a function.)
· Editing (Follows editing rules such as shortening “(NOT ( NOT X))” to “X.” This can be used for cosmetic purposes and for simplification or improvement of GP output.)
· Encapsulation (Protection from crossover for ideal offspring subtrees, encapsulation offers “a means for automatically identifying a potentially useful subtree and giving it a name so that it can be referenced and used later” [p. 110].)
· Decimation (Deleting a portion of population, controlled by two parameters [p. 112]: “a percentage and a condition specifying when the operation is to be invoked.”)
It is only necessary for the current generation to be maintained in memory for GP operations. Although the individuals in a population represent only a small set of possibilities in the realm of all possible programs, the genetic processes ensure the search will keep moving in the right direction–toward a solution.
This is not nature and the computer will not run the program forever, so a termination criterion must also be set. This may be a set number of generations or a 100% exact solution to a problem, which, when satisfied, ends the program.
DESIGNATING A RESULT:
Results come in different forms. There may be a 100% exact solution, but maybe there is no single best solution. If the program is terminated at a set number of generations, then the best-so-far individual is usually chosen. The fitness parameters are used to determine what is the best-so-far individual, what is the 100% exact solution to the problem, or what percentage of correctness is close enough to suffice.
The binary-bitmap conversion portion of the BBC has already been discussed and is in development. But there are many other aspects of this part of the program that need to be considered, including a physics engine and the motion of the letters. One thing that’s been bugging me is how the letters will react when they run into one another.
Let’s assume that two letters, an ‘A’ and a ‘B’ are floating around in their predetermined environment. They run into each other–what happens?
First of all, we need to clearly define what constitutes a character. There are 256 pixels in the 16×16 square, yes, but most of the border consists of ‘0’s–white space. Is it going to be considered a collision when the outer white pixels of the bitmaps collide, or are we going to go with a more literal description of the letters running into each other, i.e. the black pixels (‘1’s)?
I think the second idea makes more sense, even though in computer logic the bitmap touches the other bitmap when the 16×16 squares meet, which causes a kind of predicament for us. The crazy thing about going with option 2 is that it will cause a morphing in the bitmap of at least one of the letters, even before the actual collision takes place. In other words, the ‘A’ bitmap will already change before the genetic code is activated.
(Figure: the ‘A’ bitmap line 0001000000010000 and what it becomes, with changed pixels shown in purple.)
One way around this is to make the border ‘0’s (white pixels) invisible, or transparent to the other letters. Also, we could somehow use them so that when the outer pixels of two letters collide, they activate some sort of “signal” (probably a specific function in the code) that a collision is imminent and that the decision-making procedure needs to occur. Maybe this could even change the white pixels to a different color as an extra visual aid to show that a collision is about to take place somewhere within the soup–a way to highlight the action. Just a thought.
(Figure: the moment just before impact: Collision Imminent!)
The fight-or-flight function, another good chance for GP to help with decision-making, will then determine: Do I, as an ‘A’…
1. continue toward the ‘B’ (attack),
2. change direction to avoid ‘B’ (retreat),
3. breed with ‘B’ (perform GA/GP)?
The thing is, both letters will be running this function to determine the best course of action. It would be interesting in this scenario to see what would happen if ‘A’ tried attacking while ‘B’ decided to activate the genetic algorithm. What would happen? Who would win? Maybe the program would crash and we’d need to tweak it, but whatever–this is just a brainstorming session.
All this can be figured out. I think the main problem with the white borders in a bitmap will be aesthetics: the collision wouldn’t really look like a collision if just the outer empty sections of the letter boxes are running into each other. We can go with the highlighting/color-changing idea, or maybe we can implement some sort of overlap in the outer white pixels on the bitmaps to show more precise-looking collisions. The speed of the letter movement could also be enough to distract the user’s eye from the extra white space. Then again, as small as a 16×16 bitmap is, the difference is probably subtle enough that it won’t even be noticeable.
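The two collision definitions can be compared directly in code. This is only a sketch with toy 4×4 bitmaps at made-up positions, not our real 16×16 letters; the point is that the bounding boxes can overlap before any black pixels do, which is exactly when the “collision imminent” signal would fire:

```python
def black_pixels(bitmap, origin):
    """Absolute (row, col) positions of every '1' (black) pixel."""
    orow, ocol = origin
    return {(orow + r, ocol + c)
            for r, row in enumerate(bitmap)
            for c, bit in enumerate(row) if bit == "1"}

def boxes_touch(a_origin, b_origin, size=4):
    """The naive test: do the two bounding squares overlap at all?"""
    (ar, ac), (br, bc) = a_origin, b_origin
    return abs(ar - br) < size and abs(ac - bc) < size

def letters_collide(a, a_origin, b, b_origin):
    """The stricter test: do any black pixels actually share a grid cell?"""
    return bool(black_pixels(a, a_origin) & black_pixels(b, b_origin))

A = ["0110", "1000", "1110", "1000"]   # toy 4x4 letter
B = ["1110", "1001", "1110", "1111"]   # toy 4x4 letter
print(boxes_touch((0, 0), (0, 3)))             # -> True: boxes already overlap
print(letters_collide(A, (0, 0), B, (0, 3)))   # -> False: no black pixels touch yet
```

The gap between the two tests is where the decision-making procedure would run.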
Over six months have passed since The Project began, and the ideas were seeded long before that. Pariahpism published his version of The Project’s origins right away. While his was an accurate description, I feel it is time to post my view on where it all started, what the past six months have involved, and where we might be going with The Project.
It began with an article published in 2007 by Alan Bellows on DamnInteresting.com about hardware evolution, “On the Origin of Circuits”. A comment in this article led me to Kevin Kelly’s book, Out of Control, which is fully readable online. Now normally in my internet browsing I am not going to follow a link to a book and end up reading it in its entirety, but that is exactly what happened with Out of Control. The book starts out in Biosphere 2 in Arizona, and I thought, “Hey, I’ve been there!” And it only got more interesting as it went. Who would have thought ecology, philosophy, technology, economics and evolution had so much in common?
Turns out Kevin Kelly is no scientist or engineer, but he is a hub of information, finding new meaning and understanding in many subjects, especially technology, with direct access to the top thinkers of our time. His blog The Technium continues to amaze and inspire me, while helping me think outside the box.
Out of Control covered a broad range of subjects, but some information seemed to stick out from the rest. I distinctly recall being impressed by the following, from Chapter 2:
In the film Batman Returns a horde of large black bats swarmed through flooded tunnels into downtown Gotham. The bats were computer generated. A single bat was created and given leeway to automatically flap its wings. The one bat was copied by the dozens until the animators had a mob. Then each bat was instructed to move about on its own on the screen following only a few simple rules encoded into an algorithm: don’t bump into another bat, keep up with your neighbors, and don’t stray too far away. When the algorithmic bats were run, they flocked like real bats.
It all came down to simple mathematical algorithms. Throw enough math into the software and, TADA! Biology plays out on your computer screen! Then, Chapter 15 dealt specifically with Artificial Evolution, while Chapter 17 “An Open Universe” expounded on that. Tom Ray and his game Tierra were highlighted, as was the GP master John R. Koza. These would soon become household names to The Project.
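Those three rules are essentially Craig Reynolds’ classic “boids” flocking algorithm. As a rough illustration of just how little math it takes, here is a minimal sketch in Python (my own illustration, not anyone’s production code; the constants are arbitrary):

```python
import random

class Bat:
    """One algorithmic bat: a position and a velocity."""
    def __init__(self):
        self.x, self.y = random.uniform(0, 100), random.uniform(0, 100)
        self.vx, self.vy = random.uniform(-1, 1), random.uniform(-1, 1)

def step(bats, sep=2.0, align=0.05, cohere=0.01):
    """One tick of the three flocking rules."""
    for b in bats:
        others = [o for o in bats if o is not b]
        n = len(others)
        ax = ay = 0.0
        # Rule 1 (separation): don't bump into another bat.
        for o in others:
            dx, dy = b.x - o.x, b.y - o.y
            d2 = dx * dx + dy * dy
            if 0 < d2 < sep * sep:
                ax += dx / d2
                ay += dy / d2
        # Rule 2 (alignment): keep up with your neighbors.
        ax += (sum(o.vx for o in others) / n - b.vx) * align
        ay += (sum(o.vy for o in others) / n - b.vy) * align
        # Rule 3 (cohesion): don't stray too far from the flock's center.
        ax += (sum(o.x for o in others) / n - b.x) * cohere
        ay += (sum(o.y for o in others) / n - b.y) * cohere
        b.vx += ax
        b.vy += ay
    for b in bats:
        b.x += b.vx
        b.y += b.vy
```

Copy a few dozen of these, run `step` in a loop, and flock-like behavior emerges from nothing but those three rules.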
Fast-forward to July 2009. Pariahpism had taken to game journaling, and while I am not much of a gamer myself, I was surprisingly enthralled by his stories. My experience with software engineering, combined with the memories of “On the Origin of Circuits” and Out of Control, prompted a sudden urge to develop something of the same substance, but on a different scale. Imagine gaming journals about a self-evolving, unpredictable game! Immediately ideas started to take hold. We could provide the goals and the necessary environments, but let the games do the work!
After brief but furious discussions on the matter, Pariahpism proceeded to write about a number of game ideas brewing in our heads:
And then the serious research began to see how to go about bringing such ideas to fruition. Little did we know how much was already out there. Computational Intelligence is a huge realm under AI which includes, among others, Genetic Programming, Genetic Algorithms, A-life, Neural Networks, and so on. Books, websites and whole development environments are devoted to these subjects, prompting us to back up a step and begin collecting information at a rapid pace.
And here we are—still collecting. I have decided to completely overhaul the old website and devote it to The Project, and video gaming in general. Why dailyvillain.com? It started in 2004 as an inside joke, turned nickname, turned URL. It even stood as an acronym on the website for a while—something like Vagabond Interpreting Life’s Lessons and Idiosyncratic Nuances—but that was way too long. Anyway, the web space was already there and now it exists and persists as The Project’s home.
So where is The Project headed? Our original goals still stand. Early versions of Alphabet Soup and various programming languages are the focus of most of our attention for the moment, but concerning the future, anything is fair game at this point. The nice thing is that the interwebs allow for unprecedented sharing of information, letting us build on what others have already created. The beauty of Open Source is that it is more than just convenient: it accelerates the sharing of information and, of course, innovation. That’s what The Project is all about.
For a tad more organization… This blog will help track our thoughts and progress, while the other blog has become the home for the aforementioned gaming journals, other game reviews and pretty much anything game-related.
(Originally published December 29, 2009)
The Project still requires an immense amount of reading and research before it can produce much fruit, but there is nothing to keep us from starting with the basics (except, of course, the risk of going extinct like so many who have ventured before us).
We obviously cannot do this completely from scratch, so to find out where to begin, we look to simple GP and GA examples. Pariahpism discovered MSDN’s example program using ants to find the most efficient way to follow a trail of food in a 32×32 square matrix. The program uses C# in the .NET framework, along with Microsoft’s System.CodeDom Namespace.
Full source code for the example program is available here.
Now, to apply the principles of the Ant/Food program to Alphabet Soup, we begin by breaking the program into comparable classes. I consider the User Interface a separate aspect of the program to be built later; beyond that, there are two necessary sets of high-level classes.
The first set is the problem set, which includes the customized goals and limitations. Basically, these are the details specific to our program (letter class, number class, case class, goals, movements, starting population, and so on). While the article’s program describes an ant following a path of food, Alphabet Soup will involve alphabet characters reacting to collisions with one another. A secondary but important aspect of this set of classes is a private method (or methods) included to record what happens within the program: the generational “history,” which is saved to an array so that its instances can be compared for fitness level. This allows the program to use natural selection for determining “who” survives to the next generation and, in the case of Alphabet Soup, what the future holds for those letters which collide, based in part on the fitness of past results. For the early versions of our program, the problem set and UI will take up the bulk of our customized coding.
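To make the history-recording idea concrete, here is a minimal sketch in Python (the class and method names like `LetterProblem` and `fittest` are my own placeholders, not anything from the MSDN article):

```python
class LetterProblem:
    """Problem-set details specific to Alphabet Soup (illustrative sketch)."""

    def __init__(self, population):
        self.population = population   # starting letters
        self.history = []              # one recorded snapshot per generation

    def _record(self, generation, results):
        # Private method: save what happened this generation so that
        # generations can later be compared for fitness.
        self.history.append({"generation": generation, "results": results})

    def fittest(self, k):
        # Natural selection: pick the k best results from the latest
        # generation; these survive to breed the next one.
        last = self.history[-1]["results"]
        return sorted(last, key=lambda r: r["fitness"], reverse=True)[:k]
```

The real problem set would also hold the letter/number/case classes, goals, and movement rules; this only shows the record-and-compare skeleton.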
The GP framework, on the other hand, already exists more or less. It still has to be customized as far as variable names are concerned, but the basic set of code instructions will be very similar to what the article defines–especially if we create our first program in C#. In other words, the GP framework is standardized code that includes the necessary algorithms for mutation and inheritance.
At the core of the GP framework, the Ant/Food program uses the Generator class, which the article refers to as the “kernel” of the GP implementation. It does, after all, perform: “selection, breeding, code generation, execution, and result sorting.”
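The shape of such a kernel can be sketched in a few lines. This is only the generic selection/breeding/sorting loop in Python, not the actual Generator class (the parameter names are mine):

```python
import random

def evolve(population, fitness, breed, generations=50, elite_frac=0.2):
    """Skeleton of a GP 'kernel': selection, breeding, and result sorting."""
    for _ in range(generations):
        # Execution + result sorting: score every individual, best first.
        scored = sorted(population, key=fitness, reverse=True)
        # Selection: keep the top fraction as parents.
        parents = scored[:max(2, int(len(scored) * elite_frac))]
        # Breeding: refill the population from random parent pairs.
        population = [breed(random.choice(parents), random.choice(parents))
                      for _ in range(len(scored))]
    return max(population, key=fitness)
```

Plug in an Alphabet-Soup-specific `fitness` and `breed` (crossover plus mutation) and this loop does the rest.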
Something I’ve been seeing over and over again in GP is Koza’s “tree” full of terminals and functions. For the ant program, functions include “FacingFood” and terminals include “Left” and “Right.” For Alphabet Soup, let’s use the example function “CollidingLetter” with terminals “Avoid” and “Attack.” The algorithm creates random expressions by combining available functions and terminals from the tree–for instance, “if CollidingLetter, then Attack else Avoid.” These random combinations are weighed against a fitness measure (defined elsewhere in the program), which helps determine the outcome of future generations. Following traditional GP, a certain percentage of selected alphabet characters are copied for crossover and reproduction while the rest are discarded–depending on how exactly we choose to do it. Again, this code should be easily modified from MSDN’s version to what we need for Alphabet Soup.
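Growing and printing such random expressions is straightforward to sketch (Python for illustration; the function and terminal names are the hypothetical Alphabet Soup ones from above, and the tuple representation is my own shorthand):

```python
import random

# Hypothetical Alphabet Soup node sets; names are illustrative.
FUNCTIONS = ["CollidingLetter"]     # tests that branch the tree
TERMINALS = ["Avoid", "Attack"]     # actions at the leaves

def random_expression(depth=2):
    """Grow a random Koza-style tree: functions branch, terminals end."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    cond = random.choice(FUNCTIONS)
    return (cond, random_expression(depth - 1), random_expression(depth - 1))

def render(expr):
    """Print a tree in the article-style rule form."""
    if isinstance(expr, str):
        return expr
    cond, then_, else_ = expr
    return f"if {cond}, then {render(then_)} else {render(else_)}"
```

For example, `render(("CollidingLetter", "Attack", "Avoid"))` gives back exactly the rule quoted above.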
The article also addresses the role of mutation in the ant program. Mutation will be crucial to how Alphabet Soup plays out:
“[Mutation] involves making a random change in a small number of offspring in order to maintain diversity in the population. Mutation is essential because otherwise the population becomes completely dominated by a few good bloodlines, which makes it difficult to continue progress beyond a certain point. [The Ant/Food program] performs a simple mutation step during the crossover operation to introduce randomness. When a terminal node is copied, a random terminal is substituted two percent of the time.”
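That two-percent terminal substitution during copying is easy to sketch (Python again, reusing a tuple-tree shorthand of my own; this is not the MSDN code):

```python
import random

TERMINALS = ["Avoid", "Attack"]   # illustrative terminal set
MUTATION_RATE = 0.02              # the article's "two percent"

def copy_with_mutation(expr):
    """Copy a tree; each copied terminal mutates MUTATION_RATE of the time."""
    if isinstance(expr, str):                  # terminal node
        if random.random() < MUTATION_RATE:
            return random.choice(TERMINALS)    # substitute a random terminal
        return expr
    cond, then_, else_ = expr                  # function node: recurse
    return (cond, copy_with_mutation(then_), copy_with_mutation(else_))
```

Run during crossover, this keeps a trickle of randomness flowing so a few good bloodlines can never completely take over.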
So these are the basics of the basics to get us started. Another note to keep in mind is the usefulness of XML. The Ant program copies the ant path to an XML file which can later be reloaded. This may not have much use for the application itself, but it can be interesting for reviewing past generations and possible patterns/flaws/suggestions in the code.
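Something along these lines would dump a run’s generational history for later review (an illustrative Python sketch using the standard library’s ElementTree; the element names are mine):

```python
import xml.etree.ElementTree as ET

def history_to_xml(history):
    """Serialize per-generation fitness results to an XML string.

    `history` is a list of generations, each a list of fitness values.
    """
    root = ET.Element("history")
    for gen, fitnesses in enumerate(history):
        g = ET.SubElement(root, "generation", number=str(gen))
        for f in fitnesses:
            ET.SubElement(g, "result", fitness=str(f))
    return ET.tostring(root, encoding="unicode")
```

The resulting file could then be reloaded to hunt for patterns, flaws, or suggestions across generations.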
Also, check out another simple example in C.
(Originally published August 2009)
GP: I was wondering if there is a simple way to let similar programs that are after the same goal talk to each other.
“Food is over there.”
“Try a bone as a club.”
“Stay away from purple mushrooms; they’re poison.”
Bacteria have simple exchanges of information. Bees do too. People as well. Computers innately do. Just a cluttered thought.
(Originally published August 13, 2009)
The first days of The Field of Blobs tend to drag. This has been noted. Evolution takes some time–millions of years, usually. Even when the life cycles are clock cycles of a multi-gigahertz computer, it’ll be a bit before anything challenging crosses your path.
In the meantime, you can make your own blobs. The player draws up some simple plans, blueprints, prototypes or whatever. Those plans are put through maybe a GEP algorithm to find the closest match among the possible genomes preset in the blobs. I can’t describe how incredibly vast we’ll have to make the basic genome of the blobs to make them as diverse as possible, so that they consistently evolve from chaos into something random yet still designed. When birthed, the player blobs can be pitted against each other or even released into the wild. The domestic blobs have an advantage over the feral blobs at first due to the artificially enhanced evolution rate and designed purpose. But soon their genes will spread into the feral gene pool and become another source of diversity, quickening the evolution and sure death of all players in the field.
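One simple stand-in for that “closest match” step would be a plain distance measure over bit-string genomes (an illustrative Python sketch of my own; a real GEP search would be far richer):

```python
def closest_genome(blueprint, genome_library):
    """Pick the preset genome closest to the player's blueprint.

    Uses Hamming distance over equal-length bit strings as a crude
    stand-in for a genuine GEP fitness search.
    """
    def distance(a, b):
        # Count positions where the bits differ.
        return sum(x != y for x, y in zip(a, b))
    return min(genome_library, key=lambda g: distance(blueprint, g))
```

The player’s drawing would first be encoded into the same bit-string form as the preset genomes; the match then determines which genome the newborn blob actually gets.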
Those that are not set free are battled against each other. Mostly it’ll be a rip-off of Pokémon. But I guess any game that allows breeding also has stupid tournament fighting. Sigh. God I hate conventional things. I’ll do what I can to keep them out of these games, but often they’re still expected.
I’d like to see the GEP algorithms make the choice that any blob not easily survivable with its player-given tools and abilities be survivable in different ways. If they’re made fuzzy and cute, they’ll have to breed like crazy and have super short gestation time and turnaround. If made with certain advantages, they’ll have to carry other weaknesses. How do porcupines breed? I’ve no idea. But it’s something the player will have to think about when making the sea-urchin blob. Those are the challenges you’ll be faced with.
Maybe there will be a limit to the cycles allotted to evolution in the artificial evolution pit. There will be a severe difference between the growth mutations of asexual and sexual blobs. It’ll be another choice to make when designing your blob. Levels of care should also be considered. You can breed traits/temperaments into your blob: aggressiveness, skittishness, isolationism, pack hunting, loyalty to their own, loyalty to you, etc.
The Trait Problem: I have no idea how to have base traits like these already present but not active in the basic blob. This game is supposed to be the simplest concept of ours, and it’s quickly becoming more and more unimaginable to program. Oddly, it’s following the evolving pattern of life: it started simple but is becoming exponentially more complicated. I read an article about retaining learned behavior in video game AI while promoting the further need to learn. This could possibly solve the trait problem, but it’d have to be a sublevel of memory–genetic memory, or even an analogue for instinct.
I’m still not very adept at programming but there will have to be two levels of learned behavior. One will be the Instinct Layer and the other will be the Cognizant Layer. The IL will have rudimentary skills all life eventually develops. Why deer run from everything. Why rabbits eat what they eat. Why beavers mate for life. This will all be captured, stored, and expressed at this layer. The entire genealogical history for this strain of blobs will be here. The trick will be making access to the storage somewhat muddled. I don’t remember when my monkey ancestor got mauled by a jungle cat. I just know to be scared of them.
The Physics Problem: How can you inspire real-world genetic solutions without a near-perfect physics engine? We got our legs from needing to move and reach things only a biped could. It can’t be a simple value hiding behind a generic choice. The choice between a movement-speed of 1 vs. 2 will always be 2. We can’t have a large selection of choices in locomotion. Three bits of choices–bipedal, back-jointed bipedal, six-legged, eight-legged, multi-legged, wheels, slither, ooze, and whatever else nature does–cannot be called random or evolved. Those are just data sets already available. Even with a clever rule-set–pros and cons for each–I can’t accept this over genuine evolution, which is why I’m so enthusiastic about this project. The whole point is to not know what conclusion will be drawn. We must provide the problem of locomotion and let GP solve it uniquely.
In my mind, I picture a blob with a tentacle evolved for grabbing food missing its target and hitting the ground, causing the blob to lurch quickly in a certain way. This leads to purposeful whipping for better movement. Others will learn to roll. Others may even learn a complex way of undulating their blob bodies. Maybe a top-of-the-line physics engine could handle what’s necessary–maybe. As an independent company, we won’t have one of these. Maybe when we start up development, we’ll have a moderately capable one that can at least lend some quality physics and provide a good enough backdrop for artificial evolution.
Note: This is not a problem for Alphabet Soup. Simple physics and their starting shapes should work gloriously in spurring interesting evolution.