Genetic Programming in C#

C# may not be the best language to use for Genetic Programming, but as these examples show, it can work. Our interest in C# Genetic Programming happened as a result of C# becoming our language of choice for the Project, almost by default. In our pursuit of C# GP sample code, we found a number of resources worth sharing:

This MSDN article by Brian Connolly is a great place to start. The code may be getting a little outdated, but I compiled and ran the “ant food-finding program” just fine. The article is detailed in its explanation of the code and how the GP functions. One cool thing to me is how the program also prints out the code of the evolved genetic programs, and labels each ant by its corresponding generation and individual number.

MSDN Ant Example GP
MSDN's Ant GP

There are several Genetic Programming libraries that are written in C# and fully available online:

GPE (Genetic Programming Engine) is a library inspired by the preceding MSDN article. It has plenty of documentation available online. I have downloaded the library but have not had any free time to play around with it yet.

Aforge.NET has become our primary GP library to play around with so far. Documentation and samples are good and up-to-date, although the bread and butter of Aforge deals more with Imaging, so the GP (and GA, and GEP) portion is not as expansive as some of the other libraries we’ve found.

Aforge GP
Time Series Progression with Aforge

SLNGP (Silverlight and .Net Genetic Programming) has a great looking library, but next to no documentation to support it.

Honorable mention goes to this blog about Lambda GP, and another potential for good GP source code in C#, “Gene Hackman.”

In future blogs we’ll extend the search for quality Genetic Programming sources in other languages, such as LISP and Python, and we’ll share our experiences along the way.

Stumbleupon Twitter Google Buzz Digg it del.icio.us Facebook Reddit Yahoo Buzz
Tagged with:
 

The Project. One Year In.

One year ago today The Daily Villain Project commenced.

Villain Logo

It has been a busy six months since our last update. Our main objective right now is to complete a release-worthy version of our first game, Alphabet Soup. We are in phase II out of III toward completion of this game, and making progress every week. We stopped adding updates to our website concerning the game back in May, as we are doing a lot of behind-the-scenes work right now. As soon as AS is done, and maybe even before that, we plan to delve into any of several other game ideas we have been tossing around, and we are excited about that.

Besides game development, we have a few other things going on. First of all is dailyvillain.com, which we have not updated much since burrowing into Alphabet Soup dev. However, we have seen hits sky-rocket over the past six months, and we have been brainstorming on how to make it a better site, which we will be doing quite soon. We aim to have a more complete Computational Intelligence Library and Link Archive, along with an entirely new website design.

Alphabet Soup
Alphabet Soup Screen Shot

Sniderweb, LLC also has a website now and will be more involved in The Project, along with providing website design for small companies—after we get everything together.

Perhaps the most promising and most frequently-updated aspect of The Project right now is the blogging. The Project blog was a main update source for Alphabet Soup, but now that we are keeping AS under wraps, we have shifted the blog's focus more to Artificial Intelligence in software and other aspects of game design. Pariahpism is busy reviewing games in his gaming journal, which is always a good read. We will strive to keep these blogs more updated and current from now on.

Stumbleupon Twitter Google Buzz Digg it del.icio.us Facebook Reddit Yahoo Buzz
Tagged with:
 

Good Problems for Alphabet Soup

Let’s back up a step. First of all, what do we want in the user experience? 1. We want an engaging game experience. Draw in the user quickly, and then keep them enthralled. This means things need to evolve rapidly into something interesting. 2. A rich environment. With all we are planning, this should be no problem. I see the main problem being the balance between functionality and complexity. We don’t want to bog the game down to the point that it is slow and uncooperative. I think we can achieve this delicate balance through abstraction, while making it richer at the same time. 3. Evolutionary Computation. We want this to work. We want realism. That’s where the entertainment comes in. We want letters to be floating around in Letter Land, colliding with each other, spawning populations of new letters, mutating, reproducing, overcoming obstacles, eating, living, and fighting for survival as realistically as possible. Now the details, as I see them. To make all of this work, we need three layers (so far): 1. The Cellular Layer – This is dealing with the letters on a cellular scale. Binary bits “alleles” make up the chromosomes (strings) and ultimately letter genomes (2d arrays). The letter library, the comparator and the schemata all operate at this level, being driven by our Genetic Algorithm. Keep in mind, the body cannot live without its cells. This is where the cells are operating on a micro level. The user can only see this by clicking to view a detailed history of any given genome (probably an XML or text file). This is where we’ve been doing all our work so far, completing Phase I of the project. So far, so good. 2. The Physics Layer – This is dealing with the letters on a population scale. The Physics Layer is special because this is where the abstraction exists. The letters take on a new form as physics bodies governed by physical laws and shapes / geometries instead of genomes full of binary chromosomes. The Physics Layer contains the full map of Letter Land. Herein lies the core of the visible evolutionary processes (Genetic Programming, Neural Networks? ). This is where we are at now (phase II), and this is where the problems crop up. 3. The Window Layer – This is dealing with the letters on a body scale. The window is a “chunk” of the globe of letter land, where the user is zooming in on the action. Essentially, this window or magnifying glass is the user interface to Alphabet Soup. Any user interaction we decide to include (implementing new obstacles or food, for instance) will happen here. Layers DM = Decision Maker; FFGP = Fitness Function Genetic Program; GA = Genetic Algorithm; EGP = Environmental Genetic Program With all that in mind, let’s break down our problems, keeping the “functionality versus complexity versus realism” issue at the forefront. Problem 1: Decision-making. Two letters, a C and an R are about to run into each other. What happens? The sensing organs for both letters realize a collision is imminent, and call on what we’ve proposed as the DGP (Decision-making Genetic Program). Its options are simple: Breed, Flee, or Fight. - Whenever Breed is chosen, the GA at the Cellular Level is called and run. This has a direct impact on the appearance (therefore, geometry and physics) of the letter at the Physics/Window Layer. - Whenever Flee is selected, a given amount of thrust is applied to the letter to get it away from the opposing letter. - When Fight is selected, the letter will use whatever is at its disposal to try to kill (?) the other letter. Think: war over territories. The DGP is only used for this about-to-collide decision-making procedure. I figured on using the EGP (discussed next) to make any other health/fitness related decisions at the Physics Layer. The reason I chose GP for this decision scenario is because I was thinking of the ant-finding-food problem where its only options were RIGHT, LEFT, FOOD, which could actually just as easily be solved with a GA. Here’s the thing, this “Breed, Flee, Fight” scenario is even simpler than that, which begs the question: Does this step need to involve Genetic Programming at all? It is a simple choice between three options, and it is very localized to the exact moment when the sensing organ calls on it. So my argument is to do away with the DGP, instead allowing this decision to be completely (wince) random, or something else entirely—such as Fuzzy Logic or Neural Networking—although this would involve a different approach taking into account all of the decision at one time. The only way I see GP being beneficial here is if it were somehow tied into how the decision affects the population as a whole. But here’s my take on that: we do want complexity, yes, but I feel using a GP here is unnecessary and overly complicated, when we could be using EGP instead to be causing the complex, exciting evolution to take place in the Physics Layer. But I have a better idea yet. One way to keep it from being random but still not requiring too much processing is to give it a set of circumstances, such as: “if R population is greater than 100 and R population is declining, avoid, else attack.” Some simple lines of code like that will cause the about-to-collide letter to make fast, simple, but smart decisions. I think this is the ideal way to do it. To use the if/then approach, the Decision Maker must have access to information about the current stats of the population, yes, but they do not have to be drawn from the EGP. Instead, I propose we have a pool of stats about the entire population of letters through the use of a number of variables, then the various functions (EGP, Decision-maker, and so on) access the data and use it as they see fit. This object oriented approach modularizes the information and makes the most sense to me. Any of these ideas would allow these simple, but constantly-taking-place collision decisions to not use so much processing power unnecessarily, thus solving the “too slow” problem. And realism is maintained, by the simple implementation of the if/then statements, and more thoroughly through the use of the EGP (problem 2). Problem 2: How do we make the Environmental Genetic Program (EGP) function? Well, I think this will play itself out. As long as we keep the pool of variables available to any function, the key to a good EGP will be to assign it a good fitness function determinable by the variables it accesses. This will likely include the physical attributes of the individuals as well as the population as a whole. I think the EGP will be the core to making the game an enjoyable experience, so we need to do it right and think it through. I think we should keep the other facets of EC relatively simple throughout the program, and focus all of the complexity here. And no, we are not there yet. Problem 3: How are we going to tie the layers together? I saved this one for last because to me it is the most complicated problem to figure out. We have a vague idea for the FFGP (Fitness Function GP), which will work as follows: Letter ‘C’ has to get around a flying asteroid, but its friction causes it to slow down to the point where the asteroid is going to collide and destroy it. It needs to evolve to be able to evade, so it grows a wall on its right side, thus becoming an ‘O’. This in itself is a problem: How do its genes, being abstractly separate from the physical body, know that the letter is now an ‘O’ and no longer a ‘C’? We’ve proposed using the FFGP to evolve a fitness function at the GA (Cellular) Layer for the GA to evolve to meet the needs of the Physics Layer. This is all still blurry to me. I think it would be logical that if the C evolves to an O in the Physics realm, this information should be sent to the Cellular Level, matched up to the schemata, and transformed to an O. To me this is still sufficiently complex and realistic, as the EGP at the Physics Level will still be performing the evolution to the different letter, while requiring the GA to “catch up.” The two layers have to work together somehow, otherwise what’s the point of having both layers? Or maybe the Cellular Level can have access to the same data as the Physics Level, and read from the EGP to determine what is going on and how to evolve, but then where is the abstraction—and that all seems too process intensive anyway. This is a point of contention. The other tie-in between layers is the thread between the Physics Layer and the Window Layer (User Interface). This involves the physics processes constantly going on behind the scenes at the Physics Layer, while using the Window Layer to just focus in on wherever the user wants to look in the population. We will likely not even get to this problem in phase II because we will be starting with a small enough population to watch it play out on a single screen. These are all good problems because it shows we are making progress. We should start simple at each layer, then add layers of complexity as we get the pieces to work. Next I want to focus on getting a working Physics Layer with Blend and Physics Helper 3. Once we do that, we can deal with the problem combining it all with the Cellular Layer.
Tagged with:
 

Neural Networks

Well hell, we're now reaching the point in The Project where not everything can be solved with some tangle of Genetic mishmash. I blame myself partly for letting my brain assume this for little problem’s answer. Now when we're actually figuring out the data flow, it's not nearly that simple. But the structure was good. We just need to replace a few things. Our DGP (Decision Genetic Program) had for me the problem of scope. We assumed complex choices could be evolved from the data passed to it from the EGP (Environmental Genetic Program), something that also evolved complex physicalities such as sight and color. The question that arose was, "When do these GP's increment their generation?" The GA does doing the Characters either Cross Over or Mutation. This would leave possibly the Characters blind, deaf (un-evolved sensing organ) and dumb (very poor/random decisions) for thousands of Generations. That would make things accurate...but very boring for something targeted to an audience with a five minute attention span. Problem: Too slow, albeit realistic. Solution: We could brew (literally) some templates to load into basic Characters. Sub-Problem: Things would still be slow to change. Likely the player would never know or see these changes. Sub-Solution: Unknown The alternative would be to let the two GPs increment Generations freely, once every cycle of the Game Loop. The problem here is that a Character could potentially change its entire advanced physical body and behavior within its lifetime. A Dear could suddenly act like a Wolf with the eyesight of an Eagle, if it saw fit (pun intended). This siphons both the realism and some of the static-ness of the game. Problem: Unreal and far too dynamic. Solution: Maybe a limit to the amount of Generation increments? Sub-Problem: Still fake and not addressing the real problem. Sub-Solution: Unknown. I mentioned scope before as the source of the confusion. That’s because as a whole, The Project is a GA within a GP within a GP, each with its own population and fitness. The user only interacts with the outer most layer. Its population are the Characters that house the FFGP, the EGP and the DGP. The FFGP generates the fitness for the GA layer. Problem: We don’t know how to do this. Solution: Figure it out. The DGP takes the information gathered each turn, (Hungry = True/False; Sensing Organ = True/False; Bored = True/False; etc.) to make the choice (Move; Attack; Reproduce; Nothing; etc.). Problem: GPs aren’t usually used this way. The data sets are usually static and the program is left to evolve better and better ways. Solution: Maybe a GP could be made to handle each state and called upon only when certain states are true. Sub-Problem: This feels like micromanagement and could impede growth potential. Sub-Solution: Learn more. The EGP handles the advanced physical attributes (Sensing Organ’s attributes; Exterior Attributes; etc.) This should work as intended with a basic template. Simple a bunch or attributes being juggled for a fitness value. This could even be a GA unless I’m totally forgetting something. All that the DGP should do is making it seem like it could be a Neural Network. But we need to learn more about them before making any lasting decisions. Problem: If the EGP becomes a simple GA and the DGP becomes a NN (Neural Network) then we’ve removed all the GPs from the program that have a significant part to play. This kills a lot of the dreamt unknown potential from the game. Solution: Hmm.
 

Defining Fitness

Keeping Fit What we have so far: Having a good definition of fitness is going to be crucial to our success with Alphabet Soup, and any of our games for that matter. So far we have only one fitness factor, a rather rudimentary "meatiness" goal: the more 1s, the better. Aiming for the maximum fitness of 1, we define an ideally fit chromosome of 16 binary "alleles" as 16/16 =1, or: chromosome = 1111111111111111 = 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 = 16/16 = 1 = 100% fit. Anything less than ideal is given a fitness value such as 13/16 = 0.81, or: chromosome = 1111110111011101 = 1+1+1+1+1+1+0+1+1+1+0+1+1+1+0+1 = 13/16 = 0.81 = 81% fit. This is all very simple. Too simple. But it's a start. The question is, where do we go from here? In progress: The next step will be to include the schemata for the letters. Fitness will be a combined value of the "meatiness" and the "likeness" defined by a comparator function. Here is how it will all work: First we will have the letter genomes in the Binary Letter Library broken down and defined into the final schemata: charLine[0] = 0*****0000000000 ... lines 1 thru 14 ... charLine[15]=0******0000***00 Where there is a "0", the value is always zero--in any letter in our population. Why calculate that if it is never going to change? The remaining values then, represented by asterisks, will be either 1s or 0s, depending on the letter. The results of the GA generations will be run through the schemata, with the remaining asterisked positions matched up against the letters in the library. Matching to the nearest letter will be easy. Say there are 125 asterisked positions. After the first generation of GA takes place, the fittest-so-far (meatiest individual) is passed through the schemata. All zero-only spots (such as charLine[0][0], charLine[0][6] are dropped immediately. All asterisk "either-or" positions are evaluated. If charLine[0][1] is a '1', it is matched to all letters with a value of '1', and so on. The genome will then take the shape of the letter with the most matching values.... An important technical aspect of this that we have agreed on is to have the comparator only change the way the letters are displayed, while leaving their array intact--aside from the schemata. In other words, the original GA-defined chromosome/genome is maintained, while a copy of itself is stripped to be matched against the schemata. This allows it to maintain its evolutionary integrity with a level of abstraction so more fitness functions can be performed. I will deal more with this later. So how do we define the new fitness value, with both the schemata (likeness) and meatiness in mind? Here is a 5-step process that should do the job: 1. Take value of current generation individual. Let's use for genInd[5], say: 0000101011101110 2. Pass through schemata. Let's use generic letter line 5, which might be 00*****0******00, making genInd[5] now equal 0000101011101100 (the only bit that changes is the second-to-last one, because based on the schema that bit always equals 0). 3. Define "likeness value." For simplicity's sake let's use charAline[5] which equals 0000010001000000. For all bits of genInd[5] that match charAline[5], add a 1: So, comparing 0000101011101100 (genInd[5]) to 0000010001000000 (charAline[5]), and disregarding schemata characters that are always 0 (blue), we find only 4 bits that match (red), giving genInd[5] the "likeness" value of 4 as compared to charAline[5]. 4. Define the "meatiness value." To incorporate the "meatiness" factor, let's add an additional +1 to each value of matching letter that is, actually, "1" (meat). When we add up all the matching value of 1s between genInd[5] and charAline[5], we only get +1: 0000101011101100 compared to 0000010001000000. So, for this particular example, the meatiness value = 1. 5. Add "likeness value" and "meatiness value".To get the total fitness value, add the likeness value (4) to the meatiness value (1) for a total value of 5. Note: since this schemata line has 5 characters that are always equal to 0 (shown here: 00*****0******00), they are subtracted from the maximum fitness amount. So, the maximum fitness of this line is 16-5 = 11 possible values that match ("likeness" value), plus the possibility that all of those values are 1s ("meatiness" value), making the total maximum fitness of this line equal to: 11 (max "likeness") + 11 (max "meatiness") = 22. So in our example, 4 (likeness) + 1 (meatiness) = 5 was not all that great (5/22 = 23%). Other letters in the alphabet will much likely have higher fitness on this line. I think this will be a good way to determine total fitness--at least in early versions of Alphabet Soup. Future awesomeness: All technicalities aside, this is where Alphabet Soup can get really interesting. This will probably be around... version 3.0 or so. Right now, these are just vague ideas. Fitness based on letter shape was the original goal of Alphabet Soup all along:
"Hark! Here comes a second generation character, ‘i’, who has lost the vestigial tittle (the dot). Most other letters are unchanged or changed in a negative or neutral way. The lower mass of the ‘i’ makes it faster and take in less calories. Alas, it’s lower weight makes it do less damage and is losing ground to the dastardly ‘T’s and their hearty top most shield. Will the character genes be favorable enough to carry on or will they be lost and another random variation will make itself dominant. At the front of ‘i’s’ and ‘T’s’ war, a fork occurs with the ‘w’s. The ‘w’s strike fast coming down but their double point distributes the impacts, lessoning the effect on harder shelled letters like the ‘T'."
With this, things like food, environment and the ability to reproduce/fight/defend are all important. Another idea is this: fitness is dynamic, determined by the players. They change the landscape--by adding obstacles, giving new goals (such as winning a race, building a tower, et cetera). Different letters will be more suited to different tasks, so fitness will vary from task to task. This might keep it more interesting than if fitness was just based on one thing (like meatiness or shape). Also, on the note of shape mattering--maybe shape won't matter as much as far as fitness value is concerned, however shape may be important for movement in Farseer if we can get the physics of it right. So... if the goal is to get over a mountain or something, the shape and physics will be important to the letter's success, indirectly having an effect on the fitness anyway. A good comparison is to think of how to keep a Sim alive: by balancing a variety of values that are important to the character's success at staying alive, such as social value, food, health, activity, recreation, knowledge, and so on. Any of these kinds of ideas could be incorporated into the fitness value to make it more interesting and keep the player more involved in what is happening in the Soup. One important key to this is to allow the Evolutionary Computation to do its thing without interferring. If we can hone that delicate balance between the EC and the player-letter interaction, Alphabet Soup could be great. The future awesomeness of Alphabet Soup depends on our ability to come up with more fitness ideas, and to make them actually work. Other considerations: We have talked a lot about including sensing organs for letters to be able to interact with their environment on a biological level. Also in discussion are color values which may or may not be related to fitness, sensing ability, or overall health of a letter. We will have to deal more with these ideas and see what we can agree on. One thing we have reached consensus on is the level of abstraction that will be present in our final product. Overall, we are going to let the GA and GP do their things in a very precise way, keeping the Evolutionary Computation aspect as pure as possible--but this will all be going on behind the scenes. The precision in the background will be modified and "rounded" (such as with the schemata) before reaching the eyes of the player. This will allow the letters to maintain their forms, allow the Physics Engine to operate on fixed shapes, and so on. However there will likely be a more abstract level of EC going on as well, such as with the shapes present in the physics engine and how they react to one another. The more levels of EC operating within the game, the more rich the environment and the more capable it will be of surprising us. But we don't want the abstraction to completely cover up what is going on behind the scenes. One way to be able to unite the abstraction with the player's awareness of what is going on is through keeping a detailed history of each genome in the population. We should somehow tie the chromosomes to their respective genome so that the genome holds its identity and evolutionary lifespan throughout its own history. Maybe stamp each chromosome with some sort of hidden i.d. assigning it to its current genome. These generational histories will provide the owner of the letter with information showing how old the letter is (how many generations it has survived) and each detailed transformation it has undergone during its evolution to its current state. This could be useful for farming, trading or even--who knows--combat? Dog-fighting is illegal, but how about pitting our letters against each other? There are all kinds of scenarios and places we could go with Alphabet Soup. And as the creators of Dwarf Fortress believe, the more ideas we integrate, the better the game might become.
Tagged with:
 

The Formula For Facebook Games

-True friends list: Gaming can now happen exclusively within the context of one’s actual friends. Multiplayer games no longer suffer from the catch-22 of requiring friends to be fun while new players always start the game without friends. -Free-to-play business model: New players need not shell out $60 to join the crowd. Consumers don’t like buying multiplayer games unless they know that their friends are all going to buy the game as well. Free-to-play removes that friction. -Persistent, asynchronous play: Finding time to play with one’s real friends is difficult, especially for working, adult gamers. Asynchronous mechanics, however, let gamers play at their own pace and with their own friends, not strangers who happen to be online at the same time. -Metrics-based iteration: Retail games are developed in a vacuum, with designers working by gut instinct. Further, games get only one launch, a single chance to succeed. Most developers would love, instead, to iterate quickly on genuine, live feedback. Taken from here
 

Development Notes

There have been some exciting developments in Alphabet Soup. We now have a functioning Binary-Bitmap Converter, and a working Genetic Algorithm. I thought I'd make some notes on the first few successful runs of the GA, so here goes: For the simplest Alphabet Soup chromosome, we are using a 16-bit binary string which, while all 1s and 0s, isn't functioning as a true binary string. Instead, for our simplest calculation of fitness, we are assigning the "meatiest" chromosome (the one with the most 1s) as the fittest individual. So instead of going by the decimal value of 0000001000100101 (which is 549) we are adding the 1s for a value of 4. This cuts back on processing power, especially when dealing with a full 256 bits, and it allows us some easy numbers to work with for churning out a fitness value of any given chromosome. So, the optimal value is 16, represented as: 1111111111111111. Success! Success! Our first run of the GA striving to reach this value with a random starting population has been a surprising success. Reproduction and crossover rapidly increase the fitness no matter what the starting population. The GA often converges on not-quite ideal solutions, but with enough generations the mutations do eventually converge on the correct string. Our mutation rate is set to 10%. The other parameters are: min:0, max:2, crossover:80%, length:16. These parameters seem to work quite well. I changed the population paramater to see how that affected the runs, and the results were interesting. The following chart describes our first six runs of the "Alpha" Alphabet Soup Genetic Algorithm:
Population    Generations to converge    Total number of individuals    
25201040
27911582
42731092
43761504
5042 (w/ 2 mutants)2100
5063 (w/ 1 mutant)3150
I found it interesting that convergence seems to happen faster with a smaller population. I was especially surprised at the efficiency with which only two parents managed to converge on the correct solutions. The least number of individuals required to converge was 1,040, which happened with only two in the population, while the longest run required 3,150 individuals--using a population of 50. That's still not bad, considering the search space contains 65,536 possibilities.
Tagged with:
 

Character Collisions

The binary-bitmap conversion portion of the BBC has already been discussed and is in development. But there are many other aspects of this part of the program that need to be considered, including a physics engine and the motion of the letters. One thing that's been bugging me is how the letters will react when they run into one another. Let's assume that two letters, an 'A' and a 'B' are floating around in their predetermined environment. They run into each other--what happens? First of all, we need to clearly define what consitutes a character. There are 256 pixels in the 16x16 square, yes, but most of the border consists of '0's--white space. Is it going to be considered a collision when the outer white pixels of the bitmap collide, or are we going to go with a more literal description of the letters running into each other, i.e. the black pixels ('1's)? I think the second idea makes more sense, even though in computer logic the bitmap touches the other bitmap when the 16x16 squares meet, which causes a kind of predicament for us. The crazy thing about going with option 2 is that it will cause a morphing in the bitmap of at least one of the letters, even before the actual collision takes place. In other words, the 'A' bitmap will already change before the genetic code is activated.                     Purple shows changed pixels:                     0000000000000000                     0000000000000000                     0001111111000000                     0001111111100000                     0001000000110000                     0001000000010000                     0001000000010000                     0001000000110000                     0001111111100000                     0001111111110000                     0001000000010000     'A' bitmap becomes: 00000000000001000000010000     0000000000000100 00000001000001000000110000     0000000100000100 00000011100001111111100000     0000001110000111 00000010100001111111000000     0000001010000111 00000110110000000000000000     0000011011000000 0000010001000000                         0000010001000000 0000110001100000                         0000110001100000 0000100000100000                         0000100000100000 0000100000100000                         0000100000100000 0001111111110000                         0001111111110000 0001111111110000                         0001111111110000 0001000000010000                         0001000000010000 0011000000011000                         0011000000011000 0010000000001000                         0010000000001000 0110000000001100                         0110000000001100 0000000000000000                         0000000000000000 One way around this is to make the border '0's (white pixels) invisible, or transparent to the other letters. Also, we could somehow use them so that when the outer pixels of two letters collide, they activate some sort of "signal" (probably a specific function in the code) that a collision is imminent and that the decision-making procedure needs to occur. Maybe this could even change the white pixels to a different color as an extra visual aid to show that a collision is about to take place somewhere within the soup--a way to highlight the action. Just a thought. Before:                                             Collision Imminent! 0000000000000000                         0000000000000100 0000000100000000                         0000000100000100 0000001110000000                         0000001110000111 0000001010000000                         0000001010000111 0000011011000000                         0000011011000000 0000010001000000                         0000010001000000 0000110001100000                         0000110001100000 0000100000100000                         0000100000100000 0000100000100000                         0000100000100000 0001111111110000                         0001111111110000 0001111111110000                         0001111111110000 0001000000010000                         0001000000010000 0011000000011000                         0011000000011000 0010000000001000                         0010000000001000 0110000000001100                         0110000000001100 0000000000000000                         0000000000000000 The fight-or-flight function, another good chance for GP to help with decision-making, will then determine: Do I, as an 'A'... 1. continue toward the 'B' (attack), 2. change direction to avoid 'B' (retreat), 3. breed with 'B' (perform GA/GP)? The thing is, both letters will be running this function to determine the best course of action. It would be interesting in this scenario to see what would happen if 'A' tried attacking while 'B' decided to activate the genetic algorithm. What would happen? Who would win? Maybe the program would crash and we'd need to tweak it, but whatever--this is just a brainstorming session. All this can be figured out. I think the main problem with the white borders in a bitmap will be aesthetics: the collision wouldn't really look like a collision if just the outer empty sections of the letter boxes are running into each other. We can go with the highlighting/color-changing idea, or maybe we can implement some sort of overlap in the outer white pixels on the bitmaps to show more precise-looking collisions. The speed of the letter movement could also be enough to distract the user's eye from the extra white space. Then again, as small as a 16x16 bitmap is, the difference is probably subtle enough that it won't even be noticeable.
Tagged with:
 

Bitmap-Binary Basics

The Bitmap-Binary Converter will likely be the core engine for at least the early versions of Alphabet Soup. To create it, there are a few things we need to consider. The two most glaring questions are how will we read individual pixels, and how will the actual conversion between pixel color and binary code take place? An even more basic consideration is which programming language we will use. I am leaning heavily toward C#, as it seems fully capable of tackling both problems. There are a couple of ways we could do this. Let's start with the assumption we begin with the bitmap and define all the colors for a given letter ("A", for example). This website explains how to access each pixel individually--exactly what we are trying to do. This is the core of the code allowing the color definition for each pixel based on location within the bitmap: public void SetPixel(int x, int y, PixelData color)     {         PixelData* pixel = PixelAt(x, y);         *pixel = color;     } However this is using 'unsafe' code to access each individual pixel. To use 'safe' code the actual program execution would take forever. So there may be a third alternative worth looking into. And here are more thoughts about this. This is something we are going to have to experiment with to find the best method, or we might have to figure out our own. To save the bitmap code for future use, we will have to set the black pixels to 1 and white pixels to 0. This will produce 256 characters, ranging from charA[0] to charA[255]. We might have the program write this data to a separate file, or just store it within its own memory. For visual reference, here is the bitmap in actual size: A Here is the 'A' bitmap represented in binary: 0000000000000000 0000000100000000 0000001110000000 0000001010000000 0000011011000000 0000010001000000 0000110001100000 0000100000100000 0000100000100000 0001111111110000 0001111111110000 0001000000010000 0011000000011000 0010000000001000 0110000000001100 0000000000000000 Here is the above bitmap referenced in pixel notation (shortened to describe the four corners): (0,0) through (0,15) ... (15,0) through (15,15) And here is the character equivalent to the above pixels: charA[0] = (0,0) = value of '0' charA[15] = (0, 15) = value of '0' charA[240] = (15,0) = value of '0' charA[255] = (15,15) = value of '0' Or, we do it the other way: Start with 62 '.txt' files (52 letters and numbers 0-9, or whatever amount we end up with for the base set of characters), each with 256 "0" and "1" characters representing the 16x16 bitmap. Read the text from the file and use the code "if charA[0]='0'..." / "if charA[0]='1'..." and the earlier bitmap code to convert each corresponding pixel to black or white by referencing the corresponding pixel within the bitmap. Whenever crossover occurs in the program, it creates a new binary (or .txt) file to represent that organism. I think a thorough understanding of the C# StringBuilder Class is going to come in handy for this, in splicing strings of binary characters and performing other modifications. Again, this will be closely intertwined with the GP or GA code, which will be the determining factor for where the splicing will occur each time. Basically we're accomplishing the same thing but from two different directions: bitmap-to-binary or binary-to-bitmap. It will probably be valuable to know how to do this in both directions for purposes of the program anyway. Basic C# data type conversions (string-to-binary-to-decimal-to-chars-to-string and so on) will be instrumental in making this work properly. This should actually be pretty simple; it will just take some time to sort out the details. What I described above should accomplish the basic functionality of the BBC; once we get this working we can move onto the engine's more complex functions, such as attributing higher "sensitivity values" to certain pixels and recognizing "contact" with other letters.
*Define Initial Population* The initial population in the final version of Alphabet Soup may likely contain one of each keyboard character available. For now, for simplicity's sake, let's begin with 4 characters: A,B,a,b. In our game, all 4 characters decide to run into each other at once. First we will convert the 16x16 characters into binary, 0 for gray and 1 for yellow: A: 0000000000000000 0000000100000000 0000001110000000 0000001010000000 0000011011000000 0000010001000000 0000110001100000 0000100000100000 0000100000100000 0001111111110000 0001111111110000 0001000000010000 0011000000011000 0010000000001000 0110000000001100 0000000000000000 A, Binary String: 0000000000000000000000010000000000000011100000000000001010000000000001101100000000000100010000000000110001100000000010000010000000001000001000000001111111110000000111111111000000010000000100000011000000011000001000000000100001100000000011000000000000000000 A, Binary-Decimal conversion: 95786086615602871731981918389383308042547593781186608 B: 0000000000000000 0000000000000000 0001111111000000 0001111111100000 0001000000110000 0001000000010000 0001000000010000 0001000000110000 0001111111100000 0001111111110000 0001000000010000 0001000000010000 0001000000110000 0001111111100000 0001111111000000 0000000000000000 B, Binary String: 0000000000000000000000000000000000011111110000000001111111100000000100000011000000010000000100000001000000010000000100000011000000011111111000000001111111110000000100000001000000010000000100000001000000110000000111111110000000011111110000000000000000000000 B, Binary-Decimal conversion: 46403387827017775033653233413204161160929512263696 a: 0000000000000000 0000000000000000 0000000000000000 0000001111100000 0000011111110000 0000110000110000 0000000000010000 0000000111110000 0000011111010000 0000110000010000 0000110000110000 0000111001110000 0000011111011000 0000000000000000 0000000000000000 0000000000000000 a, Binary String: 0000000000000000000000000000000000000000000000000000001111100000000001111111000000001100001100000000000000010000000000011111000000000111110100000000110000010000000011000011000000001110011100000000011111011000000000000000000000000000000000000000000000000000 a, Binary-Decimal conversion: 86418088698874722540000455600782178239016967 b: 0000000000000000 0000100000000000 0000100000000000 0000100000000000 0000101111000000 0000111111100000 0000110000110000 0000100000010000 0000100000010000 0000100000010000 0000100000010000 0000110000110000 0000111011100000 0000101111000000 0000000000000000 0000000000000000 b, Binary String: 0000000000000000000010000000000000001000000000000000100000000000000010111100000000001111111000000000110000110000000010000001000000001000000100000000100000010000000010000001000000001100001100000000111011100000000010111100000000000000000000000000000000000000 b, Binary-Decimal conversion: 766259462624453036391004127445265907255673543052767246 So, our Initial Population (POPinit) consists of the following four characters and their decimal representation: 1. (A) 95786086615602871731981918389383308042547593781186608 2. (B) 46403387827017775033653233413204161160929512263696 3. (a) 86418088698874722540000455600782178239016967 4. (b) 766259462624453036391004127445265907255673543052767246 *Define fitness of POPinit* First, we establish the Total, the Best, the Worst and the Average based on their decimal representations: Total: (A+B+a+b) 862091952714301014596894421608062875060164244585234517 Best: (b) 766259462624453036391004127445265907255673543052767246 Worst: (a) 86418088698874722540000455600782178239016967 Average: 23958122541618481117083899450474345137254323064508656.25 Based on these findings, we can calculate the chance of Reproduction / Survival for each character. A: 0.1111088977388315456010566688830799217993208329796695901253... = 11% chance of reproducing B: 0.0000538264945878643991068686448919016779835612586246604953... = <1% chance of reproducing a: 0.0000000001002423099146058828565205588056220034311729841463397... = MUCH <1% chance of reproducing b: 0.8888372756663382800852305796155076177170736023305327652329... = 88% chance of reproducing *Perform Reproduction Function* As you can tell from the fitness of each character, the character "b" has by far the best chance at survival, while character "A" is the only other letter with much of a chance at all. In the actual game, the reproduction function will decide who actually survives reproduction so it is impossible to say for sure, but based on these percentages we will go with three b's and one A surviving to population generation 0. So, the result of the reproduction function should look something like: 1. (b) 766259462624453036391004127445265907255673543052767246 2. (b) 766259462624453036391004127445265907255673543052767246 3. (b) 766259462624453036391004127445265907255673543052767246 4. (A) 95786086615602871731981918389383308042547593781186608 *Define fitness of gen0" Again, we determine the total, best, worst and average fitness of the new population, gen0: Total, better: (b+b+b+A) 2394564474488961980904994300725181029809568222939488346 Best, same: (b) 766259462624453036391004127445265907255673543052767246 Worst, better: (A) 95786086615602871731981918389383308042547593781186608 Average, better: 598641118622240495226248575181295257452392055734872086.5 Note that the Total, Worst, and Average fitness of the population all improved as an expected result of natural selection. The Best fitness, meanwhile stays the same--and it has no chance for improvement until running the next stage of the GA... *Perform genetic crossover operation on gen0* Start with 2 parents. Let's use 3 & 4 as an example with an arbitrary crossover point (we'll say 170 digits in--though this would actually be determined by the randomly generated genetic operation): 3. (b) 766259462624453036391004127445265907255673543052767246 = 00000000000000000000100000000000000010000000000000001000000000000000101111000000000011111110000000001100001100000000100000010000000010000001000000001000000100000000100000 | (crossover point) 01000000001100001100000000111011100000000010111100000000000000000000000000000000000000 4. (A) 957860866156028717319819183893833080425475937811868 = 00000000000000000000000100000000000000111000000000000010100000000000011011000000000001000100000000001100011000000000100000100000000010000010000000011111111100000001111111 | (crossover point) 11000000010000000100000011000000011000001000000000100001100000000011000000000000000000 From this crossover, we get 2 new offspring characters: offspring1 (o1): 0000000000000000 0000100000000000 0000100000000000 0000100000000000 0000101111000000 0000111111100000 0000110000110000 0000100000010000 0000100000010000 0000100000010000 0000100000110000 0001000000010000 0011000000011000 0010000000001000 0110000000001100 0000000000000000 o1, Binary String: 0000000000000000000010000000000000001000000000000000100000000000000010111100000000001111111000000000110000110000000010000001000000001000000100000000100000010000000010000011000000010000000100000011000000011000001000000000100001100000000011000000000000000000 o1, Binary-Decimal conversion: 766259462624453036391004127445265907255673543589892144 offspring2 (o2): 0000000000000000 0000000100000000 0000001110000000 0000001010000000 0000011011000000 0000010001000000 0000110001100000 0000100000100000 0000100000100000 0001111111110000 0001111111010000 0000110000110000 0000111011100000 0000101111000000 0000000000000000 0000000000000000 o2, Binary String: 0000000000000000000000010000000000000011100000000000001010000000000001101100000000000100010000000000110001100000000010000010000000001000001000000001111111110000000111111101000000001100001100000000111011100000000010111100000000000000000000000000000000000000 o2, Binary-Decimal conversion: 95786086615602871731981918389383308042547593244061710 So gen1 looks like: 1,2 are the same as before; 3,4 are the offspring of crossover: 1. (b) 766259462624453036391004127445265907255673543052767246 2. (b) 766259462624453036391004127445265907255673543052767246 3. (o1) 766259462624453036391004127445265907255673543589892144 4. (o2) 95786086615602871731981918389383308042547593244061710 *Convert gen1 offspring to valid characters* Now, 3 and 4 are new shapes that do not correspond with any of our 4 letters. The BBC (Bitmap-to-Binary Converter) will be used to "read" these bitmaps and find the closest letter in the space of possible letters which we will define. So, the offspring will turn into something like "h" and "8", because those are the closest possibilities within the alphabet. I was going to include a range in decimal numbers and apply that to the nearest alphabet character within that range, but I don't think that will work because more often than not the offspring will just end up reverting to one of the two characters from which it was created. From this perspective, the BBC may be a very useful portion of the AS program. So, o1 become "h" and o2 becomes "8", converting gen1 characters to: o1=h: 0000000000000000 0000110000000000 0000110000000000 0000110000000000 0000110111000000 0000111111100000 0000111000110000 0000110000110000 0000110000110000 0000110000110000 0000110000110000 0000110000110000 0000110000110000 0000110000110000 0000000000000000 0000000000000000 h, Binary String: 0000000000000000000011000000000000001100000000000000110000000000000011011100000000001111111000000000111000110000000011000011000000001100001100000000110000110000000011000011000000001100001100000000110000110000000011000011000000000000000000000000000000000000 h, Binary-Decimal conversion: 1149389193936678235951120191876555791760811195370319884 o2=8: 0000001111000000 0000011111100000 0000110000110000 0000100000010000 0000100000010000 0000110000110000 0000011111000000 0000011111100000 0000110000110000 0000100000010000 0000100000010000 0000100000010000 0000110000110000 0000011111100000 0000001111000000 0000000000000000 8, Binary String: 0000001111000000000001111110000000001100001100000000100000010000000010000001000000001100001100000000011111000000000001111110000000001100001100000000100000010000000010000001000000001000000100000000110000110000000001111110000000000011110000000000000000000000 8, Binary-Decimal conversion: 23539885800661303801528815919744719668529798326595472592908 So the gen1 now looks like: 1. (b) 766259462624453036391004127445265907255673543052767246 2. (b) 766259462624453036391004127445265907255673543052767246 3. (h) 1149389193936678235951120191876555791760811195370319884 4. (8) 23539885800661303801528815919744719668529798326595472592908 From which we evaluate the fitness again ("8" clearly will be the new dominate gene) and so on, generation after generation... *Conclusion* This is the basic gist of Alphabet Soup by GA. The final aspect of using GA is to incorporate mutations: the random changing of various pixels within the letters, which will obviously introduce the same effect of crossover in adding to the population of letters. Mutations will only be applied to a small portion of the entire population in each generation, just like in nature. With only 4 characters to choose from, let's say there are no mutations in this particular generation. That gives us the current pool of characters, including the initial population, gen0 and the current generation (gen1): [POPinit][gen0] [gen1] [A,B,a,b][b,b,b,A][b,b,h,8] So, we have 6 "b"s, 2 "A"s, 1 "a", 1 "h", 1 "B" and 1 "8" floating around. Granted this is an example run, but it plays out pretty realistically: note that the dominate gene (b) is taking up quite a bit of the population, while a and B have not grown at all in population. "h" and "8" are the new kids on the block, and they will have a direct impact on the future generations, again based on their fitness for reproduction and crossover--with a little bit of mutation thrown in here and there. One of the core questions remains: How do we make the shapes of the characters matter? In other words, how do we specify the importance of various pixels? This, I believe, can also be accomplished through the BBC, and I have some ideas for that. But more on that later.