(The following is a synopsis of chapters 5 and 6 of John R. Koza’s “Genetic Programming”)
OVERVIEW:
Genetic Programming is derived from Genetic Algorithms, and in many ways involves the same process of evolutionary natural selection. The core difference between Genetic Algorithms and Genetic Programming is the program structure. Genetic Algorithms (GAs) operate on fixed length binary strings representing a biological chromosome, while Genetic Programming (GP) instead operates on dynamic hierarchical programs. While in GA, the size and shape of a binary string are fixed, GP programs vary in size and shape. This makes GP programs more complex and, in many ways, more powerful.
The more complex and dynamic program structure of GP results in a cascade of other differences which separate it from GA. Because of this, some problems may be better suited to one method or the other. Generally the more complex the problem or the larger the search space for a solution, the more likely the problem should be solved by GP.
GP solves problems by breeding computer programs, then using reproduction and crossover in iterations through generations of programs, until a solution to the problem is found or some other set parameter is satisfied (such as number of generations run).

Koza’s GP, Figure 5.1
Koza references Holland’s (1975) list of key features of all adaptive systems. For GP to operate successfully, the following must be defined:
· Initial Structures
· Fitness Measures
· Genetic Processes
· Program Memory
· Termination Criterion
· Designating a result
INITIAL STRUCTURES:
GP programs are best represented by expression trees, which are hierarchies containing functions and terminals.

The above diagram represents the program string (in pseudocode):
if{x<4}then{t}else{r}}
Any of the functions if, <, then, else or terminals x, 4, t, r will be subject to change as GP progresses through generations.
It is important to select a sufficient set of primitive functions for a problem. For instance, with an ant trying to maneuver through a maze for food, MOVE, RIGHT, and LEFT are sufficient. Anything less is insufficient, while anything more may or may not be useful.
Keep in mind the set of functions and terminals must be sufficient, but extraneous functions and terminals will often degrade performance.
After the sufficient pool of primitive functions has been determined, the initial population (generation 0) is generated with expression trees containing random combinations of these functions and terminals.
A uniform random probability distribution is responsible for arranging the the expression trees within the initial population, using only the sufficient function set determined ahead of time.
There are three types of expression trees (pp. 92-93):
· Full (every branch of the tree is equal length)
· Grow (variably shaped trees bound only by maximum branch length)
· Ramped half-and-half (combination of full and grow)
Koza prefers to use the ramped half-and-half method for all of his GP demonstrations.
FITNESS MEASURES:
Defining fitness is important because fitness directly affects the primary GP operations of reproduction and crossover. There are many ways to accomplish this.
“The most common approach to measuring fitness is to create an explicit fitness measure for each individual in the population… Each individual in a population is assigned a scalar fitness value by means of some well-defined explicit evaluative procedure” (p. 94).
Koza describes four kinds of explicit fitness measures used in GP programs:
1. Raw Fitness (The “measurement of fitness that is stated in the natural terminology of the problem itself” [p. 95].)
2. Standardized Fitness (The lower the number the better. If the raw fitness is also striving for the lowest number, standardized = raw. Otherwise standardized is the inverse of raw.)
3. Adjusted Fitness (Computed from the standard fitness, adjusted fitness is always between 0 and 1. The bigger the adjusted fitness, the more fit the individual.)
4. Normalized Fitness (Computed from adjusted fitness, the sum of normalized fitness values equals 1.)
GENETIC PROCESSES:
The genetic processes in GP consist of two primary operations: Reproduction and Crossover.
Reproduction:
Reproduction in GP is asexual (one parent, one offspring).
“The operation of reproduction consists of two steps. First, a single [program] is selected from the population according to some selection method based on fitness. Second, the selected individual is copied, without alteration, from the current population into the new population (i.e., the new generation)” (p. 99).
Three methods of reproduction include:
· Fitness-proportionate selection (The most popular method–uses the fitness measure to determine how many of which programs are allowed to reproduce)
· Tournament selection
· Rank selection
Crossover:
Crossover in GP is sexual recombination (two parents, two offspring). It is, essentially, a swapping of subtrees.
Which parents are chosen for crossover is determined by the fitness-based selection method.
Random crossover points are chosen (again determined by using a uniform probability distribution) from each parent. Everything in the subtree below the crossover point is swapped with the subtree underneath the crossover point in the other parent.
Swapping of simple terminals (instead of entire subtrees) is common in GP, and this is the same as point mutation in GA–so mutation is inherent in GP crossover.
Another Crossover difference between GA and GP: In GP the two crossover points in the two parents are usually different; in GA the two points are always identical because crossover is operating on a fixed-length character string. This leads to better capacity for variety in GP:
“In genetic programming, when an individual incestuously mates with itself (or copies of itself), the two resulting offspring will, in general, be different (except in the relatively infrequent case when the crossover points are the same). As before, the Darwinian reproduction operation creates a tendency toward convergence; however, in genetic programming, the crossover operation exerts a counterbalancing pressure away from convergence. Thus, convergence of the population is unlikely in genetic programming” (p. 104).
Also important for GP crossover is the maximum permissible size, which is measured by the depth of the tree. Regulating the maximum depth keeps crossover from creating oversized trees and saves computing time.
There are also secondary operations in GP, all of which are optional and are probably best reserved for specific situations:
· Mutation (Asexual changing of an individual terminal, but occurs occasionally through sexual crossover in GP anyway. Useful in moderation in GAs, but really not necessary in most GP applications.)
· Permutation (Asexual inversion–swapping terminals of a function.)
· Editing (Follows editing rules such as shortening “(NOT ( NOT X))” to “X.” This can be used for cosmetic purposes and for simplification or improvement of GP output.)
· Encapsulation (Protection from crossover for ideal offspring subtrees, encapsulation offers “a means for automatically identifying a potentially useful subtree and giving it a name so that it can be referenced and used later” [p. 110].)
· Decimation (Deleting a portion of population, controlled by two parameters [p. 112]: “a percentage and a condition specifying when the operation is to be invoked.”)
MEMORY:
It is only necessary for the current generation to be maintained in memory for GP operations. Although the individuals in a population represent only a small set of possibilities in the realm of all possible programs, the genetic processes ensure the search will keep moving in the right direction–toward a solution.
TERMINATION CRITERION:
This is not nature and the computer will not run the program forever, so a termination criterion must also be set. This may be a set number of generations or a 100% exact solution to a problem, which, when satisfied, ends the program.
DESIGNATING A RESULT:
Results come in different forms. There may be a 100% exact solution, but maybe there is no single best solution. If the program is terminated at a set number of generations, then the best-so-far individual is usually chosen. The fitness parameters are used to determine what is the best-so-far individual, what is the 100% exact solution to the problem, or what percentage of correctness is close enough to suffice.