Genetic algorithms (GAs) are search algorithms modeled on the process of natural selection. They begin with a sample set of potential solutions that evolves toward a set of more optimal solutions. Within the set, poor solutions tend to die out while better solutions mate and propagate their advantageous traits, introducing new solutions with greater potential (the total set size remains constant; for each new solution added, an old one is removed). A little random mutation helps guarantee that the set won't stagnate and simply fill up with numerous copies of the same solution.
In general, genetic algorithms tend to work better than traditional optimization algorithms because they are less likely to be led astray by local optima. This is because they do not rely on single-point transition rules to move from one instance in the solution space to another. Instead, GAs maintain an entire set of solutions spread throughout the solution space, exploring many potential optima in parallel.
However, in order for genetic algorithms to work effectively, a few criteria must be met:
- It must be relatively easy to evaluate how "good" a potential solution is relative to other potential solutions.
- It must be possible to break a potential solution into discrete parts that can vary independently. These parts become the "genes" in the genetic algorithm.
- Finally, a "good" answer must suffice, even if it is not the absolute best answer; genetic algorithms are best suited to such situations.
The basic operations of the genetic algorithm are simple and straightforward:
- Reproduction: The act of making a copy of a potential solution.
- Crossover: The act of swapping gene values between two potential solutions, simulating the "mating" of the two solutions.
- Mutation: The act of randomly altering the value of a gene in a potential solution.
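The three operations above can be sketched on simple integer-array "chromosomes". This is a minimal illustration under assumed names and representation, not JGAP's actual API:

```java
import java.util.Arrays;
import java.util.Random;

// Sketch of the three basic GA operators on integer-array "chromosomes".
// Names and representation are illustrative assumptions, not JGAP's API.
public class Operators {

    // Reproduction: copy a potential solution unchanged.
    static int[] reproduce(int[] parent) {
        return parent.clone();
    }

    // Crossover: swap gene values after a cut point, simulating the
    // "mating" of two solutions. Returns both resulting children.
    static int[][] crossover(int[] a, int[] b, int cut) {
        int[] childA = a.clone(), childB = b.clone();
        for (int i = cut; i < a.length; i++) {
            childA[i] = b[i];
            childB[i] = a[i];
        }
        return new int[][] { childA, childB };
    }

    // Mutation: randomly alter the value of one gene.
    static int[] mutate(int[] parent, Random rnd, int maxGeneValue) {
        int[] child = parent.clone();
        child[rnd.nextInt(child.length)] = rnd.nextInt(maxGeneValue + 1);
        return child;
    }

    public static void main(String[] args) {
        int[] a = {1, 1, 1, 1}, b = {0, 0, 0, 0};
        System.out.println(Arrays.toString(crossover(a, b, 2)[0])); // [1, 1, 0, 0]
    }
}
```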
As mentioned earlier, it's necessary to be able to evaluate how "good" a potential solution is relative to other potential solutions. The "fitness function" is responsible for performing this evaluation and returning a positive integer, or "fitness value", that reflects how optimal the solution is: the higher the number, the better the solution.
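As one concrete example, here is a fitness function for a hypothetical problem: evolving a bit string to match a fixed target pattern. The target and scoring scheme are assumptions for illustration only:

```java
// Fitness-function sketch for a hypothetical problem: evolve a string of
// bits to match a fixed target pattern. The target and scoring scheme are
// illustrative assumptions, not part of JGAP.
public class MatchFitness {

    static final int[] TARGET = {1, 0, 1, 1, 0, 1, 0, 0};

    // Returns a positive integer: 1 point per matching gene, plus 1 so the
    // value is never zero. Higher means closer to the target.
    static int fitness(int[] candidate) {
        int score = 1;
        for (int i = 0; i < TARGET.length; i++)
            if (candidate[i] == TARGET[i]) score++;
        return score;
    }

    public static void main(String[] args) {
        System.out.println(fitness(new int[]{1, 0, 1, 1, 0, 1, 0, 0})); // 9: perfect match
        System.out.println(fitness(new int[]{0, 1, 0, 0, 1, 0, 1, 1})); // 1: nothing matches
    }
}
```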
The fitness values are then used in a process of natural selection to choose which potential solutions will continue on to the next generation and which will die out. It should be noted, however, that the natural-selection process does not merely choose the top x solutions; the solutions are instead chosen statistically, so that a solution with a higher fitness value is more likely to be chosen, but is not guaranteed to be. This mirrors selection in the natural world.
A common metaphor for the selection process is that of a large roulette wheel. Remembering that fitness values are positive integers, imagine that each potential solution gets a number of slots on the wheel equal to its fitness value. Then the wheel is spun and the solution on which it stops is selected. Statistically speaking, solutions with a higher fitness value will have a greater chance of being selected since they occupy more slots on the wheel, but even solutions with just a single slot still have a chance.
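The roulette wheel can be sketched directly in code: each candidate gets a number of slots equal to its fitness value, and a random spin lands in one of them. The names below are illustrative, not JGAP's API:

```java
import java.util.Random;

// Roulette-wheel selection sketch: each candidate gets a number of "slots"
// equal to its positive integer fitness value; a random spin lands in one
// slot. Names here are illustrative assumptions, not JGAP's API.
public class RouletteWheel {

    // Deterministic core: given the fitness values and a spin in
    // [0, totalFitness), return the index of the selected candidate.
    static int indexForSpin(int[] fitness, int spin) {
        for (int i = 0; i < fitness.length; i++) {
            spin -= fitness[i];          // walk past this candidate's slots
            if (spin < 0) return i;
        }
        throw new IllegalArgumentException("spin out of range");
    }

    // Random wrapper: spin the wheel once and return the chosen index.
    static int select(int[] fitness, Random rnd) {
        int total = 0;
        for (int f : fitness) total += f;
        return indexForSpin(fitness, rnd.nextInt(total));
    }

    public static void main(String[] args) {
        int[] fitness = {6, 3, 1};       // 10 slots total
        // Slots 0-5 -> index 0, slots 6-8 -> index 1, slot 9 -> index 2.
        System.out.println(indexForSpin(fitness, 7)); // 1
    }
}
```

Even the candidate with a single slot (fitness 1) can still be selected, but only on 1 spin in 10.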
The above metaphor can also be useful for illustrating an additional point: relative fitness is measured as a percentage, not an absolute value. As a result, a solution with a fitness value of 255 has only about a 0.4% greater chance of being selected than a solution with a fitness value of 254. At higher fitness values, this effect becomes even more pronounced. To keep your evolving solutions from stagnating at some point, it's important to implement your fitness function in such a way as to provide reasonable relative differences between solution fitness values at the higher end of the scale. A future version of JGAP may include an "auto-redistribution" feature to automatically exaggerate the differences among near-optimal solutions, but for now a well-designed fitness function is required in these cases.
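The arithmetic behind this stagnation point can be worked out directly. In a two-candidate wheel with fitness values 255 and 254, the selection probabilities are nearly equal (a sketch, not JGAP code):

```java
// Worked numbers for the stagnation point above: with fitness values
// 255 vs. 254, the selection probabilities are nearly identical.
// A sketch for illustration, not JGAP code.
public class SelectionOdds {

    // Probability of selection = this candidate's slots / total slots.
    static double probability(int fitness, int totalFitness) {
        return (double) fitness / totalFitness;
    }

    public static void main(String[] args) {
        int total = 255 + 254;                       // 509 slots on the wheel
        double pBetter = probability(255, total);    // ~0.5010
        double pWorse  = probability(254, total);    // ~0.4990
        // The better solution's relative edge is only 255/254 - 1, about 0.4%.
        System.out.printf("%.4f %.4f%n", pBetter, pWorse);
    }
}
```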
To learn more about genetic algorithms, we highly recommend the book Genetic Algorithms in Search, Optimization, and Machine Learning by David Goldberg. To learn more about using JGAP, please see the tutorial: Getting Started with JGAP.
An external blog entry discusses solving a problem that is easy to understand but quite hard to solve, with source code provided. A good read!
Copyright © 2002-2015 Dr. Klaus Meffert / Neil Rotstan. All rights reserved.