Talk:Genetic algorithm

From Wikipedia, the free encyclopedia


Featured Article

I'd like to officially nominate this article for consideration as a featured article.
Tyler 19:12, 17 May 2007 (UTC)

Strong AI question

Doesn't Hubert Dreyfus include genetic algorithms in his 'Strong AI' category, together with neural nets?

Have his 'anti-Strong AI' fellow-travellers John Searle or Roger Penrose ever commented on this?

ericross

Sentence that doesn't make sense

I was reading this sentence and I don't think it makes sense. "Genetic programming algorithms typically require running time that is orders of magnitude greater than that for genetic algorithms, but they may be suitable for problems that are intractable with genetic algorithms."

Shouldn't the two instances of "genetic algorithms" (one immediately before the comma and the last one) be "non-genetic algorithms"?

I have a feeling that I've somehow misunderstood the topic.. Neoncow 20:53, 5 Oct 2004 (UTC)

"Genetic programming" (GP) is different from "genetic algorithms" (GA) due to historical nomenclature. The sentence is supposed to be comparing GPs with GAs, but it's very ambiguous as written. I'll rewrite it. --Lexor|Talk 07:03, 6 Oct 2004 (UTC)

Could this bit of jargon be merged here? Building block hypothesis--nixie 00:50, 23 Mar 2005 (UTC)

The building block hypothesis is not accepted by everyone in the field, and is best left in its own article. A reference to some of the theoretical hypotheses and observations could possibly be put here (there are some getting lost in the text); covering only the BB hypothesis would not be NPOV. --Anthony Liekens 02:51, 25 August 2005 (UTC)

GAs and Biological Realism

I have a big problem with the following line:

"A problem that seems to be overlooked by GA-algorithms thus far is that the natural evolution maximizes mean fitness rather than the fitness of the individual (the criterion function used in most applications)."

It is the consensus of the biological community that natural evolution acts to maximize individual fitness, even to the detriment of the group (mean fitness). [For non-professional reading see Richard Dawkins' book 'The Selfish Gene'.] There has been considerable debate in the past on this issue, and it has been settled in favor of individual fitness. —Preceding unsigned comment added by SteelSoul (talk • contribs) 03:21, 13 October 2007 (UTC)

  • That got deleted, and I comment below under "Mean Fitness...". The paragraph doesn't cite any references, so I'm fine with deleting it, but again, this article is about computer science and an optimization technique, not about population biology. A GA can have arbitrarily many sexes; a chromosome can be an SQL database or an OO hierarchy, instead of nucleotides; an individual can be immortal; nerds can date prom queens. It's not biology, it's inspired by biology. Pete St.John 17:21, 15 October 2007 (UTC)

I agree with you that the nature of this article is largely irrelevant to population biology. My only problem with the quote was that it was referring erroneously to traits of natural evolution. If the statement read something like "One significant problem facing the field of GAs is that they maximize individual fitness, when it is generally desired that mean fitness is maximized.", then I would have no problem with it (I'd probably even agree). --SteelSoul 05:00, 16 October 2007 (UTC) —Preceding unsigned comment added by SteelSoul (talk • contribs)

Some problems

This article was a very good overview of GAs. But I thought this sentence was a bit misleading:

The pool is sorted, with those having better fitness (representing better solutions to the problem) ranked at the top.

Although fitness numbers serve to rank individuals, actually sorting them is a very expensive process, rarely done in practice. Roulette and tournament selection were both designed to choose higher ranked individuals stochastically, without the need for explicit sorting. --Bytefield 15:06, 23 July 2005 (UTC)



I'm sorry, but you're completely wrong: while the Roulette and the Tournament are indeed faster than a complete sort, sorting even thousands of individuals takes microseconds these days, i.e. it is completely irrelevant w.r.t. the computation time for the genetic operators and the evaluation function. The real reason for the Roulette or the Tournament is not their speed, but the fact that they do not "sort" the individuals completely: they leave a small chance for underperforming individuals to end up near the top. This is crucial in avoiding premature convergence, as it gives a chance to reproduce even to individuals that seem no good. Why? Well, because we do not know whether they are really "no good" (remember, these are combinatorial problems!). -- EF (efalkena@ulb.ac.be)
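For readers following along, the tournament selection mentioned above can be sketched in a few lines of Python (a minimal illustrative sketch, not code from any GA library; the function and parameter names are my own):

```python
import random

def tournament_select(population, fitness, k=2):
    # Draw k individuals at random and return the fittest of them.
    # No global sort is needed, and because only a small random sample
    # is compared, individuals other than the current best still get
    # selected sometimes, which helps avoid premature convergence.
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)
```

With small k the selection pressure is weak (underperformers reproduce more often); raising k makes the tournament behave more and more like a full sort.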


GAs can rapidly locate good solutions, even for difficult search spaces.

This observation is overly optimistic and general. It should be more carefully worded or more specific. It would be interesting to give an example of a "difficult search space" where GA performs well and alternative methods fail.


Agree completely: GAs are by no means guaranteed to "rapidly locate" anything, even in simple search spaces.
The point is, a GA's encoding and operators must be adapted to the search space (i.e. the cost function) it works with in order to function properly. It must exploit any inherent structure the cost function has. Failing that, the GA will fail lamentably; but succeeding at that, it will probably beat any other method on that function. --EF (efalkena@ulb.ac.be)


I think this sentence should be reconsidered:

Unless the fitness function is handled properly, GAs may have a tendency to converge towards local optima rather than the global optimum of the problem.

I wouldn't say the problem of premature convergence is really due to improper 'handling' of the fitness function (what does 'handled properly' mean anyway?). Perhaps someone with more expertise in this subject could elaborate further; in my experience it's the genetic operations that can 'handle' this problem and ensure diverse search across the search space.

To go further, one could mention that premature convergence to a local minimum is less of an issue in GAs than it is in other approaches (gradient search, simulated annealing, etc). -- RK


I think I've corrected the first and third problems. I agree with the second one mentioned, but didn't know what to replace it with. — Asbestos | Talk (RFC) 17:12, 23 January 2006 (UTC)


Related techniques

The section on Tabu states that it "moves to the solution with the lowest fitness of those generated". This really sounds wrong and I think it should be "removes the solution with the lowest fitness". I am not an expert, but "The enhanced evolutionary tabu search and its application to the quadratic assignment problem" by John F. McLoughlin, III and Walter Cedeño seems to imply this, as well as "The niching method for obtaining global optima and local optima in multimodal functions" by Masako Himeno and Ryutaro Himeno. They both state that the lowest-fitness solution is replaced by a new one.

NOTE: I see that User:Diroth changed "lowest fitness" to "lowest energy" here. This sounds more logical. --Jdz 01:31, 25 April 2007 (UTC)


Another problem is that this text is repeated verbatim across several articles, including Ant colony optimization. I think there needs to be a single "List of localized optimization techniques" referenced from GA, ACO, etc. --Jdz 22:14, 21 February 2006 (UTC)

Building block hypothesis

This section may need attention. The quote appears to suggest that crossover is used in genetic algorithms to generate variation.

My understanding is that the function of crossover (sexual reproduction) in nature is to cross in sections of genetic code with mutations resulting in advantageous variations and cross out corresponding disadvantageous variations. By recombining two copies of a gene via crossover, the genetic carrier is made more robust to random variation which might result in corrupt code (since both copies would need to be corrupt in the same place to completely corrupt the genetic code). Thus crossover is in fact a way of limiting the effect of variation through mutation and collecting random variations which have increased fitness into a single sequence of genetic code, not a means of generating variation in that code.
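Whatever one's view of crossover's role in nature, the operator itself is simple; a one-point crossover on bit strings might look like this (a hedged sketch assuming a bit-string encoding; the names are mine):

```python
import random

def one_point_crossover(parent_a, parent_b):
    # Cut both equal-length strings at the same random point and swap
    # the tails; every position in each child comes from one parent,
    # so no new alleles are created, only recombined.
    point = random.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])
```

Note that, consistent with the argument above, this operator generates no new gene values at all; it only reshuffles material already present in the two parents.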


I added a link here to Holland's Schema Theorem. I think the author's intent in this section is to give a hypothesis for why GAs work. The GA uses 'building blocks' or schemas (high-fitness but partially unspecified parts of a solution) to 'build' the optimal solution. These schemas then become more frequent in each subsequent generation. The article gives the formula. To make it more complete, though, you'd have to dive into the literature to find out what is written on the matter, starting with Holland and Goldberg, I guess. 145.53.202.88 11:13, 16 November 2006 (UTC)
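For reference, the formula alluded to above is usually stated as follows (Holland's schema theorem in its common textbook form; exact notation varies between sources):

```latex
E[m(H,\,t+1)] \;\ge\; m(H,t)\,\frac{f(H)}{\bar{f}}
\left[\,1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m\,\right]
```

where m(H,t) is the number of instances of schema H at generation t, f(H) the average fitness of those instances, \bar{f} the population's average fitness, \delta(H) the schema's defining length, o(H) its order, l the string length, and p_c, p_m the crossover and mutation probabilities.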

History of genetic algorithms

The history given here credits Holland for introducing genetic algorithms. This is a common misunderstanding. Holland certainly widely publicized these methods among computer scientists, and he did introduce his Schema Theorem. His influence was great. But genetic simulations of evolution had been going on since Barricelli's work of 1954 and Alex Fraser's work of 1957. There were others in that period too: see David Fogel's collection of reprinted papers, 'Evolutionary Computation: The Fossil Record'.

In fact by the early 1960's there were many papers in the population genetics literature using simulated evolution, methods that cannot be distinguished from genetic algorithms (they have genotypes, fitnesses, mutation, and recombination). There were even two whole books on how to do genetic algorithms published well before Holland's in 1975:

  • Fraser, A. S. and D. Burnell. 1970. Computer Models in Genetics. McGraw-Hill, New York.
  • Crosby, J. 1973. Computer Simulation in Genetics. Wiley, New York.

If Holland invented the method, how do we account for its appearance in those books, and the many papers? Even if one restricts the definition of GA to attempts to solve some design problem outside of biology, Nils Aall Barricelli's work did that too.

I would appreciate any reactions. Perhaps I am missing some restriction of the definition of GA that would rule out Fraser's and Barricelli's work (and all the others), and I want to make sure this is acceptable before I modify the Wikipedia entry. -- Joe Felsenstein (whose 1974 paper on evolution of recombination used simulated evolution, and that too is before Holland)

Go ahead, your contribution is welcome. But do read WP:NOR first—if a Wikipedia entry is based on a "common misunderstanding", that is OK. Wikipedia is a tertiary source, and it aims only to collect and reproduce viewpoints, not correct them. You would need another source than your own expertise for the purposes of verifiability. (See also WP:V). And why not get an account? That makes it easier to talk to you. Best, Arbor 10:46, 25 April 2006 (UTC) Later addition: I think Fogel's work certainly qualifies as the type of source I was talking about. Carry on. Arbor 10:47, 25 April 2006 (UTC)
It is widely stated in academic publications that Holland takes the credit for introducing GA. I guess, whether it is a misunderstanding or not, Wikipedia should not dispute with academic publications on this topic. I cannot be bothered to attach any reference, as in almost every single GA academic publication, books or journal articles alike, Holland is specifically mentioned for his introduction of GA. - Yin
Holland is no doubt the person most responsible for publicizing GAs. He did not invent simulation of evolution: Barricelli, Fraser and others did. As I wrote, there were two books already in print before Holland's book. Just because so many academic publications get it wrong is no reason for Wikipedia to get it wrong too. Felsenst 14:44, 19 June 2007 (UTC)

The latest editing of this section says that "artificial evolution became a widely recognized optimization method as a result of the work of Ingo Rechenberg in the 1960s and early 1970s - his group was able to solve complex engineering problems through evolution strategies (1971 PhD thesis and resulting 1973 book)." This credits Rechenberg not only with pioneering work on artificial evolution, but specifically says that the wide recognition of the methods is due to his work. What is the evidence that subsequent work cited or was aware of Rechenberg's work? Holland's work was widely cited -- was Rechenberg's? Felsenst 14:50, 19 June 2007 (UTC)

My theory about this dispute is that what first was modelling of biological processes by computer (done by biologists) gradually became using the biological allegory to develop new optimization processes. A computer simulating evolution can be doing about the same thing but with two different goals: to study actual biology, or to optimize an engineering problem. One is part of the science of biology, the other is part of computer science (applied mathematics, artificial intelligence, control engineering...). Holland was not the first to model biological processes, but he was very early in using similar techniques as part of computer science itself, instead of as a laboratory tool in biology. I don't blame the biologists who pioneered computer simulations for wanting a share of the credit for GA, but Holland deserves a lot of credit for the development of the CS discipline, especially considering how little RAM those guys had in the 60's :-) Pete St.John 16:58, 22 August 2007 (UTC)
I actually agree with this -- there ought to be some way to write the history section to say that biologists such as Barricelli and Fraser pioneered the use of computers to simulate evolution, in the process figuring out how to carry out operations representing mutation, recombination, genetic drift, and natural and artificial selection. Then others, with Holland playing a large role as early and highly visible, used essentially similar methods to solve optimization problems (I think David Fogel's book gives a couple of examples of other early people doing optimization, but Holland's book seems to have caused the Big Bang in using GA for optimization). My insertion of the biologists into the History section was a reaction to the widespread misunderstanding that gives all credit for developing GAs to Holland, as if he were the first person to figure out how to do selection and recombination in a computer. Once I even had a biologist who had been reading about GA come to me and say he had this wonderful idea -- why didn't we use them to model evolution? I had to gently tell him he was 30 years late. BTW I do suspect that Rechenberg is not the person whose work caused artificial evolution to be "widely recognized". Felsenst 05:55, 29 August 2007 (UTC)

I think there is a big misunderstanding about why Holland is getting credit. The work he did was formalizing the first schema theorem; thus he opened the big door for the use of GA/EA, since one could mathematically show that solutions will converge to "some kind of optimum", or at least that the fitness of children will increase, up to some point, with some probability p. In my opinion this is a very important step, so he deserves big credit for it. Who came up with the idea first is not of great importance, since their theory had no proof or scientific basis, just a hunch that since we have evolution one could use it for solving problems. Nevertheless they shouldn't be forgotten. GA Fantastic (talk) 08:51, 7 December 2007 (UTC)


"The history given here credits Holland for introducing genetic algorithms. This is a common misunderstanding. Holland certainly widely publicized these methods among computer scientists, and he did introduce his Schema Theorem. His influence was great. But genetic simulations of evolution had been going on since Barricelli's work of 1954 and Alex Fraser's work of 1957. There were others in that period too: see David Fogel's collection of reprinted papers, 'Evolutionary Computation: The Fossil Record'."

bla bla bla

Holland is the granddaddy of GA. You mention books written in the 60's; what about David E. Goldberg's 1953 "GAs in Search, Optimization and Machine Learning", in which he gives credit to his supervisor John Holland. If you have not read that book you should not be commenting on this article. Yin should put his/her Google skills to use somewhere else. —Preceding unsigned comment added by 77.99.241.77 (talk) 14:42, 3 February 2008 (UTC)

I am not Yin but it was I who wrote that quote. Certainly if David Goldberg had written a book in 1953 discussing genetic algorithms, and if this book was a result of a Ph.D. thesis with John Holland, it would mean that the material I added to the page about Barricelli and Fraser would be wrong in crediting them with being pioneers. My (not Yin's) lack of Google skills would be exposed and I should remove the material I added, apologize, and retire from commenting on the GA page. However ... Goldberg's book seems to have been published in 1989. His Ph.D. thesis was actually entitled "Computer-aided gas pipeline operation using genetic algorithms and rule learning." His Ph.D. was granted in 1983, not 1953. Had he been working on it for 30 years? Maybe it was his second Ph.D. with Holland, the earlier having been taken 30 years before? Let's see some evidence. I do agree that Holland was the father of GA, not in having been first but in making these techniques widely known. But he was not first to introduce these methods. Felsenst (talk) 16:01, 3 February 2008 (UTC)

His book was edited in 1989; it was first published in 1953. It's sat in front of me, do you want a picture?

Absolutely. Using my (limited) Google skills I find that in 1953 John Holland was 24 years old, with his first publication on GAs not to come until 1962, and David Goldberg would not receive his B.S.E. undergraduate degree until 1975. Our University library's listings do not show him as having a 1953 book, but they do show him as being born in 1953. Most U.S. universities did not have a computer in 1953. Are you sure you're not mixing up his birth date with the date of publication? You can find my email address on my web page, which is linked to at my Wikipedia author page. Felsenst (talk) 20:19, 3 February 2008 (UTC)
Yeah, citing "Goldberg 53" would be ludicrous. However, the distinction I fail to make convincingly (but I'll keep trying) is between using the biological allegory as an optimization technique vs using a computer to model a biological process. Yes, biologists did the latter, earlier. That's not the point. Holland was the first (known to me) to do the former. Specifics to the contrary are welcome, if I've missed something. Pete St.John (talk) 21:49, 5 February 2008 (UTC)
I can see the point, and that would argue for greatly reducing the size of the mention of geneticists' work introducing genetic simulations before Holland. However, Holland was not the first to use genetic algorithms to solve nonbiological optimization problems. In fact, Barricelli's initial paper had little organisms solving a nonbiological problem. There were others soon after who did this as well -- Fogel in his book "Evolutionary Computation: the Fossil Record" reprints these papers (I need to locate my copy). The two efforts are somewhat entangled with each other. Holland certainly should get credit for being the person who publicized the use of GAs for optimization (in the current page Rechenberg is given credit for this). He also introduced the Schema Theorem (although it is often misunderstood as proving that fitness will increase, which it doesn't). Felsenst (talk) 16:16, 9 February 2008 (UTC)
"somewhat entangled" I can buy :-) I just recently stumbled on a lab in Spain where biologists are applying biology to computer science, very interesting. As long as we seek to clarify the entanglement, instead of turf-fighting over precedence, I'm happy for inclusion of (entangled, relevant) genetics material. I'll be interested in the citation to the Fogel and Barricelli stuff, maybe there's something online somewhere? Thanks, Pete St.John (talk) 20:18, 11 February 2008 (UTC)
David Fogel's 1998 book "Evolutionary Computation: The Fossil Record" is a wonderful collection of many of the earliest papers using genetic algorithms. It includes a number of people who used them very early to solve nonbiological optimization problems. My copy is misplaced but there is available a Table of Contents of the book, which should enable you to track down most of the papers. It is here as Excerpt 3. I can also be contacted by email (my address is on my web site, linked to on my user page). Felsenst (talk) 18:23, 12 February 2008 (UTC)
I'd prefer that people out to establish use of genetic algorithms as an optimization technique prior to Holland, buy expensive books and research their case :-) One challenge of historical analysis is that old books aren't digital and references often aren't digital. I'm not on a campus, so much as I'd be interested by the Barricelli paper (for example) I won't likely dig it up myself. Pete St.John (talk) 18:53, 12 February 2008 (UTC)
That said, the dual appointment you have is very cool; plainly you are in a great position to address the entanglement. But you have graduate students! Put them to work. I believe that even math students should be able to write. Granted biology grad students aren't quite that calibre ;-) Pete St.John (talk) 18:57, 12 February 2008 (UTC)

Observation removed from the main page

  • GAs cannot effectively solve problems in which there is no way to judge the fitness of an answer other than right/wrong, as there is no way to converge on the solution. These problems are often described as having "needle in a haystack" fitness landscapes.

The "needle in a haystack" problem referred to here is hard to optimize by any method. Binary fitness values are not enough for a problem to be "a needle in a haystack"; in addition it should have only one maximum (a needle). It is easy to think of many cases with binary fitness values that a GA can optimize well; the simplest are line patterns on a 2d grid. Anyway, the observation is not correct as stated.

You're right that the NIAH problems usually refer to those with only one solution, but I think the binary fitness part of the problem is significant. Or, rather, the fitness doesn't need to be binary, but there must be a sharp distinction between correct and incorrect, as this is what makes the optimization difficult (as the GA has no slope which it can climb). Anyway, the point of the statement is to say that "problems with binary fitnesses are difficult," not that "NIAH problems are defined as those problems with binary fitness." I think that this is pretty much agreed upon for basic GAs, as GAs need slopes to climb. What exactly do you mean by the "line patterns on a 2d grid" that are easy to optimize? — Asbestos | Talk (RFC) 17:59, 17 July 2006 (UTC)
I've replaced the sentence, but removed the irrelevant needle-in-a-haystack statement. I've added that a random search can typically find such solutions just as fast as a GA in these situations, as a GA is essentially acting as a large random parallel search algorithm. — Asbestos | Talk (RFC) 13:30, 19 July 2006 (UTC)
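To make the "no slope to climb" point concrete, here is a toy contrast between an all-or-nothing fitness and a graded one (a hypothetical bit-string example of my own, not text from the article):

```python
def binary_fitness(candidate, target):
    # All-or-nothing: every wrong answer scores the same, so selection
    # cannot prefer "nearly right" over "totally wrong".
    return 1.0 if candidate == target else 0.0

def graded_fitness(candidate, target):
    # Partial credit (number of matching bits) creates the slope that
    # a GA's selection pressure can actually climb.
    return sum(c == t for c, t in zip(candidate, target))
```

Under binary_fitness the population drifts at random until some individual happens to hit the target, which is why the GA degenerates to a parallel random search in that case.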


Local search?!

Specifically it falls into the category of local search techniques and is therefore generally an incomplete search.

Um, sure, GAs don't always find the global optimum, but what method does? I can't think of a more global method than a GA, which uses the whole search space, as opposed to truly local methods, which search only the neighborhood of the given solution candidate. Consider a hill-climbing local algorithm, for example: it halts when it reaches a local optimum. GAs may stall in such a position, but in theory they should eventually escape from it.--JyriL talk 06:38, 2 August 2006 (UTC)

A method that always finds the global optimum is one that searches every possibility, which a GA doesn't do. I assume that this is the distinction implied by the difference between a global search and a local search. — Asbestos | Talk (RFC) 14:52, 3 August 2006 (UTC)
No, that's the difference between complete(?) and incomplete search methods (or do you mean the difference between deterministic and stochastic methods?). Most global optimization methods, including GAs, belong to the latter, as the former is usually impossible to implement.--JyriL talk 16:37, 3 August 2006 (UTC)
Sorry — I wasn't clear. I wasn't saying that a GA was a local searching algorithm, rather that this was probably what the original author of the sentence had in mind (plus I was answering the "what method does?" rhetorical question). I agree that a GA is not a local searching algorithm. For the record, complete searches are not impossible to implement, in fact they are many times easier to program than a GA (any problem that is solvable by a GA can be solved by a brute-force search). The problem is that they take too long compared to any stochastic method. Anyway: yes, the offending sentence is incorrect, and should go (though the part about it being incomplete is correct). — Asbestos | Talk (RFC) 18:25, 3 August 2006 (UTC)
OK, I was afraid I've made a grand misunderstanding on the issue. ;) "Implement" was wrong word, I meant exactly what you said.--JyriL talk 18:30, 3 August 2006 (UTC)
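As an aside, the complete (brute-force) search discussed above really is only a few lines; a toy sketch over bit strings (the names are mine, and the 2**length cost is exactly why it is rarely usable):

```python
from itertools import product

def brute_force_search(fitness, length):
    # Complete search: enumerate all 2**length bit strings and keep the
    # best. Unlike a GA this is guaranteed to find the global optimum,
    # but the running time grows exponentially with string length.
    candidates = (''.join(bits) for bits in product('01', repeat=length))
    return max(candidates, key=fitness)
```

This is the trade-off noted above: completeness is trivial to program but only affordable for small search spaces, while a GA samples the space stochastically and gives up the guarantee.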

Can someone write the article in simple words, please?

How do you mean that? Please be more specific. Tulkolahten 12:08, 16 November 2006 (UTC)

See Also

Evolutionary Strategies are mentioned in the Related Techniques section. The See Also section is currently redundant and should be removed. Keki Burjorjee 07:36, 30 January 2007 (UTC)

References

The references section has gotten quite long. Many of the references describe applications of GAs or boutique genetic algorithms and aren't cited in the body of the article. I suggest removing the following ones.

   * Fentress, Sam W (2005), Exaptation as a means of evolving complex solutions, MA Thesis, University of Edinburgh. (pdf)
   * Harvey, Inman (1992), Species Adaptation Genetic Algorithms: A basis for a continuing SAGA, in 'Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life', F.J. Varela and P. Bourgine (eds.), MIT Press/Bradford Books, Cambridge, MA, pp. 346-354.
   * Hicks, C., (2006), A Genetic Algorithm tool for optimising cellular or functional layout in the capital goods industry., International Journal of Production Economics, 104(2), 598-614.
   * Kjellström, G. Network Optimization by Random Variation of component values. Ericsson Technics, vol. 25, no. 3, pp. 133-151, 1969.
   * Kjellström, G. Optimization of electrical Networks with respect to Tolerance Costs. Ericsson Technics, no. 3, pp. 157-175, 1970.
   * Kjellström, G. & Taxén, L. Stochastic Optimization in System Design. IEEE Trans. on Circ. and Syst., vol. CAS-28, no. 7, July 1981.
   * Kjellström, G. On the Efficiency of Gaussian Adaptation. Journal of Optimization Theory and Applications, vol. 71, no. 3, Dec. 1991.
   * Kjellström, G. & Taxén, L. Gaussian Adaptation, an evolution-based efficient global optimizer; Computational and Applied Mathematics, In, C. Brezinski & U. Kulish (Editors), Elsevier Science Publishers B. V., pp 267-276, 1992.
   * Kjellström, G. Evolution as a statistical optimization algorithm. Evolutionary Theory 11:105-117 (January, 1996).
   * Kjellström, G. The evolution in the brain. Applied Mathematics and Computation, 98(2-3):293-300, February, 1999.
   * Pongcharoen, P., Hicks, C., Braiden, P.M. and Stewardson, D., (2002), Determining Optimum Genetic Algorithm Parameters for Scheduling the Manufacture and Assembly of Complex Products. International Journal of Production Economics, 78(3), 311-322.

Jasper53 07:58, 30 January 2007 (UTC)

My intention is to supply an article about Gaussian adaptation, which was used as early as 1969 for the maximization of manufacturing yield of signal processing systems. Later it turned out that yield is not very far from the mean fitness of populations of living organisms.

Kjells 16:07, 30 January 2007 (UTC)

I do not feel that the Kjellstrom references will be helpful to someone who is seeking to understand what GAs are and how they work. The references might be more appropriate in an article about the history of evolutionary algorithms.

Text about Gaussian adaptation as a genetic algorithm has been added among GA variants, and two references to Kjellström have been introduced again. But, of course, the same references may be found in the article about Gaussian adaptation.

Kjells 20:09, 21 March 2007 (UTC)

I've incorporated some of the references as in-line citations. If other people can help to incorporate the remaining references into the text, then the ones that remain can be considered for removal. -- Bovineone 05:41, 24 June 2007 (UTC)
I apologize for not having included a reference to Rechenberg in the paper about Gaussian adaptation. A reference is available in an earlier paper: Kjellström, G. Evolution as a statistical optimization algorithm. But I am not sure that Rechenberg used Gaussian adaptation, because it relies on a certain theorem that was not discovered by Rechenberg, I suppose. In addition, focus is on maximization of mean fitness and average information, not solely the fitness of the individual.--Kjells 06:37, 25 June 2007 (UTC)
If I remember rightly, the probability of success in Rechenberg's algorithm was equal to 0.18 instead of 0.37, as it should be in Gaussian adaptation. The reason is that Rechenberg maximizes the fitness of the individual instead of mean fitness and average information. This follows from the theorem of efficiency as published in Kjellström, G., On the efficiency of Gaussian adaptation. So I can't see that he used Gaussian adaptation at that time.--Kjells 13:09, 25 June 2007 (UTC)
Some lines about Rechenberg have now been included in the article on Gaussian adaptation. And I don't think that Rechenberg was prior to Kjellström with Gaussian adaptation, so I think that the corresponding line should be removed.--Kjells 06:19, 26 June 2007 (UTC)

GAs now important in statistical experimental design

Dfarrar 22:15, 21 March 2007 (UTC)

  • Dfarrar, you created a section title but no section? Pete St.John 15:24, 22 March 2007 (UTC)


Rechenberg and Gaussian adaptation

Gaussian adaptation is about maximization of manufacturing yield and tolerancing. Later this has been referred to as centering and tolerancing. As a model of evolution, Gaussian adaptation is said to carry out a simultaneous maximization of mean fitness and average information with the aid of a certain theorem of Gaussian adaptation, not discovered by Rechenberg.

After having searched the literature in these contexts I can't see that Rechenberg has anything to do with Gaussian adaptation. He has not been referenced in papers about centering and tolerancing. I have therefore removed the lines in the article about Rechenberg and Gaussian adaptation.

Evolution strategy, discovered by Rechenberg, is included as a variant of a genetic algorithm.--Kjells 06:32, 27 June 2007 (UTC)

After the latest changes the reference to the first usage of Gaussian adaptation has disappeared. It has now been included again.--Kjells 18:31, 9 July 2007 (UTC)


Inversion operator

There has been a discussion on the inversion operator on my talk page. I think it would be better to describe this as a rule of genetic reordering variation among other mutations rather than by a single word in the text.--Kjells 09:45, 24 July 2007 (UTC)

Examples

I think it could be useful to add a link or two to some genetic-algorithm applications or an evolution movie (for example: a walker). That way it would be clearer to users how GAs work...

I have added a link (in the External links section) to a tutorial that presents a genetic algorithm step by step. I think it is useful for people who, after reading the theory, would like to program one. It could be seen as an example done by the readers themselves. The link is http://fog.neopages.org/helloworldgeneticalgorithms.php. Swarmcode 12:47, 17 August 2007 (UTC)

yeah and it got removed (once anyway). It's really not so bad, but the English can be improved. Keep working on it, be persistent and be patient :-) My $0.002 worth of advice. Pete St.John 15:59, 17 August 2007 (UTC)
Hey, I agree this site is pretty good, and useful if the English was a little better, but it doesn't meet the criteria of the wikipedia project. Ultimately that is what is important. I would love to link to my homepage and dissertation on GA's, but I'm not about to! That is what a userpage is for, which I'm pleased to see Swarmcode has now got, :D welcome! MattOates (Ulti) 19:23, 17 August 2007 (UTC)
Following the link Swarmcode just gave above, I realize that you have to click on the DNA looking “HelloWorld” image to get to that article. I didn't even figure that out from the original link! My question is: would anyone else notice? Would they see this as a well-referenced & authoritative source, worth being directly linked from the wikipedia article compared to any other website on the net? If the answer to these is honestly no, should it be linked? Sorry if I sound like an ass (I know that I do :[ ) I'm just tired of following external links to find they aren't as informative as I would hope. Quick P.S. Swarmcode, you might like to contribute to Wikiversity with that material, since it is way more in keeping with the content they are responsible for. MattOates (Ulti) 19:44, 17 August 2007 (UTC)

Well, my site is not well-referenced because there are plenty of other sites out there that are well referenced. It is about practical things: how to implement one. Although I have looked for practical tutorials about GAs, I have not found any out there, and in this way my site is unique (the user has code at the end to play with). Check out NeHe and you will see why their tutorials are so easy to understand and learn from. They did not have references, but did they need them? If somebody wants references, they can easily find them in a book or papers, or even in the two sites cited in the external links of Wikipedia. My tutorial, and my answer here, is about where someone could learn something more practical (in Wikipedia, for example): something to run and compile right after the tutorial, not only some letters. If you don't want to link my tutorial, that is OK, but as a user of Wikipedia I am very sad about not linking to sites that give practical things. Not that theory is not good, but it is not enough... I think the first guy who posted the question had the same feeling as me. Swarmcode 00:16, 22 August 2007 (UTC)

What about Genetic Arm? It should be a good example... —Preceding unsigned comment added by 146.133.1.43 (talk) 14:16, 28 December 2007 (UTC)

[edit] Mean fitness, and biology vs computer science

Steelsoul reverted the paragraph "...A problem that seems to be overlooked by GA-algorithms thus far is that the natural evolution maximizes mean fitness rather than the fitness of the individual (the criterion function used in most applications)...", reason given that (paraphrased) "consensus in population biology contradicts". I let the revert stand because the item did not cite any references. However, in general, I hope we remember that this is about computer science, not population biology. We should not draw inferences from one field about the other, without references. This has been a source of confusion in the article, particularly the historical material, where early models of population biology have been confused with early biology-inspired optimization techniques, in arguing over precedence. Pete St.John 17:14, 15 October 2007 (UTC)

[edit] Proportion of individuals is/are

I reverted this sentence: "Most functions are stochastic and designed so that a small proportion of less fit solutions is selected..." to "...are..." (italics mine). First, if we swap back and forth between Americanisms ("are") and Britishisms ("is") then we will wear ourselves out. I advocate tolerating each other's regionalisms, although it will be a good thing when we all converge. Second, however, in this case the Americanism is a bit more precise, because in the algorithm the proportion is a parameter and the individuals (plural) are individually selected. "Selection" refers to the individuals, not the proportion; but the proportion of such individuals (of less fitness) is small. So IMO the original statement is slightly better. Pete St.John 16:13, 22 October 2007 (UTC)
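To illustrate the algorithmic point behind the grammar: in a stochastic scheme such as roulette-wheel (fitness-proportionate) selection, the individuals are selected one by one, and a small proportion of the selected individuals turn out to be the less fit ones. A minimal sketch (the toy population, fitness function, and helper name are my own illustrative assumptions, not from the article):

```python
import random

# Roulette-wheel selection: each individual is chosen with probability
# proportional to its fitness, so less fit individuals are still
# selected occasionally, just in a small proportion.
def roulette_select(population, fitness, rng):
    total = sum(fitness(x) for x in population)
    pick = rng.uniform(0, total)
    acc = 0.0
    for x in population:
        acc += fitness(x)
        if acc >= pick:
            return x
    return population[-1]

rng = random.Random(1)
pop = list(range(1, 11))       # toy individuals 1..10
fit = lambda x: float(x)       # toy fitness = the value itself
picks = [roulette_select(pop, fit, rng) for _ in range(1000)]
# Share of selections that went to the three least fit individuals:
weak_share = sum(p <= 3 for p in picks) / len(picks)
```

Here `weak_share` comes out small but nonzero (the expected value is 6/55, about 0.11), which is exactly the "small proportion of less fit solutions is selected" behavior under discussion.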

[edit] Applications

I notice that there is a rather long list of applications... I think we should be sure that those are really applications for which genetic algorithms have been researched, and possibly even have shown some degree of success.

For instance, I am a bit surprised at the mention of code-breaking, as it is a typical case where 1) there is no way to ascertain the quality of a solution, and 2) a "close to optimal" solution is worthless.

The list is slightly too long already, and if we don't shorten it and find references, we might just add to it "finding the answer to life, the universe and everything" Ratfox 08:35, 13 November 2007 (UTC)

I'm gonna revert "hardware bug finding during validation" and post to the contributor's talk page asking for a citation. Pete St.John (talk) 17:49, 26 November 2007 (UTC)

[edit] Belongingness to Molecular and Cellular Biology WikiProject.

Hi! I was close to a shock when I saw that this page was added to the Molecular and Cellular Biology WikiProject. I don't agree, and I don't understand why someone would do this. This topic belongs, without a doubt, to computer science, so it should be removed from the Biology project.

Regards GA Fantastic (talk) 08:56, 7 December 2007 (UTC)

I agree, it shocked me too for the same reason. Part of the explanation may be the claims made by biologists that early computer models of genetic processes were precursors to genetic algorithms; it's a credit dispute (see above). As computer science, I think we pretty much agree that Holland was very early and deserves a lot of credit (to me, it's amazing to try and do GA with such tiny amounts of memory) but we can also agree that as a model of a biological process, GA built on work done by biologists (including computer simulations). I don't so much mind including this in a bio project, but we should be in AI and CIS projects also. Pete St.John (talk) 19:49, 7 December 2007 (UTC)
That said, I see the list of categories on the article page is very reasonable. One might add "AI" but "search algorithms" and "evolutionary algorithms" are already subcategories of AI. I'll go drop a note on the cellular bio project page and see if they still care. Pete St.John (talk) 19:57, 7 December 2007 (UTC)
so, I just dropped a (lengthy) note about it at Wikipedia_talk:WikiProject_Molecular_and_Cellular_Biology#Miscategorization_of_Genetic_Algorithms Pete St.John (talk) 20:24, 7 December 2007 (UTC)
I added the Wiki project Computer Science template, so even if people are confused, they will know they are confused :-)
Well, I'm the one who put the MCB template on here way back when I was tagging hundreds of articles. The reason I tagged it was because GAs have their application in various aspects of computational molecular biology. I found other bioinformatics algorithms, e.g. BLAST, were tagged with the MCB template as well. In computational molecular biology, GAs are used for multiple sequence alignments and phylogenetic tree building, for example. Therefore, I felt this article was within the scope of bioinformatics/computational molecular biology. - tameeria (talk) 22:59, 7 December 2007 (UTC)
I like to say that GA is never the best solution, but always an adequate one, so it's good for problems that lack well-developed theoretical frameworks. However that may be, GA has lots of applications. I have used it for Extremal Graph Theory-- and in fact, an algorithm is a tree, which is a kind of graph, so really GA should be in the Graph Theory project :-) The interconnections are diverse, varied, and numerous. If you take a gander at the list of applications you will see many fields; it would not be helpful to insert GA into every such category. I hate to be a "Streamliner" but I vote to drop the MCB category. For example, you can find GA in the course catalogs at CS departments, but not Bio, correct? We all use computer techniques, from Word Processing to Nonlinear Optimization, but those things aren't necessarily parts of our fields of application. Now Genetic Computing, that would be tough to categorize. That, I think, would be both CIS and MCB (well, not cellular but molecular). And using a Genetic Computer to run a Genetic Algorithm to optimize the fit of a Model of Genetics would be...confusing :-) Also I think I should add that GA, which we mostly think of as Nonlinear Optimization (I think of it as AI), predates Bioinformatics, and I don't think particularly pertains to it. I'm sure they use lots of algorithms in Bioinformatics. Pete St.John (talk) 23:30, 7 December 2007 (UTC)
GA is at least mentioned if not covered in some detail in bioinformatics classes taught to biology students. These students may then come here to learn more about e.g. how it works, what it's used for, and how it differs from other search algorithms. Biologists with an interest in sequence alignments, phylogenetics, or structure and pattern analysis may come here. And yes, sure, they use lots of different algorithms and most of them will be on the more or less clueless "user" rather than "programmer" side. The question that might bring a biologist to this page might be whether GA would be an appropriate choice for their search problem and what advantages/disadvantages it might have over the other options. I'm not sure that's clear from this page at all.
Also, Wikiprojects are not exclusive but there can be a whole list of them collaborating for any page so I'm not sure "belongingness" would be an issue. Many articles have a whole list of Wikiprojects collaborating on them. There is no doubt that this subject is part of computer science, but the term "genetic algorithm" also brings up 1372 publications in PubMed indicating its importance for the life sciences as well. - tameeria (talk) 02:47, 8 December 2007 (UTC)
You are making a good point here, but I would like to propose a more correct model. I think it would be handy if pages could be tagged as "is of importance for" or "is of interest for". This would clear up all confusion. I think articles should have a clear and straightforward belongingness. Then, if they are of interest to particular groups, they could be tagged as such. Uses of GAs can be found in physics, expert systems, physical chemistry, crystallography, math, and so on. Also, a lot of computer science is of big interest to many different science branches. Adding those to different wiki projects wouldn't be helpful, only confusing. Thank you.

GA Fantastic (talk) 07:33, 8 December 2007 (UTC)

There are many interdisciplinary topics on Wikipedia and it's often impossible to sort them into a single category. For example, look at oxygen - that article has tags for three Wikiprojects: Elements, MCB, and Medicine. The tags indicate which Wikiprojects consider oxygen as being of importance to them. I thought the Wikiproject templates were just that: indications of what is of interest for a particular project. Regarding importance, there's actually a rating feature built in. So I've rated this one low importance for MCB because quite frankly I think there will only be a small subgroup of molecular biologists really interested in this. If you want to rate the importance for the Computer Science project as high, I think that would help solve the confusion about which project the topic is more important for. I've rated its class as B for the MCB project. It's a good article, but it is obvious it was written neither by nor for biologists, so it could be improved with some information putting it into the context of bioinformatics, maybe. Again, your rating may vary if you think the article is a complete coverage of the topic from the computer science side. - tameeria (talk) 15:26, 8 December 2007 (UTC)
I'm thinking it's mainly the difference between a method of study and an object of study. We all can use GA as a tool in our study, but GA is an object of study in CS. Biology papers would be about using GA to learn something about Cells, while CS papers would be about using Mutation Rate for better GA. Ironically, my own interest is using GA to evolve better GA, so (in this activity) it's both the tool and the object. Pete St.John (talk) 21:05, 11 December 2007 (UTC)
In the meantime the article had been moved from WP:WPMCB to Wikipedia:WikiProject Mathematical and Computational Biology. I don't agree with this, for the same reasons as for MCB. Computational biology uses computational methods to solve biological problems, while GA is a biologically inspired methodology in computer science. Thus, I've removed the banner. Hendrik Fuß (talk) 13:45, 17 January 2008 (UTC)
I would myself just have the one category, but the WPMCB folks seem to like it, I think partly on historical grounds and partly from ongoing crossdevelopment from using GA to model some genetics processes. So I was content to have both categories CS and Computational Bio (which was a better fit than MCB); at worst, it would seem to do no harm. The MCB cat I think could confuse people unfamiliar with the subject. Pete St.John (talk) 19:17, 17 January 2008 (UTC)

[edit] "Belongingness" revisited

I just wanted to point out an interesting reference: Natural Computing Group (at Politecnica de Madrid, but the site is in English) is a group of biologists (sorta) working on applying biology (in more than one way) to Computer Science (as opposed to using computers to model biology; everyone uses computers to model what they do). So for example:

  • More recently, two clearly biologically-inspired paradigms are being applied very effectively to solve applied problems in business and industry: artificial neural networks and genetic algorithms. However, there is still room for advancement. Instead of developing computing systems inspired by biological processes, biological substrates and biological processes can be used directly to codify, store and handle information, as Leonard Adelman demonstrated with his pioneering experiment on DNA computing.

So I think I was too dismissive of the Molecular-Cellular Bio categorization, but of course the Mathematical-Compuational Bio cat is a better fit. (Funny they both have the acronym MCB.) Pete St.John (talk) 15:46, 31 January 2008 (UTC)

[edit] Exact vs approximate solutions

In a discrete system (such as optimizing graphs with large numbers of nodes), exact solutions can be found with GA. So I'm considering reverting that change just now. Pete St.John (talk) 20:24, 7 December 2007 (UTC)

[edit] Genetic memory deletion

The article genetic memory is currently up for deletion. I've done some literature digging and found that the term is also used in computer science publications in connection with genetic algorithms. So just a heads-up in case anyone wants to comment on the delete discussion. - tameeria (talk) 15:47, 12 December 2007 (UTC)

Addendum: The discussion has been closed and the result was to keep the article with the suggestion to do a major rewrite, so in case anyone from the computer side wants to contribute by adding/rewriting please feel free to do so. - tameeria (talk) 19:06, 12 December 2007 (UTC)

[edit] Pseudo-code algorithm

Is this correct? The link below shows a more accurate pseudo-code, in my opinion.

http://www.obitko.com/tutorials/genetic-algorithms/ga-basic-description.php —Preceding unsigned comment added by 77.99.241.77 (talk) 14:49, 3 February 2008 (UTC)
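For reference, here is a minimal runnable sketch of the canonical generational loop that most pseudo-code descriptions of a simple GA (including the linked tutorial's) boil down to: selection, crossover, mutation, replacement. The toy "one-max" fitness function (maximize the number of 1 bits) and all parameter values are my own illustrative assumptions, not taken from the article or the tutorial:

```python
import random

def one_max(individual):
    """Toy fitness: the number of 1 bits in the string."""
    return sum(individual)

def evolve(pop_size=20, length=16, generations=50,
           crossover_rate=0.7, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Random initial population of bit strings.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Binary tournament selection: the fitter of two random picks wins.
            a, b = rng.sample(pop, 2)
            return a if one_max(a) >= one_max(b) else b
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)      # single-point crossover
                p1 = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in p1]                 # bit-flip mutation
            new_pop.append(child)
        pop = new_pop                               # generational replacement
    return max(pop, key=one_max)

best = evolve()
```

Whether the article's pseudo-code or the tutorial's matches this skeleton more closely is exactly the question above; the differences between published versions are mostly in the selection scheme and the replacement strategy.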

[edit] application proliferation

A challenge we have with this article is the very large number of applications and dissertations which can be linked here (generally by their authors) but which can collectively bloat the article. In the recent case, I reverted such an addition just because it introduced a typographical error. I propose we insist that if someone wants to link his thesis, he at least do a good job of it :-) But seriously, theses that help us understand the general subject are welcome. Otherwise, they may be more pertinent in articles about the application rather than the method. Pete St.John (talk) 19:31, 25 February 2008 (UTC)

[edit] Cat: algorithms

Andreas deleted the category "algorithms"; I'll go ask why at his talk. I would agree that GA hasn't gotten much attention from complexity theorists (but maybe should). But surely they are algorithms. Pete St.John (talk) 22:01, 25 February 2008 (UTC)

This article is already listed in the categories "Evolutionary algorithms", "Genetic algorithms", "Optimization algorithms" and "Search algorithms". All four of these categories are subcategories of the "Algorithms" category. The "Algorithms" category should list very few article pages directly and should mainly contain subcategories. This is why I thought that this article doesn't need to have the "Algorithms" category. Andreas Kaufmann (talk) 22:33, 25 February 2008 (UTC)
That makes sense. If in general Algorithms categories are getting hierarchically structured, that's great, and I don't want to interfere with the people doing that good work. So please go ahead. Pete St.John (talk) 18:20, 26 February 2008 (UTC)

[edit] schemata vs cylinder sets

After a quick skim of the concept of Holland schemata, I get the impression that these are one and the same, identical to what the rest of the world calls cylinder sets. Specifically, for binary strings of length 6, a "schema" such as 0**1*1 would be called an open (cylinder) set in the product topology on 2^6. Is there any way in which a schema is anything other than a cylinder set in the product topology? linas (talk) 21:42, 5 June 2008 (UTC)
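To make the claimed correspondence concrete: a schema is a membership test that fixes some coordinates of the bit string and leaves the rest free, which is precisely a cylinder set in the product space {0,1}^n. A small illustrative sketch (the helper name `matches` is hypothetical, just for this example):

```python
from itertools import product

def matches(schema, s):
    """True if string s belongs to the schema ('*' = free coordinate)."""
    return all(c == '*' or c == b for c, b in zip(schema, s))

schema = "0**1*1"   # fixes coordinates 0, 3, 5; leaves 1, 2, 4 free
members = [''.join(bits) for bits in product('01', repeat=len(schema))
           if matches(schema, ''.join(bits))]
# The schema fixes 3 of the 6 coordinates, so it contains 2^3 = 8 strings:
# exactly the cylinder set with those three coordinates held fixed.
```

The enumeration confirms the set-theoretic identity; whether the two bodies of theory built on top of that identity say the same things is a separate question.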

[edit] No Mention of the Schema Theorem??

First of all I want to say right up front that this article should not be nominated for anything. It does not clarify this topic at all, but only makes it seem more obscure. I have several things to say here. First of all, we have these various phrases floating around whose meanings should be laid to rest once and for all by Wikipedia. Among these are Evolutionary Algorithms, Genetic Algorithms, Genetic Programming, Evolutionary Strategies. We should emphasize that GAs use a phenotype and a genotype in each individual, and that the existence of the genotype is the defining aspect of a GA. The mention of a genotype and a phenotype needs to happen right at the top of this article.

Genetic Programming is completely different in that it literally evolves code strings. There is no aspect of the expression of a genotype into a phenotype in Genetic Programming. Therefore it is not the case that Genetic Programming is a "subset" of GAs. Genetic Programming is the most badly-named discipline in computer science!

On the subject of premature convergence of the population to a local maximum: this issue should be isolated and described all by itself in a separate section. As it stands now, this subject is hinted at in various parts of the entire article. The problems of early convergence and exploitation-vs-exploration do not need to be mentioned halfway through a Wikipedia article. They can be isolated and talked about in depth. I think this article needs a section titled Exploitation Versus Exploration. And within this section should be all the various descriptions of mutation rates and crossover rates and other related material. It is not a sufficient explanation to say "depending on what the fitness landscape is like blah blah blah". How does suddenly referring to a fitness landscape make this subject more clear to a layman reading the article? Think about it.

Where is the Schema Theorem? I'm sort of shocked that someone could write an entire encyclopedia article on Genetic Algorithms and never even mention the schema theorem. The schema theorem is basically the mathematical justification for genetic algorithms. Someone might come away from this article thinking that Genetic Algorithms are willy-nilly theory. There are two ways to go about this depending on style. One way is to make a section like ---Mathematical Justification--- or perhaps something like ---Schema Theorem---. On a personal note, I am taking offense at this article's claim that mutation is the only thing driving a GA, and that crossover is just a "fancy form of mutation". This is not a debating forum, this is an encyclopedia, therefore we are obligated to write only things we know for sure. paros (talk) 05:01, 6 June 2008 (UTC)
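For what it's worth, the inequality usually quoted as Holland's schema theorem (as it is commonly stated; notation varies between sources) is:

```latex
\mathbb{E}\!\left[m(H,\,t+1)\right] \;\ge\;
m(H,\,t)\,\frac{f(H)}{\bar{f}}
\left[\,1 \;-\; p_c\,\frac{\delta(H)}{l-1} \;-\; o(H)\,p_m\right]
```

Here m(H, t) is the number of individuals matching schema H at generation t, f(H) is the mean fitness of those individuals, f̄ is the mean fitness of the whole population, δ(H) is the defining length of H, o(H) is its order (number of fixed positions), l is the string length, and p_c and p_m are the crossover and mutation probabilities. Short, above-average-fitness, low-order schemata are expected to receive exponentially increasing numbers of trials, which is the usual argument for crossover doing more than "fancy mutation".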

Do you mean Holland's schema theorem? I myself am shocked to observe that the Russian work on the renormalization group, which "explains" and gives mathematical credence to this whole subject, is not mentioned. Perhaps it's too mathematically abstract, and hard to understand? As to poor article structure, this is typical of many multi-author Wikipedia articles. It takes real work to straighten all of it out. linas (talk) 14:42, 6 June 2008 (UTC)