Gaussian adaptation

From Wikipedia, the free encyclopedia

Gaussian adaptation is an algorithm designed to maximize manufacturing yield when the component values of signal-processing systems are subject to statistical deviation. The process relies on the theorem of Gaussian adaptation, which states that if the centre of gravity of a high-dimensional Gaussian distribution coincides with the centre of gravity of the part of the distribution belonging to some region of acceptability, in a state of selective equilibrium, then the probability of hitting the region is maximal.

It has also been proved that the same condition of optimality maximizes the disorder of the Gaussian while keeping the yield constant. Altogether, this means that Gaussian adaptation carries out a simultaneous maximization of yield and disorder. Proofs of the theorems related to Gaussian adaptation may be found in the references.[1]

The theorem is valid for all regions of acceptability and all Gaussian distributions. It may be exploited by cyclic repetition of random variation and selection, as in natural evolution. In every cycle a sufficiently large number of Gaussian-distributed points is sampled and tested for membership in the region of acceptability. The centre of gravity of the Gaussian is then moved to the centre of gravity of the approved (selected) points. Thus, the process converges to a state of equilibrium fulfilling the theorem. A solution is always approximate, because the centre of gravity is determined from a limited number of points.
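A minimal sketch of one such variation-and-selection cycle is given below, assuming NumPy and an invented spherical region of acceptability; the function and variable names are illustrative, and only the centre of gravity is adapted here while the moment matrix is kept fixed.

    import numpy as np

    def gaussian_adaptation(is_acceptable, m, C, n_samples=1000, n_cycles=50, seed=0):
        # Cyclic repetition of random variation and selection: sample Gaussian
        # points, keep the accepted ones, and move the centre of gravity m to
        # the centre of gravity of the accepted points.
        rng = np.random.default_rng(seed)
        for _ in range(n_cycles):
            points = rng.multivariate_normal(m, C, size=n_samples)
            mask = np.array([is_acceptable(p) for p in points])
            if not mask.any():
                break                          # nothing accepted in this cycle
            m = points[mask].mean(axis=0)      # centre of gravity of approved points
        return m

    # Example: a spherical region of acceptability centred at (1, 2).
    inside = lambda p: np.linalg.norm(p - np.array([1.0, 2.0])) < 1.0
    m_star = gaussian_adaptation(inside, m=np.zeros(2), C=np.eye(2))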

It was used for the first time in 1969 as a pure optimization algorithm making the regions of acceptability smaller and smaller (in analogy to simulated annealing). Since 1970 it has been used for both ordinary optimization and yield maximization.



Natural evolution and Gaussian adaptation

Gaussian adaptation has also been compared to the natural evolution of populations of living organisms. In this case the region of acceptability is replaced by a probability function, s(x), i.e. the probability that an individual with an array x of phenotypes will survive and give offspring to the next generation. This is possible because the theorem of Gaussian adaptation is valid for any region of acceptability. The yield is replaced by the mean fitness, determined as a mean over the set of individuals in a large population:

P(m) = \int s(x) N(m - x) dx

where N is the p.d.f. of phenotypes in the population. In the case where N is Gaussian, it is fairly easily proved that the mean fitness of a large population may be maximized by the process.
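In practice the integral can be estimated by Monte Carlo: draw phenotypes from the Gaussian centred at m and average the survival probability over them. A small sketch is given below, assuming NumPy and an invented survival function s.

    import numpy as np

    def mean_fitness(s, m, C, n_samples=100_000, seed=0):
        # Estimate P(m) = integral of s(x) N(m - x) dx by sampling x ~ N(m, C)
        # and averaging s(x) over the samples.
        rng = np.random.default_rng(seed)
        samples = rng.multivariate_normal(m, C, size=n_samples)
        return float(np.mean([s(x) for x in samples]))

    # Invented survival probability, falling off with distance from an optimum at the origin.
    s = lambda x: float(np.exp(-0.5 * np.dot(x, x)))
    P = mean_fitness(s, m=np.array([0.5, 0.5]), C=0.1 * np.eye(2))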

This is in contrast to Fisher's fundamental theorem of natural selection, in which the mean is determined over the set of genes, with the gene as the unit of selection. That view is somewhat dubious, because selection takes place at the level of individuals, which in turn governs the enrichment of genes. Fisher's theorem states that "The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time."

This is in contradiction to the theorem of Gaussian adaptation, which states that the mean fitness and phenotypic disorder (and variance, in accordance with the second law of thermodynamics) may be simultaneously maximal.


As long as the ontogenetic program may be seen as a stepwise modified recapitulation of the evolution of a particular individual organism, the central limit theorem states that the sum of contributions from many random steps tends to become Gaussian distributed.

A necessary condition for natural evolution to fulfill the theorem of Gaussian adaptation is that it can push the centre of gravity of the Gaussian to the centre of gravity of the selected individuals. This may be accomplished by the Hardy-Weinberg law.

In this case the rules of genetic variation, such as crossover, inversion, transposition and so on, may be seen as random number generators for the phenotypes. In this sense Gaussian adaptation may be seen as a genetic algorithm.


How to climb a mountain

Mean fitness may be calculated provided that the distribution of parameters and the structure of the landscape are known. The real landscape is not known, but the figure below shows a fictitious profile (blue) of a landscape along a line (x) in a space spanned by such parameters. The red curve is the mean based on the red bell curve at the bottom of the figure; it is obtained by letting the bell curve slide along the x-axis and calculating the mean at every location. As can be seen, small peaks and pits are smoothed out. Thus, if evolution is started at A with a relatively small variance (the red bell curve), then climbing will take place on the red curve. The process may get stuck for millions of years at B or C, as long as the hollows to the right of these points remain and the mutation rate is too small.

Image:Fraktal.gif

If the mutation rate is sufficiently high, the disorder or variance may increase and the parameter(s) may become distributed like the green bell curve. The climbing will then take place on the green curve, which is smoothed out even further. Because the hollows to the right of B and C have now disappeared, the process may continue up to the peaks at D. But of course the landscape puts a limit on the disorder or variability. Besides, depending on the landscape, the process may become very jerky, and if the ratio between the time spent by the process at a local peak and the transition time to the next peak is very high, it may well look like a punctuated equilibrium, as suggested by Gould (see Ridley).
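This smoothing can be reproduced numerically: the mean-fitness curve is the landscape averaged under a bell curve that slides along the axis, and a wider bell gives a smoother curve. The sketch below uses an invented one-dimensional landscape and assumes NumPy.

    import numpy as np

    def smoothed_profile(fitness, x, sigma):
        # Slide a Gaussian bell of width sigma along the x-axis and average the
        # landscape under it (a discrete version of the mean-fitness integral).
        smoothed = np.empty_like(fitness)
        for i, xi in enumerate(x):
            w = np.exp(-0.5 * ((x - xi) / sigma) ** 2)
            smoothed[i] = np.sum(w * fitness) / np.sum(w)
        return smoothed

    x = np.linspace(0.0, 10.0, 1000)
    landscape = np.sin(3 * x) + 0.3 * x            # invented rugged profile (the "blue" curve)
    red = smoothed_profile(landscape, x, 0.2)      # small variance: small pits remain
    green = smoothed_profile(landscape, x, 1.0)    # larger variance: pits are smoothed away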

Computer simulation of Gaussian adaptation

Thus far the theory only considers mean values of continuous distributions, corresponding to an infinite number of individuals. In reality, however, the number of individuals is always limited, which gives rise to uncertainty in the estimation of m and M (the moment matrix of the Gaussian). This may also affect the efficiency of the process. Unfortunately very little is known about this, at least theoretically.

The implementation of normal (Gaussian) adaptation on a computer is a fairly simple task. The adaptation of m may be done with one sample (individual) at a time, for example

    m(i+1) = (1 – a) m(i) + ax

where x is a passed (accepted) sample and a < 1 is a suitable constant, chosen so that 1/a represents the number of individuals in the population.

M may in principle be updated after every step y leading to a feasible point x = m + y, according to:

    M(i+1) = (1 – 2b) M(i) + 2b yyT,

where yT is the transpose of y and b << 1 is another suitable constant. In order to guarantee a suitable increase of mean information, y should be normally distributed with moment matrix μ²M, where the scalar μ > 1 is used to increase the mean information (disorder) at a suitable rate. But M will never be used in the calculations. Instead we use the matrix W, defined by WWT = M.

Thus, we have y = Wg, where g is normally distributed with the moment matrix μU, and U is the unit matrix. W and WT may be updated by the formulas

    W = (1 – b)W + bygT   and   WT = (1 – b)WT + bgyT

because multiplication gives

    M = (1 – 2b)M + 2byyT,

where terms including b² have been neglected. Thus, M will be indirectly adapted with good approximation. In practice it will suffice to update W only:

    W(i+1)  = (1 – b)W(i)  + bygT.

This is the formula used in a simple 2-dimensional model of a brain satisfying the Hebbian rule of associative learning; see the next section.
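A compact sketch of one such incremental step is given below, assuming NumPy; the function name, the constants and the externally supplied feasibility test are illustrative, not the original simulation code.

    import numpy as np

    def gaussian_adaptation_step(is_acceptable, m, W, a=0.1, b=0.01, mu=1.05, rng=None):
        # One incremental step: draw y = Wg with g ~ N(0, mu^2 U), test x = m + y,
        # and on acceptance update m and W so that M = WWT approximately follows
        # M <- (1 - 2b)M + 2b yyT (terms of order b^2 neglected).
        if rng is None:
            rng = np.random.default_rng()
        g = mu * rng.standard_normal(len(m))      # mu > 1 increases the disorder at a suitable rate
        y = W @ g
        x = m + y
        if is_acceptable(x):                      # only accepted (feasible) points drive the adaptation
            m = (1 - a) * m + a * x               # m(i+1) = (1 - a) m(i) + a x
            W = (1 - b) * W + b * np.outer(y, g)  # W(i+1) = (1 - b) W(i) + b ygT
        return m, W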

The figure below illustrates the effect of increased mean information in a Gaussian p.d.f. used to climb a mountain crest (the two lines represent the contour lines). Both the red and the green cluster have equal mean fitness, about 65%, but the green cluster has a much higher mean information, making the green process much more efficient.

Image:Mountain_crest.GIF

The evolution in the brain

In the brain the evolution of DNA messages is supposed to be replaced by an evolution of signal patterns, and the phenotypic landscape is replaced by a mental landscape, the complexity of which is hardly second to that of the former. The metaphor of the mental landscape is based on the assumption that certain signal patterns give rise to better well-being or performance. For instance, the control of a group of muscles may lead to a better pronunciation of a word or performance of a piece of music.

In this simple model it is assumed that the brain consists of interconnected components that may add, multiply and delay signal values. A nerve cell kernel may add signal values, a synapse may multiply by a constant, and an axon may delay values.

In the figure below the brain stem is supposed to deliver Gaussian distributed signals. The triangular boxes represent synapses and the boxes with the + sign are cell kernels.

In the cortex signals are tested for feasibility. When a signal is accepted, the contact areas in the synapses are updated according to the formulas below, in agreement with the Hebbian theory. The figure shows a 2-dimensional computer simulation according to the last formula in the preceding section.

Image:brain2.GIF

m and W are updated according to:

m1 = 0.9 m1 + 0.1 x1; m2 = 0.9 m2 + 0.1 x2;

w11 = 0.9 w11 + 0.1 y1g1; w12 = 0.9 w12 + 0.1 y1g2;

w21 = 0.9 w21 + 0.1 y2g1; w22 = 0.9 w22 + 0.1 y2g2;


As can be seen this is very much like a small brain ruled by the theory of Hebbian learning.
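For a rough idea of how such a simulation can be driven, the scalar updates above correspond to the illustrative gaussian_adaptation_step sketched in the computer-simulation section, run in two dimensions with a = b = 0.1 and a stand-in for the feasibility test performed by the cortex.

    import numpy as np

    rng = np.random.default_rng(0)
    m, W = np.zeros(2), np.eye(2)
    feasible = lambda x: np.linalg.norm(x - np.array([1.0, 1.0])) < 0.8   # stand-in for the cortex test
    for _ in range(10_000):
        m, W = gaussian_adaptation_step(feasible, m, W, a=0.1, b=0.1, mu=1.05, rng=rng)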

See references from 1999 and 2002 below.

Gaussian adaptation and free will

Gaussian adaptation, as an evolutionary model of the brain obeying the Hebbian theory of associative learning, offers an alternative view of free will, owing to the ability of the process to maximize the mean fitness of signal patterns in the brain by climbing a mental landscape, in analogy with phenotypic evolution.

Such a random process gives us lots of freedom of choice, but hardly any will. An illusion of will may, however, emanate from the ability of the process to maximize mean fitness, making the process goal-seeking: it prefers higher peaks in the landscape to lower ones, and better alternatives to worse. In this way an illusory will may appear. See Kjellström 1999.

A theorem of efficiency for random search

The efficiency of Gaussian adaptation relies on the theory of information due to Claude E. Shannon (see mean information). When an event occurs with probability P, the information -log(P) is gained. For instance, if the mean fitness is P, the information gained for each individual selected for survival will be -log(P) on average, and the work/time needed to get that information is proportional to 1/P. Thus, if the efficiency E is defined as information divided by the work/time needed to get it, we have:

E = - P log(P).
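Where the maximum lies follows from a one-line calculation (the natural logarithm is assumed throughout):

    dE/dP = –(log P + 1) = 0,   which gives   P = e^(–1) ≈ 0.37.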

This function attains its maximum when P = 1/e ≈ 0.37. E = 0 if P = 0, for a process with infinite mutation rate, and if P = 1, for a process with mutation rate equal to zero (provided that the process is alive). This measure of efficiency is valid for a large class of random search processes, provided that certain conditions are fulfilled:

1. The search should be statistically independent in the different parameters. This condition may be approximately fulfilled when the moment matrix of the Gaussian has been adapted to some region of acceptability, because linear transformations of the whole process do not affect the efficiency.

2. All individuals have equal cost, and the derivative at P = 1 is < 0.

Then, the following theorem (in its simplest form) may be proved:

All measures of efficiency that satisfy the conditions above are asymptotically proportional to –P log(P) when the number of degrees of freedom increases.

A point of view that may also be of interest in this context is that no definition of information (other than that sampled points inside some region of acceptability give information about the extension of the region) is needed for the proof of the theorem. Then, because the formula may be interpreted as information divided by the work needed to get it, this is also an indication that -log(P) is a good candidate for being a measure of information.

See also

Entropy in thermodynamics and information theory

Fisher's fundamental theorem of natural selection

Free will

Genetic algorithms

Hebbian learning

Mean information

Simulated annealing

Stochastic optimization

Unit of selection

References

  • Kjellström, G. Network Optimization by Random Variation of component values. Ericsson Technics, vol. 25, no. 3, pp. 133-151, 1969.
  • Kjellström, G. Optimization of electrical Networks with respect to Tolerance Costs. Ericsson Technics, no. 3, pp. 157-175, 1970.
  • Kjellström, G. & Taxén, L. Stochastic Optimization in System Design. IEEE Trans. on Circ. and Syst., vol. CAS-28, no. 7, July 1981.
  • Kjellström, G. On the Efficiency of Gaussian Adaptation. Journal of Optimization Theory and Applications, vol. 71, no. 3, Dec. 1991.
  • Kjellström, G. & Taxén, L. Gaussian Adaptation, an evolution-based efficient global optimizer. In C. Brezinski & U. Kulish (eds.), Computational and Applied Mathematics, Elsevier Science Publishers B.V., pp. 267-276, 1992.
  • Kjellström, G. Evolution as a statistical optimization algorithm. Evolutionary Theory 11:105-117 (January, 1996).
  • Kjellström, G. The evolution in the brain. Applied Mathematics and Computation, 98(2-3):293-300, February, 1999.
  • Kjellström, G. Evolution in a nutshell and some consequences concerning valuations. EVOLVE, ISBN 91-972936-1-X, Stockholm, 2002.
  • Ridley, M. Evolution. Blackwell Science, 1996.