Self-organizing map

From Wikipedia, the free encyclopedia

The self-organizing map (SOM) is a subtype of artificial neural networks. It is trained using unsupervised learning to produce low dimensional representation of the training samples while preserving the topological properties of the input space. This makes SOM reasonable for visualizing low-dimensional views of high-dimensional data, akin to multidimensional scaling. The model was first described by the Finnish professor Teuvo Kohonen and is thus sometimes referred to as a Kohonen map.

Contents

[edit] The network structure

The self-organizing map is a single layer feedforward network where the output syntaxs are arranged in low dimensional (usually 2D or 3D) grid. Each input is connected to all output neurons. Attached to every neuron there is a weight vector with the same dimensionality as the input vectors. The number of input dimensions is usually a lot higher than the output grid dimension. SOMs are mainly used for dimensionality reduction rather than expansion.

[edit] The learning algorithm

The goal of the learning in the self-organizing map is to associate different parts of the SOM lattice to respond similarly to certain input patterns. This is partly motivated by how visual, auditory or other sensory information is handled in separate parts of the cerebral cortex in the human brain.[1]

The weights of the neurons are initialized either to small random values or sampled evenly from the subspace spanned by the two largest principal component eigenvectors. The latter alternative will speed up the training significantly because the initial weights already give good approximation of SOM weights.[2]

The training utilizes competitive learning. When a training sample is given to the network, its Euclidean distance to all weight vectors is computed. The neuron with weight vector most similar to the input is called the Best Matching Unit (BMU). The weights of the BMU and neurons close to it in the SOM lattice are adjusted towards the input vector. The magnitude of the change decreases with time and is smaller for neurons physically far away from the BMU. The update formula for a neuron with weight vector Wv(t) is

Wv(t + 1) = Wv(t) + Θ(v, t)α(t)(D(t) - Wv(t)),

where α(t) is a monotonically decreasing learning coefficent and D(t) is the input vector. The neighbourhood function Θ(v,t) depends on the lattice distance between the BMU and neuron v. In the simplest form it is one for all neurons close enough to BMU and zero for others, but a gaussian function is a common choice, too. Regardless of the functional form, the neighbourhood function shrinks with time.[1] At the beginning when the neighbourhood is broad, the self-organizing takes place on the global scale. When the neighbourhood has shrunk to just a couple of neurons the weights are converging to local estimates.

This process is repeated for each input vector, over and over, for a (usually large) number of cycles. The network winds up associating output nodes with groups or patterns in the input data set. If these patterns can be named, the names can be attached to the associated nodes in the trained net.

Like most artificial neural networks, the SOM has two modes of operation:

  1. During the training process a map is built, the neural network organises itself, using a competitive process. The network must be given a large number of input vectors, as much as possible representing the kind of vectors that are expected during the second phase (if any). Otherwise, all input vectors must be administered several times...
  2. During the mapping process a new input vector may quickly be given a location on the map, it is automatically classified or categorised. There will be one single winning neuron: the neuron whose weight vector lies closest to the input vector. (This can be simply determined by calculating the Euclidean distance between input vector and weight vector.)

[edit] An example of the algorithm

[edit] Preliminary definitions

The actual training of the net is called vector quantization.

To explain the algorithm in-depth, let us create a 10×10 array of nodes. Each node will contain a weight vector, and will be omniscient of its "physical location", i.e. its location in the array. The weight vector that each node contains will be of the same dimension as the following input vectors. (N.B. Weights in the nodes in the map are initially set to random values.)

Now we need input to feed the map. (Note: the generated map and the given input exist in separate subspaces!) Sticking with the norm, we will create three vectors to represent colors. In the world of computing, colors have the following three components: red, green, and blue. Consequently our input vectors will have three components, each one corresponding to a color space. Our input vectors will now be thus...

R = <255, 0, 0>
G = <0, 255, 0>
B = <0, 0, 255>

[edit] Some variables

Vectors are in bold
t = current iteration
λ = limit on time iteration
Wv = current weight vector
D = target input
Θ(t) = restraint due to distance from BMU - a.k.a the neighbourhood function
α(t) = learning restraint due to time

[edit] Stepping through the algorithm

  1. Randomize the map's nodes' weight vectors
  2. Grab an input vector
  3. Traverse each node in the map
    1. Use Euclidean distance formula to find similarity between the input vector and the map's node's weight vector
    2. Track the node that produces the smallest distance (this node will be called the Best Matching Unit or BMU)
  4. Update the nodes in the neighbourhood of BMU by pulling them closer to the input vector
    1. Wv(t + 1) = Wv(t) + Θ(t)α(t)(D(t) - Wv(t))

[edit] Interpretation

There are two ways to interpret a SOM. Because in the training phase weights of the whole neighborhood are moved in the same direction, similar items tend to excite adjacent neurons. Therefore, SOM forms a semantic map where similar samples are mapped close together and dissimilar apart.

The other way to perceive the neuronal weights is to think them as pointers to the input space. They form a discrete approximation of the distribution of training samples. More neurons point to regions with high training sample concentration and fewer where the samples are scarce.

[edit] References

  1. ^ a b Haykin, Simon (1999). “9. Self-organizing maps”, Neural networks - A comprehensive foundation, 2nd edition, Prentice-Hall. ISBN 0-13-908385-5.
  2. ^ Intro to SOM by Teuvo Kohonen. SOM Toolbox. Retrieved on 2006-06-18.

[edit] See also

[edit] External links