Sigmoid function

From Wikipedia, the free encyclopedia

A sigmoid function is a mathematical function that produces a sigmoid curve, that is, a curve having an "S" shape. Often, "sigmoid function" refers to the special case of the logistic function shown at right and defined by the formula

P(t) = \frac{1}{1 + e^{-t}}
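
As an illustration, the formula above can be evaluated directly in a few lines of Python. This is a minimal sketch rather than a reference implementation; the function name `logistic` and the branch on the sign of t (used to avoid floating-point overflow for large negative arguments) are choices made for this example.

```python
import math


def logistic(t: float) -> float:
    """Logistic sigmoid P(t) = 1 / (1 + e^(-t))."""
    if t >= 0:
        # e^(-t) is at most 1 here, so the direct formula is safe.
        return 1.0 / (1.0 + math.exp(-t))
    # For negative t, e^(-t) could overflow, so use the algebraically
    # equivalent form e^t / (1 + e^t) instead.
    z = math.exp(t)
    return z / (1.0 + z)


if __name__ == "__main__":
    for t in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(t, logistic(t))
```

The printed values rise monotonically from near 0 to near 1 and pass through 0.5 at t = 0, which is the "S" shape described above.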

Members of the sigmoid family

In general, a sigmoid function is real-valued, differentiable, and monotonic: its first derivative is either non-negative everywhere or non-positive everywhere. It has exactly one inflection point and approaches a horizontal asymptote in each direction.

Besides the logistic function, sigmoid functions include the ordinary arc-tangent, the hyperbolic tangent, and the error function. The integral of any smooth, positive, "bump-shaped" function is sigmoidal, so the cumulative distribution functions of many common probability distributions are sigmoidal.
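
The claim about integrals of bump-shaped functions can be checked numerically. The sketch below takes the standard Gaussian density as the bump (an assumption made for this example, as are all function names), integrates it with the trapezoidal rule, and compares the result with the normal cumulative distribution function expressed through the error function.

```python
import math


def gaussian_pdf(x: float) -> float:
    """Standard normal density, used here as the 'bump-shaped' function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate_bump(x: float, lower: float = -8.0, steps: int = 4000) -> float:
    """Trapezoidal approximation of the integral of gaussian_pdf from lower to x.

    The lower limit -8 stands in for minus infinity; the density is
    negligible below that point.
    """
    h = (x - lower) / steps
    total = 0.5 * (gaussian_pdf(lower) + gaussian_pdf(x))
    for i in range(1, steps):
        total += gaussian_pdf(lower + i * h)
    return total * h


if __name__ == "__main__":
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(x, integrate_bump(x), normal_cdf(x))
```

The two columns agree closely, and both increase from near 0 to near 1 in an S-shaped curve.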

The logistic sigmoid function is related to the hyperbolic tangent, for example by the identity

1-2\frac{1}{1+e^{-x}} = - \tanh\frac{x}{2}
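
A quick numerical check of this identity is sketched below. The helper name `identity_gap` is introduced only for this example; it returns the difference between the two sides, which should be zero up to floating-point rounding.

```python
import math


def logistic(x: float) -> float:
    """Logistic function 1 / (1 + e^(-x)); adequate for moderate x."""
    return 1.0 / (1.0 + math.exp(-x))


def identity_gap(x: float) -> float:
    """Left-hand side minus right-hand side of 1 - 2/(1 + e^(-x)) = -tanh(x/2)."""
    lhs = 1.0 - 2.0 * logistic(x)
    rhs = -math.tanh(x / 2.0)
    return lhs - rhs


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
        print(x, identity_gap(x))
```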

Sigmoid functions in neural networks

Sigmoid functions are often used in neural networks to introduce nonlinearity in the model and/or to make sure that certain signals remain within a specified range. A popular neural net element computes a linear combination of its input signals, and applies a bounded sigmoid function to the result; this model can be seen as a "smoothed" variant of the classical threshold neuron.
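
The neural net element described above might be sketched as follows. The names `neuron` and `threshold_neuron`, and the additive `bias` term, are assumptions made for this illustration and are not taken from any particular library; the logistic function is used as the bounded sigmoid.

```python
import math


def logistic(t: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def linear_combination(inputs, weights, bias=0.0):
    """Weighted sum of the input signals plus an (assumed) bias term."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def neuron(inputs, weights, bias=0.0):
    """Sigmoid neuron: output varies smoothly within (0, 1)."""
    return logistic(linear_combination(inputs, weights, bias))


def threshold_neuron(inputs, weights, bias=0.0):
    """Classical threshold neuron: output jumps from 0 to 1."""
    return 1.0 if linear_combination(inputs, weights, bias) >= 0.0 else 0.0


if __name__ == "__main__":
    weights = [0.5, -0.25]
    for inputs in ([0.0, 4.0], [1.0, 2.0], [4.0, 0.0]):
        print(inputs, neuron(inputs, weights), threshold_neuron(inputs, weights))
```

Both elements compute the same weighted sum; the sigmoid version replaces the hard step with a smooth transition, which keeps the output bounded while remaining differentiable.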

One reason for its popularity in neural networks is that the logistic sigmoid function, written sig(t) here, satisfies

\frac{d}{dt}{\rm sig}(t) = {\rm sig}(t) - {\rm sig}^2(t) = {\rm sig}(t) \left ( 1 - {\rm sig}(t) \right ).

The right-hand side is a low-order polynomial in sig(t). Furthermore, the polynomial has factors sig(t) and 1 − sig(t), both of which are simple to compute. Given sig(t) at a particular t, the derivative of the sigmoid function at that t can be obtained by multiplying the two factors together, without evaluating another exponential. This simplifies implementations of artificial neural networks, which need the derivative of each neuron's output during training.
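
The following sketch shows how the derivative can be recovered from an already computed output value and compares it with a central finite difference. The function names and the step size `h` are assumptions chosen for this example.

```python
import math


def sig(t: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def sig_derivative_from_output(s: float) -> float:
    """Derivative of sig at t, given only s = sig(t): s * (1 - s)."""
    return s * (1.0 - s)


def finite_difference(t: float, h: float = 1e-5) -> float:
    """Central-difference estimate of the derivative, for comparison only."""
    return (sig(t + h) - sig(t - h)) / (2.0 * h)


if __name__ == "__main__":
    for t in (-2.0, 0.0, 0.7, 3.0):
        s = sig(t)  # computed once, then reused for the derivative
        print(t, sig_derivative_from_output(s), finite_difference(t))
```

The derivative costs one multiplication and one subtraction once sig(t) is known, whereas the finite difference needs two further evaluations of the function.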

Double sigmoid function

Double sigmoid curve

The double sigmoid is a function similar to the sigmoid function with numerous applications. Its general formula is:

y = \mbox{sign}(x-d) \, \Bigg(1-\exp\bigg(-\bigg(\frac{x-d}{s}\bigg)^2\bigg)\Bigg),

where d is its centre and s is the steepness factor (a larger s gives a more gradual transition).

It is based on the Gaussian curve, and graphically it resembles two identical sigmoids joined together at the point x = d.

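One of its applications is non-linear normalization of a sample: because the function saturates at −1 and +1, values far from the centre d are compressed into a bounded range, which suppresses the influence of outliers.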
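
A minimal sketch of the double sigmoid and its use for normalization is given below. The function name, the default parameter values, and the sample data are hypothetical and chosen only for illustration; s is assumed to be positive.

```python
import math


def double_sigmoid(x: float, d: float = 0.0, s: float = 1.0) -> float:
    """y = sign(x - d) * (1 - exp(-((x - d) / s)^2)), assuming s > 0."""
    u = (x - d) / s
    # -expm1(-u*u) equals 1 - exp(-u*u) but is more accurate near u = 0.
    magnitude = -math.expm1(-u * u)
    # copysign applies the sign of (x - d); at x = d the magnitude is 0 anyway.
    return math.copysign(magnitude, x - d)


if __name__ == "__main__":
    # Hypothetical sample with one extreme value; d and s are picked by hand.
    sample = [0.9, 1.0, 1.1, 1.2, 50.0]
    normalized = [double_sigmoid(x, d=1.0, s=1.0) for x in sample]
    for x, y in zip(sample, normalized):
        print(x, y)
```

Every output lies in the interval [−1, 1], so the extreme value 50.0 is mapped to a value at the upper bound instead of dominating the scale of the sample. How d and s should be chosen for real data is not specified here and would depend on the application.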

See also