Sigmoid function

From Wikipedia, the free encyclopedia

A sigmoid function is a mathematical function that produces a sigmoid curve, that is, a curve having an "S" shape. Often, "sigmoid function" refers to the special case of the logistic function, defined by the formula

$P(t) = \frac{1}{1 + e^{-t}}$
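
As a minimal sketch, the formula can be evaluated in Python roughly as follows. The function name `logistic` is an illustrative choice, and the branch on the sign of `t` is an added precaution against overflow in `math.exp`, not part of the definition itself.

```python
import math


def logistic(t: float) -> float:
    """Evaluate P(t) = 1 / (1 + e^(-t))."""
    if t >= 0:
        # exp(-t) is at most 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-t))
    # For negative t, use the algebraically equivalent form e^t / (1 + e^t)
    # so that math.exp never receives a large positive argument.
    z = math.exp(t)
    return z / (1.0 + z)
```

The two branches compute the same quantity; only the arrangement of the arithmetic differs.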

1 Members of the sigmoid family
2 Sigmoid functions in neural networks
3 Double sigmoid function
4 See also

Members of the sigmoid family

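In general, a sigmoid function is real-valued and differentiable, with a first derivative that is either non-negative everywhere or non-positive everywhere, so the function is monotonic. It approaches a pair of horizontal asymptotes as $t \to \pm\infty$, and its first derivative is typically bell-shaped, with a single peak at the curve's inflection point.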

Besides the logistic function, sigmoid functions include the ordinary arctangent, the hyperbolic tangent, and the error function. The integral of any smooth, positive, "bump-shaped" function is sigmoidal; consequently, the cumulative distribution functions of many common probability distributions are sigmoidal.
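
To illustrate the link between bump-shaped functions and sigmoids, the following sketch numerically integrates a Gaussian bump and compares the result with the standard normal cumulative distribution function written in terms of the error function. The helper names, the lower integration limit of -8, and the step count are assumptions chosen for illustration only.

```python
import math


def gaussian_bump(x: float) -> float:
    """Standard normal density: a smooth, positive, bump-shaped function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate_bump(upper: float, lower: float = -8.0, steps: int = 4000) -> float:
    """Trapezoidal approximation of the integral of gaussian_bump on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (gaussian_bump(lower) + gaussian_bump(upper))
    for i in range(1, steps):
        total += gaussian_bump(lower + i * h)
    return total * h


def normal_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# The running integral rises monotonically from near 0 toward 1, tracing an "S" shape.
samples = [integrate_bump(x) for x in (-3.0, -1.0, 0.0, 1.0, 3.0)]
```

The area below the assumed lower limit is negligible at this precision, so the truncated integral tracks the closed-form curve closely.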

The logistic sigmoid function is related to the hyperbolic tangent, for example by the identity

$1-2\frac{1}{1+e^{-x}} = - \tanh\frac{x}{2}$
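
The identity can be spot-checked numerically. The sketch below evaluates both sides at a few arbitrarily chosen sample points; the function names are hypothetical, and a small gap at these points is a sanity check rather than a proof.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x)); adequate for moderate |x|."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_identity_gap(x: float) -> float:
    """Absolute difference between 1 - 2*logistic(x) and -tanh(x/2)."""
    lhs = 1.0 - 2.0 * logistic(x)
    rhs = -math.tanh(x / 2.0)
    return abs(lhs - rhs)


# Largest discrepancy over the sample points; expected to be at rounding-error level.
max_gap = max(tanh_identity_gap(x) for x in (-5.0, -1.0, 0.0, 0.5, 3.0))
```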

Sigmoid functions in neural networks

Sigmoid functions are often used in neural networks to introduce nonlinearity in the model and/or to make sure that certain signals remain within a specified range. A popular neural net element computes a linear combination of its input signals, and applies a bounded sigmoid function to the result; this model can be seen as a "smoothed" variant of the classical threshold neuron.
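
A neural net element of this kind might be sketched as below. The names `neuron_output` and `threshold_output` are hypothetical, and the `bias` term is an added assumption: the description above mentions only a linear combination of the inputs.

```python
import math
from typing import Sequence


def logistic(t: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-t)); adequate for moderate |t|."""
    return 1.0 / (1.0 + math.exp(-t))


def weighted_sum(inputs: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Linear combination of the input signals plus an (assumed) bias term."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def neuron_output(inputs: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Sigmoid neuron: bounded, smooth response strictly between 0 and 1."""
    return logistic(weighted_sum(inputs, weights, bias))


def threshold_output(inputs: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Classical threshold neuron: hard 0/1 decision on the same weighted sum."""
    return 1.0 if weighted_sum(inputs, weights, bias) >= 0.0 else 0.0
```

Scaling all weights by a large factor steepens the sigmoid neuron's response, which is one way to see it as a smoothed version of the threshold neuron.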

One reason for the popularity of the logistic sigmoid in neural networks is that it satisfies

$\frac{d}{dt}{\rm sig}(t) = {\rm sig}(t) - {\rm sig}^2(t) = {\rm sig}(t) \left ( 1 - {\rm sig}(t) \right ).$

The right-hand side is a low-order polynomial in ${\rm sig}(t)$. Furthermore, the polynomial has the factors ${\rm sig}(t)$ and $1 - {\rm sig}(t)$, both of which are simple to compute. Given ${\rm sig}(t)$ at a particular $t$, the derivative of the sigmoid function at that $t$ can be obtained by multiplying the two factors together, with no further evaluation of the exponential. These relationships simplify implementations of artificial neural networks.
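
The following sketch shows how the derivative can be recovered from an already-computed sigmoid value, and compares it with a central finite difference. The function names and the step size `h` are illustrative assumptions, not part of any particular neural network library.

```python
import math
from typing import Callable


def logistic(t: float) -> float:
    """Logistic sigmoid sig(t) = 1 / (1 + e^(-t)); adequate for moderate |t|."""
    return 1.0 / (1.0 + math.exp(-t))


def derivative_from_value(s: float) -> float:
    """Given s = sig(t), return sig'(t) = s * (1 - s) without calling exp again."""
    return s * (1.0 - s)


def central_difference(f: Callable[[float], float], t: float, h: float = 1e-5) -> float:
    """Numerical derivative of f at t, used here only as an independent check."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


t = 1.5
s = logistic(t)                          # value typically computed in a forward pass
analytic = derivative_from_value(s)      # reuses s; one subtraction and one multiply
numeric = central_difference(logistic, t)
```

In a training loop, the sigmoid value is usually already available from computing the neuron's output, so the derivative needs only this one extra multiplication.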