Cybenko theorem


The Cybenko theorem, proved by George Cybenko in 1989, states that a feedforward neural network with a single hidden layer can approximate any continuous multivariate function on a compact domain to any desired degree of accuracy. Consequently, failure to approximate a given function arises only from poor choices of the parameters \mathbf{w}_1, \mathbf{w}_2, \dots , \mathbf{w}_N, \mathbf{\alpha}, and \mathbf{\theta}, or from an insufficient number of hidden neurons.

Formal statement

Let \varphi be any continuous sigmoid-type function, e.g., \varphi(\xi) = 1/(1+e^{-\xi}). Then, given any continuous real-valued function f on [0,1]^n (or any other compact subset of R^n) and any \epsilon > 0, there exist vectors \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_N, \mathbf{\alpha}, and \mathbf{\theta}, and a parameterized function G(\mathbf{\cdot},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}): [0,1]^n \rightarrow R such that

|G(\mathbf{x},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}) - f(\mathbf{x})| < \epsilon for all \mathbf{x} \in [0,1]^n

where

G(\mathbf{x},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}) = \sum_{i=1}^N\alpha_i\varphi(\mathbf{w}_i^T\mathbf{x} + \theta_i)

and \mathbf{w}_i \in R^n, \alpha_i, \theta_i \in R, \mathbf{w} = (\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_N), \mathbf{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_N), and \mathbf{\theta} = (\theta_1, \theta_2, \dots, \theta_N).
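
The sum G above is exactly a feedforward network with one hidden layer of N sigmoidal units and a linear output. The following NumPy sketch illustrates such a sum approximating a continuous function on [0,1]; the target f(x) = sin(2πx), the width N = 30, and the plain gradient-descent fit are illustrative assumptions, not part of Cybenko's construction, which is non-constructive.

  import numpy as np

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  # Illustrative target function f: [0, 1] -> R
  f = lambda x: np.sin(2 * np.pi * x)

  rng = np.random.default_rng(0)
  N = 30                            # number of hidden neurons
  w = rng.normal(size=N)            # hidden weights w_i (scalars, since n = 1 here)
  theta = rng.normal(size=N)        # hidden biases theta_i
  alpha = rng.normal(size=N) * 0.1  # output weights alpha_i

  x = np.linspace(0.0, 1.0, 200)    # sample points in [0, 1]
  y = f(x)

  lr = 0.1
  for step in range(20000):
      h = sigmoid(np.outer(x, w) + theta)  # shape (200, N): phi(w_i x + theta_i)
      G = h @ alpha                        # G(x) = sum_i alpha_i phi(w_i x + theta_i)
      err = G - y
      # Gradients of the mean squared error with respect to alpha, w, theta
      grad_alpha = h.T @ err / len(x)
      dphi = h * (1 - h) * alpha           # chain rule through the sigmoid
      grad_w = (dphi * err[:, None] * x[:, None]).mean(axis=0)
      grad_theta = (dphi * err[:, None]).mean(axis=0)
      alpha -= lr * grad_alpha
      w -= lr * grad_w
      theta -= lr * grad_theta

  G = sigmoid(np.outer(x, w) + theta) @ alpha
  print("max |G(x) - f(x)| on the grid:", np.abs(G - y).max())

The theorem guarantees that, for any tolerance ε, some choice of N and of the parameters makes the printed maximum error smaller than ε; the sketch merely searches for such parameters numerically.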

References

  • Cybenko, G. (1989). "Approximation by Superpositions of a Sigmoidal Function". Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314.
  • Hassoun, M. (1995). Fundamentals of Artificial Neural Networks. MIT Press, p. 48.