Cybenko Theorem


Formulated by Cybenko in 1989, the theorem states:

Let \varphi be any continuous sigmoid-type function [e.g., \varphi(\xi) = 1/(1+e^{-\xi})]. Then, given any continuous real-valued function f on [0,1]^n (or any other compact subset of R^n) and any \epsilon > 0, there exist an integer N, vectors \mathbf{w}_1, \mathbf{w}_2, ..., \mathbf{w}_N, \mathbf{\alpha}, and \mathbf{\theta}, and a parameterized function G(\mathbf{\cdot},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}): [0,1]^n \rightarrow R such that

|G(\mathbf{x},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}) - f(\mathbf{x})| < \epsilon for all \mathbf{x} \in [0,1]^n

where

G(\mathbf{x},\mathbf{w},\mathbf{\alpha},\mathbf{\theta}) = \sum_{j=1}^N\alpha_j\varphi(\mathbf{w}_j^T\mathbf{x} + \theta_j)

and \mathbf{w}_j \in R^n, \alpha_j, \theta_j \in R, \mathbf{w} = (\mathbf{w}_1, \mathbf{w}_2, ..., \mathbf{w}_N), \mathbf{\alpha} = (\alpha_1, \alpha_2, ..., \alpha_N), and \mathbf{\theta} = (\theta_1, \theta_2, ..., \theta_N).
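
For concreteness, the sum defining G is a single hidden layer of sigmoid units followed by a weighted sum. The following NumPy sketch evaluates it at one point of [0,1]^n; the function names, array shapes, and example parameter values are illustrative choices, not part of the theorem.

    import numpy as np

    def sigmoid(xi):
        # Logistic sigmoid, one admissible choice of the sigmoid-type function phi.
        return 1.0 / (1.0 + np.exp(-xi))

    def G(x, w, alpha, theta):
        # x     : shape (n,)   -- a point in [0,1]^n
        # w     : shape (N, n) -- rows are the weight vectors w_j
        # alpha : shape (N,)   -- output coefficients alpha_j
        # theta : shape (N,)   -- biases theta_j
        # Returns sum_{j=1}^N alpha_j * sigmoid(w_j^T x + theta_j).
        return float(alpha @ sigmoid(w @ x + theta))

    # Example: N = 3 hidden neurons on [0,1]^2 (n = 2), with arbitrary parameters.
    w = np.array([[2.0, -1.0], [0.5, 3.0], [-4.0, 1.5]])
    alpha = np.array([1.0, -0.5, 2.0])
    theta = np.array([0.0, 1.0, -2.0])
    print(G(np.array([0.3, 0.7]), w, alpha, theta))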

This theorem says that a feedforward neural network with a single hidden layer is capable of approximating any continuous multivariate function to any desired degree of accuracy, and that failure to approximate a given function arises from a poor choice of \mathbf{w}_1, \mathbf{w}_2, ..., \mathbf{w}_N, \mathbf{\alpha}, and \mathbf{\theta} or from an insufficient number of hidden neurons.
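
The theorem only asserts that suitable parameters exist; it does not say how to find them. As a rough illustration of the claim in the univariate case (n = 1), and not Cybenko's argument, the sketch below picks the w_j and \theta_j at random, fits the \alpha_j by least squares on a grid over [0,1], and reports how the worst-case error on that grid shrinks as the number of hidden neurons N grows. The target function, grid size, and random scales are arbitrary choices made for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(xi):
        return 1.0 / (1.0 + np.exp(-xi))

    def f(x):
        # A continuous target function on [0,1].
        return np.sin(2.0 * np.pi * x)

    x = np.linspace(0.0, 1.0, 500)   # dense grid standing in for [0,1]
    fx = f(x)

    for N in (2, 8, 32, 128):        # number of hidden neurons
        # Draw w_j and theta_j at random, then fit the alpha_j by least squares.
        w = rng.normal(scale=20.0, size=N)
        theta = rng.uniform(-20.0, 20.0, size=N)
        Phi = sigmoid(np.outer(x, w) + theta)   # column j is sigmoid(w_j * x + theta_j)
        alpha, *_ = np.linalg.lstsq(Phi, fx, rcond=None)
        G = Phi @ alpha
        print(f"N = {N:4d}   max |G - f| on the grid = {np.max(np.abs(G - fx)):.4f}")

With enough hidden neurons the fitted G tracks f closely on the grid, consistent with the theorem; note that the theorem itself guarantees uniform closeness on all of the compact domain, not merely on a finite grid.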
