Leave-one-out error

For a broader coverage related to this topic, see Cross-validation (statistics).

An algorithm f has leave-one-out cross-validation (CVloo) stability \beta_{CV} with respect to the loss function V if the following holds:

\forall i\in\{1,...,m\},\ \mathbb{P}_S\{|V(f_S,z_i)-V(f_{S^{|i}},z_i)|\leq\beta_{CV}\}\geq 1-\delta_{CV}

An algorithm f has expected-to-leave-one-out error (Eloo_err) stability if for each m there exist a \beta_{EL}^m and a \delta_{EL}^m such that:

\forall i\in\{1,...,m\},\ \mathbb{P}_S\{|I[f_S]-\frac{1}{m}\sum_{i=1}^m V(f_{S^{|i}},z_i)|\leq\beta_{EL}^m\}\geq 1-\delta_{EL}^m, with \beta_{EL}^m and \delta_{EL}^m going to zero for m\rightarrow\infty

Preliminary notations

With X an input space and Y ⊂ ℝ an output space, we consider a training set

S = \{z_1 = (x_1,\ y_1),\ldots,\ z_m = (x_m,\ y_m)\} of size m in Z = X \times Y, drawn i.i.d. from an unknown distribution D. A learning algorithm is a function f from Z^m into F \subset Y^X which maps a training set S onto a function f_S from X to Y. To avoid complex notation, we consider only deterministic algorithms. It is also assumed that the algorithm f is symmetric with respect to S, i.e. it does not depend on the order of the elements in the training set. Furthermore, we assume that all functions are measurable and all sets are countable, which does not limit the interest of the results presented here.

The loss of a hypothesis f with respect to an example z = (x, y) is then defined as V(f,z) = V(f(x),y). The empirical error of f is I_S[f] = \frac{1}{m}\sum_{i=1}^m V(f,z_i).

The true error of f is I[f] = \mathbb{E}_z V(f,z).
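The empirical and leave-one-out errors above can be made concrete with a small sketch. The learner here (a constant predictor equal to the mean of the training outputs) and the squared loss V(f(x), y) = (f(x) − y)² are illustrative assumptions, not taken from the text:

```python
import random

def train(S):
    """A toy learning algorithm f: return f_S, the constant mean predictor."""
    mean_y = sum(y for _, y in S) / len(S)
    return lambda x: mean_y

def V(pred, y):
    """Squared loss V(f(x), y) = (f(x) - y)^2 (an illustrative choice)."""
    return (pred - y) ** 2

def empirical_error(S):
    """I_S[f_S] = (1/m) * sum_i V(f_S(x_i), y_i): train and test on the same set."""
    f = train(S)
    return sum(V(f(x), y) for x, y in S) / len(S)

def loo_error(S):
    """(1/m) * sum_i V(f_{S^{|i}}(x_i), y_i): train on S without z_i, test on z_i."""
    m = len(S)
    total = 0.0
    for i in range(m):
        S_minus_i = S[:i] + S[i + 1:]   # the modified set S^{|i}
        f = train(S_minus_i)
        x_i, y_i = S[i]
        total += V(f(x_i), y_i)
    return total / m

random.seed(0)
S = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(10)]
print(empirical_error(S), loo_error(S))
```

For this mean predictor the leave-one-out residual on z_i is exactly m/(m−1) times the in-sample residual, so the leave-one-out error is slightly larger than the empirical error, as expected for an estimate that never tests on its own training points.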

Given a training set S of size m, we will build, for all i = 1, \ldots, m, modified training sets as follows:

By removing the i-th element: S^{|i} = \{z_1,\ldots,\ z_{i-1},\ z_{i+1},\ldots,\ z_m\}

By replacing the i-th element with a new example z_i': S^i = \{z_1,\ldots,\ z_{i-1},\ z_i',\ z_{i+1},\ldots,\ z_m\}
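Representing S as an ordered list of (x, y) pairs, the two modified sets can be sketched as follows; the replacement example z_prime is a hypothetical new pair, not something defined in the text:

```python
def remove_ith(S, i):
    """S^{|i}: the training set with the i-th example removed (0-indexed)."""
    return S[:i] + S[i + 1:]

def replace_ith(S, i, z_prime):
    """S^{i}: the training set with the i-th example replaced by z_i'."""
    return S[:i] + [z_prime] + S[i + 1:]

S = [(1, 2.0), (2, 4.1), (3, 5.9)]
print(remove_ith(S, 1))             # [(1, 2.0), (3, 5.9)]
print(replace_ith(S, 1, (2, 4.0)))  # [(1, 2.0), (2, 4.0), (3, 5.9)]
```

Both helpers return fresh lists rather than mutating S, matching the notation's view of S^{|i} and S^i as separate training sets.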


This article is issued from Wikipedia, version of Wednesday, January 13, 2016. The text is available under the Creative Commons Attribution-ShareAlike license, but additional terms may apply for the media files.