Inside–outside algorithm

In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

Inside and outside probabilities

The inside probability \beta_j(p,q) is the total probability of generating words w_p \cdots w_q, given the root nonterminal N^j and a grammar G:[1]

\beta_j(p,q) = P(w_{pq}|N^j_{pq}, G)

The outside probability \alpha_j(p,q) is the total probability of beginning with the start symbol N^1 and generating the nonterminal N^j_{pq} and all the words outside w_p \cdots w_q, given a grammar G:[1]

\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m}|G)
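
By the context-freeness of the grammar, the product of the two quantities is the joint probability of the whole sentence and of N^j covering w_p \cdots w_q, which is the expectation needed during re-estimation:

\alpha_j(p,q)\beta_j(p,q) = P(w_{1m}, N^j_{pq}|G)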

Computing inside probabilities

Base Case:

\beta_j(p,p) = P(w_{p}|N^j, G)

General Case:

Suppose the grammar contains a rule N^j \rightarrow N^r N^s. Then the probability of generating w_p \cdots w_q from a subtree rooted at N^j, with this rule applied at the root, is:


\sum_{k=p}^{q-1} P(N^j \rightarrow N^r N^s)\beta_r(p,k) \beta_s(k+1,q)

The inside probability \beta_j(p,q) is then the sum over all such rules and all split points k:


\beta_j(p,q) = \sum_{N^r,N^s} \sum_{k=p}^{q-1} P(N^j \rightarrow N^r N^s)\beta_r(p,k) \beta_s(k+1,q)
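
The inside probabilities can be computed bottom-up over spans of increasing length, as in the CYK algorithm. The following Python sketch illustrates this for a grammar in Chomsky normal form; the dictionary-based grammar encoding and the function and variable names are illustrative assumptions, not a standard API.

    from collections import defaultdict

    def inside_probabilities(words, lexical_rules, binary_rules):
        """Inside probabilities beta[(j, p, q)] for a PCFG in Chomsky
        normal form, computed bottom-up over spans of increasing length.
        Positions p and q are 1-based and inclusive, as in the text.

        lexical_rules[(j, w)]   = P(N^j -> w)
        binary_rules[(j, r, s)] = P(N^j -> N^r N^s)
        """
        m = len(words)
        beta = defaultdict(float)

        # Base case: beta_j(p, p) = P(N^j -> w_p)
        for p, word in enumerate(words, start=1):
            for (j, w), prob in lexical_rules.items():
                if w == word:
                    beta[(j, p, p)] = prob

        # General case: sum over rules N^j -> N^r N^s and split points k
        for span in range(2, m + 1):
            for p in range(1, m - span + 2):
                q = p + span - 1
                for (j, r, s), prob in binary_rules.items():
                    for k in range(p, q):
                        beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
        return beta

Here beta[(1, 1, m)] ends up holding \beta_1(1,m), the total probability of the sentence under the start symbol N^1.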

Computing outside probabilities

Base Case:


\alpha_j(1,m) = \begin{cases}
1 & \mbox{if } j=1 \\
0 & \mbox{otherwise}
\end{cases}

Here the start symbol is N^1 and m is the length of the sentence.
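
General Case:

A nonterminal N^j covering w_p \cdots w_q occurs either as the left child of a rule N^f \rightarrow N^j N^g, with its sibling N^g generating w_{q+1} \cdots w_e, or as the right child of a rule N^f \rightarrow N^g N^j, with its sibling generating w_e \cdots w_{p-1}. Summing over both configurations gives the recursion:[1]

\alpha_j(p,q) = \sum_{f,g} \sum_{e=q+1}^{m} P(N^f \rightarrow N^j N^g)\alpha_f(p,e) \beta_g(q+1,e) + \sum_{f,g} \sum_{e=1}^{p-1} P(N^f \rightarrow N^g N^j)\alpha_f(e,q) \beta_g(e,p-1)

The outside probabilities can then be filled in top-down over spans of decreasing length, reusing the completed inside table. The sketch below uses the same illustrative Python encoding as above (again, the names are assumptions, not a standard API):

    from collections import defaultdict

    def outside_probabilities(words, binary_rules, beta, start=1):
        """Outside probabilities alpha[(j, p, q)], computed top-down over
        spans of decreasing length from a completed inside table beta."""
        m = len(words)
        alpha = defaultdict(float)
        alpha[(start, 1, m)] = 1.0  # base case: the start symbol N^1

        for span in range(m - 1, 0, -1):
            for p in range(1, m - span + 2):
                q = p + span - 1
                for (f, r, s), prob in binary_rules.items():
                    # N^r is the left child over (p, q); sibling N^s spans (q+1, e)
                    for e in range(q + 1, m + 1):
                        alpha[(r, p, q)] += prob * alpha[(f, p, e)] * beta[(s, q + 1, e)]
                    # N^s is the right child over (p, q); sibling N^r spans (e, p-1)
                    for e in range(1, p):
                        alpha[(s, p, q)] += prob * alpha[(f, e, q)] * beta[(r, e, p - 1)]
        return alpha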

References

  1. Manning, Christopher D.; Schütze, Hinrich (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. pp. 388–402. ISBN 0-262-13360-1.
