Noisy channel model

From Wikipedia, the free encyclopedia

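The noisy channel model is a framework for problems in which every intended word is assumed to have been accidentally scrambled (corrupted) before it is observed. Given an observed scramble, the goal is to find the most probable intended word, that is, the word maximizing Pr(word|scramble), by modeling the channel Pr(scramble|word). It is used in spelling correction, question answering, speech recognition, and machine translation.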

The model takes the output of the noisy channel (the observed scramble) as input and returns the best guess for the word that was actually intended.

For example, in spelling correction:

1) Observe the noisy channel and train on the user’s behavior to estimate Pr(scramble|word), where a scramble arises from accidentally inserting, deleting, substituting, or transposing a character of the intended (target) word.

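2) For each observed scramble, choose the most probable intended word (the maximum a posteriori estimate) using Bayes' rule. The denominator p(scramble) does not depend on the word, so it drops out of the argmax:

argmax_word p(word|scramble) = argmax_word p(scramble|word) p(word) / p(scramble) = argmax_word p(scramble|word) p(word)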

where p(scramble|word) is estimated from the edit operations found by a minimum edit distance algorithm, and p(word) is estimated from unigram counts.
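
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the unigram counts in WORD_COUNTS are made-up numbers, and the channel model is a simplifying assumption (a word is typed correctly with probability P_CORRECT, and the remaining probability is spread evenly over all scrambles one edit away) rather than a channel trained on observed user errors. All function and variable names are hypothetical.

```python
from collections import Counter

# Toy unigram counts (hypothetical numbers, for illustration only).
WORD_COUNTS = Counter({"the": 500, "then": 60, "than": 40, "they": 80, "ten": 20})
TOTAL = sum(WORD_COUNTS.values())
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Assumed channel parameter: probability that a word is typed without error.
P_CORRECT = 0.95


def single_edits(word):
    """All strings one insertion, deletion, substitution, or transposition away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return (deletes | transposes | substitutes | inserts) - {word}


def p_word(word):
    """Prior p(word) from unigram counts."""
    return WORD_COUNTS[word] / TOTAL


def p_scramble_given_word(scramble, word):
    """Assumed channel: uniform over single-edit scrambles, zero beyond one edit."""
    if scramble == word:
        return P_CORRECT
    edits = single_edits(word)
    if scramble in edits:
        return (1 - P_CORRECT) / len(edits)
    return 0.0


def correct(scramble):
    """Return argmax over known words of p(scramble|word) * p(word)."""
    scores = {w: p_scramble_given_word(scramble, w) * p_word(w) for w in WORD_COUNTS}
    best = max(scores, key=scores.get)
    # If no known word can explain the scramble, leave it unchanged.
    return best if scores[best] > 0 else scramble


print(correct("teh"))
```

Here "teh" is one transposition away from "the" and one substitution away from "ten", so both get a nonzero channel probability, and the much larger unigram count of "the" decides the result. A trained channel would replace the uniform error probability with per-edit probabilities estimated from observed mistakes.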