Talk:Viterbi algorithm

From Wikipedia, the free encyclopedia

Contents

[edit] Mathematical descritption

This page should have a description of the algorithm using mathematical formulas, as well. Already I have found two different descriptions of Viterbi and would like Wikipedia to clear things out.

Also Python may be a nice language, but using variable names like "p" or "v_prob" doesn't really explain the algorithm!

Strongly agree! I entered this page to find you have stated what I wanted to say.--Puekai 09:29, 6 March 2007 (UTC)

[edit] Yes I can describe the algorithm in simplified terms

I'm haveing a hard time to figure out where to put the description. There is nothing here that is not correct but it is a little confusing because of the usage of "hidden" and "events" without explaining. The algorithm itself is quite simple but I have a little problem with how it is described. I don't want to exorcise a lot of basically correct text, only to clarify in a more simple manner. I was thinking of adding a section just under the table of contents as an introduction, but being a newbie, I would need some lessons on how to do this.

I can also describe Trellis modulation. --User:jlpayton 9-sept-2005

[edit] .

Can someone show the algorithm here? Kieff | Talk 07:33, Oct 15, 2004 (UTC)

I'll post an example illustrating the forward algorithm and the Viterbi algorithm (those are really the same algorithm). --MarkSweep 21:33, 15 Oct 2004 (UTC)

[edit] Algorithm Please

Python maybe a formally defined language but it isn't a clear language. This page should clearly provide an example of the trellis, dynamic programming, the recursion, the loop invariant. Now this dynamically typed tripe. contributed by 24.69.52.124 a.k.a. S01060060676712c2.gv.shawcable.net

On the contrary, Python is a very clear and concise language. But let's not get into a holy war about programming language preferences. You can cut and past the Python code and run it directly. That should give you a good idea of what's going on. Thank you for your suggestion! When you feel an article needs improvement, please feel free to make whatever changes you feel are needed. Wikipedia is a wiki, so anyone can edit almost any article by simply following the Edit this page link at the top. You don't even need to log in! (Although there are some reasons why you might like to…) The Wikipedia community encourages you to be bold. Don't worry too much about making honest mistakes—they're likely to be found and corrected quickly. If you're not sure how editing works, check out how to edit a page, or use the sandbox to try out your editing skills. New contributors are always welcome. --MarkSweep 19:15, 21 Apr 2005 (UTC)
Sorry, I must admit that I agree with the original poster. I know the Viterbi algorithm, but was completely lost at the example given and then code. My suggestion would be: Drop the "rainy" and "sunny" story (it's utterly confusing), concentrate on some easy system instead (say, something with a transfer function of 1 + 0.5z^-1), and show a graphical description of the trellis and survivor paths. An encyclopedia shouldn't require people to cut-and-paste code to understand what's going on. 80.202.213.120

[edit] Trellis modulation and the Viterbi algorithm

What exactly is the relation between Trellis modulation and the Viterbi algorithm ? --DavidCary 30 June 2005 19:32 (UTC)

Can't totally remember clearly but you should investigate Factor Graphs and see how the viterbi can be done with a factor graph --ReptileLawyer 18:13, 24 April 2006 (UTC)

[edit] Complexity?

Anyone know what the complexity of Viterbi is? I skimmed over it and it looks like it's O(s*n^2), where s is the size of the sequence it's given, and n is the number of possible states. Any thoughts? --aciel 20:41, 22 November 2005 (UTC)

That's correct. Depending on the data structures used to represent the underlying HMM, the Viterbi/Forward algorithm can be made to run in O(s\,n\,d) time, where s is the length of the sequence, n is the number of hidden states, and d is the average out-degree of each state. This is because you only need to explore pairs of states that are actually connected by an edge. In a fully connected HMM, the running time is \Theta(s\,n^2). --MarkSweep (call me collect) 22:11, 23 November 2005 (UTC)

[edit] Is the python correct?

The python contains the line p = ep[state][output] * tp[state][next_state]

which in my understanding should be p = ep[next_state][output] * tp[state][next_state]


I assume the first state returned by the example is the initial state and the second state is the state that matches the first observation

The example returns ['Sunny', 'Rainy', 'Rainy', 'Rainy'] and after modification it returns ['Sunny', 'Sunny', 'Rainy', 'Rainy']

which would seam better given the example (observation 1 = walk)

Am I mistaken? if so a little help on the 4 returned states is welcome.

The code is correct as is. It's written the way it is to simplify the main loop and to allow an empty sequence of observations. For example, here's what happens when you run the code on an empty observation sequence:
 >>> forward_viterbi([], ['q','r'], {'q':0.8, 'r':0.2}, {}, {})
 (1.0, ['q'], 0.80000000000000004)
 >>> 
This return value is the triple (1.0, ['q'], 0.8), which indicates the following: The probability of the observation sequence conditional on its length is 1.0 (that's obviously correct, since there is exactly one sequence with length zero); the Viterbi path consists of a single edge, leading from the (unnamed) start state to the state 'q'; and the probability of the Viterbi path is 0.8. The Viterbi path as returned by this code contains one more state, namely the state that is reached after leaving the state corresponding to the last observation (in this case, the state reached from the start state with probability 0.8). That's because emissions are associated with outgoing edges, rather than incoming edges (you may have seen a presentation of this algorithm that assumes the latter). Both versions of HMMs exist, in addition to a third one where emissions are properly associated with edges, so there are at least three common variants of the forward algorithm and the Viterbi algorithm. --MarkSweep (call me collect) 17:23, 15 December 2005 (UTC)

[edit] SOVA

I intend to add a section on soft output version of the VA....give me some time and let the link stay!!!

Pizzadeliveryboy 02:47, 19 January 2006 (UTC)

[edit] Algorithm seems wrong

I tried to apply the forward algorithm on the example given in the article, but I do not find the same result. Briefly, here is the forward algo I applied (from rabiner paper): Let say λ an HMM system with A the transitional matrix, B the emission matrix, O = O1O2...OT an observation sequence and αt(i) is the porbability of observing O1O2...Ot and having Si as a state q for the HMM at time t so we have:

αt(i) = P(O1,O2...,Ot,qt = Si | λ)

The algo is like:

1- Initialization:

\alpha_1(i) = \pi_ib_{O_1 i} \quad \forall i \quad 1 < i <= N

with πi the start probability of state i

2- Deduction:

\alpha_{t+1}(j) = \Big[\sum_{i=1}^N \alpha_i(t)a_{ij}\Big]b_{O_{t+1} j} \quad \forall j \quad 1 < j <= N

3-Terminaison

P(O|\lambda)=max(\alpha_T(i)) \quad \forall i \quad 1 < i <= N

Applying this algo with this code in python:

def compute_alpha(pi, A, B, O, alpha, t):
  if t == 0:
    for state in pi:
      alpha[0][state] = pi[state] * B[state][O[0]]
  else:
    compute_alpha(pi, A, B, O, alpha, t - 1)
    for state_j in pi:
      temp = 0
      for state_i in pi:
        temp += alpha[t - 1][state_i] * A[state_i][state_j]
      alpha[t][state_j] = temp * B[state_j][O[t]]
def evaluate(pi, A, B, O):
  """Evaluate a suite of states regarding a MM model"""
  alpha = dict()
  alpha[0] = dict()
  for i, state in enumerate(pi):
    alpha[i+1] = dict()
  T = len(O) - 1
  compute_alpha(pi, A, B, O, alpha, T)
  max = 0
  for state in alpha[T]:
    if max <= alpha[T][state]:
      max = alpha[T][state]
  print alpha
  return max

I got 0.02904 as a proba for the observation {'walk', 'shop', 'clean'} given in the example. So which algo is wrong? I can give you the step of calculation, they are quite simple for such a small example.JeDi 05:14, 27 July 2006 (UTC)

[edit] A concrete example is partially wrong

From the transition matrix one can observe initial state probabilities, namely for given transition matrix probability of rainy day is 4/7, while probability of sunny day is 3/7. It appears that values in the example are rounded. 08:20, 28 December 2005 (UTC)

[edit] logs, comments

I realize it would make the code more complicated, but the implementation might be misleading because you really want to use logs of the probabilities to avoid underflows. I think the names of variables could be improved.

I think the output should have the same number of states as the number of observations. The observations are usually associated with states, not transitions between states.