Talk:Kalman filter

From Wikipedia, the free encyclopedia

Archived talk content: Talk:Kalman filter/Archive 1

Excellent page

This is a quite lucid explanation of a (potentially) challenging topic/set of concepts. As such it deserves recognition ... and inclusion, maybe, in the CD version of wikipedia? 132.239.215.69 17:52, 9 August 2006 (UTC)

Typo

In the UKF section, there are four or five occurrences of sums from 1 to N to reconstruct the estimate and its covariance from the samples. These sums should run from 0 to 2L (see the original paper). I'm not really aware of wikipedia protocol etc. or I'd fix it myself. Best wishes, Nathaniel

It seems like you are indeed right in that the summing indices are wrong (or at least unclear). I've attempted to correct the sums to run from 0 to 2L now. Please review my changes to make sure that they have been successfully corrected.
Please feel free to make your own edits in wikipedia in the future. It's easy, just read Wikipedia:How to edit a page to learn about wiki-formatting. But it would be favourable if you register a username and log in prior to making edits, since that makes it easier to trace article history and users' contributions. --Fredrik Orderud 18:32, 25 May 2006 (UTC)
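For anyone reviewing the fix, the corrected indexing can be sketched in a few lines. This is only an illustrative sketch, not the article's notation: the sigma points and weights are assumed to come from the usual unscented transform, and `unscented_estimate` is a made-up helper name. The point is that the sums run over i = 0 .. 2L (that is, over all 2L+1 sigma points), not 1 .. N:

```python
import numpy as np

def unscented_estimate(sigma_points, weights_m, weights_c):
    """Reconstruct mean and covariance from 2L+1 sigma points.

    sigma_points: array of shape (2L+1, n) -- points chi_0 .. chi_2L
    weights_m, weights_c: mean/covariance weights W_0 .. W_2L
    Note the sums deliberately run over i = 0 .. 2L.
    """
    L2 = sigma_points.shape[0] - 1          # this is 2L
    mean = sum(weights_m[i] * sigma_points[i] for i in range(L2 + 1))
    cov = sum(weights_c[i] * np.outer(sigma_points[i] - mean,
                                      sigma_points[i] - mean)
              for i in range(L2 + 1))
    return mean, cov
```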

What does HMM mean?

In section Relationship to recursive Bayesian estimation: "Using these assumptions the probability distribution over all states of the HMM can be written simply as: ..." Can somebody explain (and update in the article)?

HMM is Hidden Markov model. The Kalman filter model can be considered as a HMM, since it is both hidden and Markov. It is hidden because the state (x-vector) is only indirectly observable through the observation model; and it is Markov because the current state only depends on the previous state and is therefore conditionally independent of any state before the previous state. --Fredrik Orderud 01:52, 20 November 2005 (UTC)
This is my first time contributing to a Wikipedia page, so I don't want to tread on any toes. But I think it's a little confusing to say that the Kalman Filter is a type of HMM. Both are latent variable models, for sure, in that the state variables are hidden, or latent. But the normal definition of the HMM is that the latent variables are discrete, and the time update of the latent variables is governed by a transition probability matrix. With the Kalman Filter, the latent variables are continuous, and the time update is governed by the state matrix, which represents a linear dynamical system. In principle, an HMM that modelled the Kalman filter to arbitrary accuracy could be constructed by discretizing the state variables, and then building a transition matrix that modelled the dynamics and the noise term. However, in practice this would need a huge number of states for a Kalman filter where the dimension of state space is much greater than one, due to the Curse of Dimensionality. Perhaps it would be clearer to give a more expanded discussion on the relationship between Kalman filters and the HMM. I haven't yet figured out how to put a reference into a Wiki discussion page, but a very good summary paper is A Unifying Review of Linear Gaussian Models by Roweis and Ghahramani (Neural Computation Vol 11 No 2, 1999). Possibly a more accurate summary sentence would be to say that the Kalman filter is analogous to the Hidden Markov Model, where the state variables and distributions are continuous valued, rather than discrete. --Alan1507 09:07, 9 May 2006 (UTC)

I've edited the section on "Underlying Dynamical System" to attempt to clarify the relationship between Kalman Filters and HMMs. However, I am still not happy with the section on Relationship to recursive Bayesian estimation, particularly the sentence that states that the measurements are the observed states of the Hidden Markov Model. The observations in a Hidden Markov Model are used to infer the values of the Hidden States, which, by definition are not directly observed. So I do not understand this statement at all - the states of the Hidden Markov Model are hidden, or latent variables, and these are analogous to the system variables in the Kalman Filter. But it's possible the author had some other meaning in mind - perhaps this could be clarified? --Alan1507 20:42, 9 May 2006 (UTC)

The Relationship to recursive Bayesian estimation section was written by User:Chrislloyd in February 2005, and has remained pretty much untouched since. The section is important, since it (attempts to) relates Kalman filtering into the bigger picture of sequential state estimation which it is a part of, but it could probably be formulated more clearly. Any help in improving the section is therefore greatly appreciated. --Fredrik Orderud 00:23, 10 May 2006 (UTC)

I'll have a go when I've time - need to think it through carefully ;-) --Alan1507 07:51, 10 May 2006 (UTC)

Relationship to recursive Bayesian estimation

User:Chrislloyd, you added this section (which was then titled "Derivation") around February 2/10. Now that we have a derivation section contributed by User:Orderud, is it still necessary? What does it add? Thanks! — ciphergoth 10:18, 2005 Apr 28 (UTC)

I think the section is still important, since it relates Kalman filtering to the "bigger picture" of recursive Bayesian estimation (which Kalman filtering is a part of). --Fredrik Orderud 20:08, 28 Apr 2005 (UTC)
In that case I think it needs substantial work to make its point clear, since I've tried very hard to understand it and come up with very little. Is p(X) the probability density function of X? It doesn't link to probability density function, and the latter doesn't mention p(X) having that meaning. How is the PDF defined for vectors and joint distributions? I think I can guess, but it's not discussed in probability density function, making it a bit demanding to infer. Even Lebesgue integration only defines integration from reals to reals, leaving one to infer how integration of functions such as p: (R x R) -> R is defined (though it seems straightforward to extend it to any real function whose domain is measurable). What does "The probability distribution of updated" mean? What is the denominator unimportant to? What do the probability density functions given at the end mean? How does it all tie together to say something cohesive and substantial? — ciphergoth 22:21, 2005 Apr 28 (UTC)
You're probably right in that it's poorly written (I hadn't read it until now myself), but it's still very important. The variable p(x) is, as you thought, the probability distribution of the state x. The Kalman filter replaces p(x) with a Gaussian distribution parametrized by a state estimate and a covariance. IEEE SignalProc. had a quite straightforward tutorial in 2002, containing the derivation of the Kalman filter from a recursive Bayesian estimator. It is absolutely worth a read. --Fredrik Orderud 22:39, 28 Apr 2005 (UTC)
Thanks, that helps a lot! — ciphergoth 23:05, 2005 Apr 28 (UTC)
At a glance it looks like that paper is the basis of this section. I can follow the paper much better, since I can see what it's trying to get at. Unfortunately, it doesn't AFAICT actually prove its assertions about the Kalman filter at all - it just states "if you do this, you get the correct conditional PDFs". If we're going to do the same, we should make it explicit that we're stating without proof that the Kalman filter gives the correct PDFs. I think I can see how to do this. (Also, it's a pity the equations are bitmaps rather than scalable fonts in the PDF of the paper!) — ciphergoth 21:46, 2005 Apr 29 (UTC)
This section is not there to prove the optimality of the Kalman filter. The "proof" section already does that. Its main intent is to demonstrate how recursive Bayesian estimation can be simplified into tractable linear equations with Gaussian PDFs when dealing with linear state-space models subject to Gaussian noise. The derivations are pretty standard, and found in many Kalman textbooks. You can also find the paper on IEEE Xplore in much higher quality, but this requires a subscription. --Fredrik Orderud 11:03, 30 Apr 2005 (UTC)
OK, but the paper makes it look as if our proof is insufficiently precise, because it talks about expected values, covariance and so forth without talking about what they're conditioned on. Is it
<math>\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k})</math>
or
<math>\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k}|\textbf{z}_{1 \ldots k})</math>
? It feels as if there's big gaps in our proof that the Kalman filter is valid... — ciphergoth 17:38, 2005 Apr 30 (UTC)
I'm pretty sure <math>\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k}|\textbf{z}_{1 \ldots k})</math>, since the Kalman filter is a causal recursive estimator which incorporates the latest measurements available into its estimates. --Fredrik Orderud 11:43, 1 May 2005 (UTC)
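Worth noting: in the linear-Gaussian case the recursion for P never uses the measurement values, so the conditional error covariance and the unconditional one coincide numerically, and the distinction is invisible in simulation. A quick Monte Carlo sanity check of this (an illustrative sketch with an assumed scalar model, not anything from the article) compares the filter's reported P against the empirical variance of the estimation error over many independent runs:

```python
import numpy as np

rng = np.random.default_rng(0)
a, q, r = 0.9, 0.1, 0.2               # assumed scalar model: transition, process/measurement noise variances
n_trials, n_steps = 20000, 10

x = rng.normal(0.0, 1.0, n_trials)    # true states, one per independent trial
xhat = np.zeros(n_trials)             # filter estimates (prior mean 0)
P = 1.0                               # filter covariance (identical across trials: it never sees z)

for _ in range(n_steps):
    # propagate the truth and draw a measurement for every trial
    x = a * x + rng.normal(0.0, np.sqrt(q), n_trials)
    z = x + rng.normal(0.0, np.sqrt(r), n_trials)
    # standard scalar Kalman predict/update
    xhat = a * xhat
    P = a * a * P + q
    K = P / (P + r)
    xhat = xhat + K * (z - xhat)
    P = (1.0 - K) * P

empirical = np.var(x - xhat)
print(P, empirical)                   # the two should agree closely
```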

Underlying dynamical system

I removed the reference to a Markov Chain, and replaced it with Probabilistic Graphical Model, as I think this introduces less confusion - although Markov Chains can be defined on continuous variables, it seems the most widely understood definition is as a Finite state machine, as, for example at http://www.nist.gov/dads/HTML/markovchain.html . Hidden Markov Models and Kalman Filters are derived from the same Probabilistic Graphical Model. When time permits, I might write a section illustrating the duality between HMM and the Kalman Filter.

I disagree.
Probabilistic graphical model is a rarely used "nonsense" term that does not say anything about the specific Bayesian network encountered in Kalman filtering. The process and observation models yield a Bayesian network of a special sequential/recursive form, consisting of 1st-order Markov chains for state propagation and 0th-order Markov chains for the measurements. This form is better known as a Hidden Markov Model with continuous state.
Please go change the Markov chain article first if you think that Markov chains are somehow restricted to systems with discrete state. (no pun intended)
Alternatively, you can use the term "Markov process" instead [1], which undoubtedly covers systems with continuous state. --Fredrik Orderud 19:15, 10 May 2006 (UTC)
I agree with Fredrik Orderud - just because the state modelled by a Kalman filter is continuous, doesn't mean it's not a Markov model. I know that it's more usual to use Markov model to refer to things with discrete state but it's not the only application. Please stop removing these references from the article! — ciphergoth 19:40, 10 May 2006 (UTC)

Inferring backwards?

A Kalman model will use today's observation to estimate today's state. What do you use when you want to use today's observation to improve your estimate of yesterday's state? — ciphergoth 09:59, 30 July 2006 (UTC)

Well, one could simply run the Kalman model in reverse, though if the dynamics aren't reversible you might have to modify your choice of state space and dynamics so you can invert the matrix that evolves the system forward in time (I think this should always be doable by adding 'dummy' state variables). Then you can just apply the Kalman model starting from the last time point and evolving it backwards in time.
Now if you mean to ask something more involved, namely how you can use both past and future information together (using all the info together) to improve estimates, the question is much harder. One might try to use the Kalman method both forward and in reverse together, for instance by feeding the output of the forward Kalman pass, instead of the real measurements, into a reverse run. However, I think it very likely this sort of technique will not work, but I don't really know. In fact my motivation for answering this question was curiosity about whether this sort of use is possible. Logicnazi 22:36, 10 August 2006 (UTC)
There are definitely well-understood techniques that use both past and future information together. I don't think they're based on the sort of Kalman variants you suggest. I just don't know how to look for them because I don't know what they're called. — ciphergoth 10:49, 11 August 2006 (UTC)
Estimation using both past and future information is "interpolation". There is a 1960 paper by Kalman at http://www.elo.utfsm.cl/~ipd481/Papers%20varios/kalman1960.pdf that has some references. Jrvz 20:56, 31 August 2006 (UTC)
"Smoothing" has been utilized extensively in actual applications, and software is currently in use at DoD test ranges based on the technique. I see that smoothing is mentioned in the "examples" paragraph -- it might be enough to include references from the 70's and 80's that developed the algorithms involved. The ones I am aware of came from Bierman, who used the "information filter" formulation for the forward pass, and then performed a backward pass for the smoothing. I'll try to locate the references for possible inclusion.
It occurs to me that the other thing Bierman and others did back then was to apply factorization techniques to reduce the ranges of numerical values being manipulated (since the manipulation was of square roots of quantities instead of the quantities themselves). It might be worthwhile to also include a brief discussion, and references, related to this. paul.wilfong at ngc.com
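The forward-filter-plus-backward-pass idea described above is what's usually called fixed-interval smoothing; one standard formulation is the Rauch-Tung-Striebel smoother. A minimal scalar sketch (illustrative only: the model parameters are placeholders and `rts_smooth` is a made-up name, not anything from the article or from Bierman's papers):

```python
import numpy as np

def rts_smooth(zs, a=1.0, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman forward pass + Rauch-Tung-Striebel backward pass.

    Returns smoothed estimates that use *all* measurements zs,
    past and future, for every time step.
    """
    n = len(zs)
    xf = np.empty(n); pf = np.empty(n)        # filtered (posterior) estimates
    xp = np.empty(n); pp = np.empty(n)        # one-step predictions
    x, p = x0, p0
    for k, z in enumerate(zs):
        x, p = a * x, a * a * p + q           # predict
        xp[k], pp[k] = x, p
        K = p / (p + r)                       # update
        x = x + K * (z - x)
        p = (1.0 - K) * p
        xf[k], pf[k] = x, p
    xs = xf.copy()
    for k in range(n - 2, -1, -1):            # backward smoothing recursion
        C = pf[k] * a / pp[k + 1]             # smoother gain
        xs[k] = xf[k] + C * (xs[k + 1] - xp[k + 1])
    return xs
```

On a constant signal, for example, the smoothed estimate at the first few time steps is pulled much closer to the truth than the filtered estimate was, because it also benefits from the later measurements.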

Peter Swerling

Peter Swerling was a radar engineer who is most famous for the "Swerling Models" and made many contributions to the field of electrical and electronic engineering. The fact that he discovered the Kalman filter (and published it) before Kalman is simply an interesting side note in his life. Why "Peter Swerling" gets re-directed to this page on the Kalman filter is beyond me. He should have a page of his own with a biography and so on.

recursive link

The link http://en.wikipedia.org/wiki/Peter_Swerling

entitled "Peter Swerling" leads back to this Kalman Filter Page! Carrionluggage 21:42, 1 August 2006 (UTC)

Data Fusion - Combining Information from Related Observations

I've been asked to design a Kalman Filter where we can observe several states of the process (some of which have relationships) and to use the Kalman filter to combine related observations to get a better estimate of each.

Some texts I've been reading seem to indicate that instead of making a prediction and a measurement and using these to form the best estimate, two measurements are combined to form the best estimate (of one of the measurements). In the examples, the two measurements usually seem to be a state and its derivative.

I find all of this quite confusing - but if it's a technique used in Kalman filtering, perhaps it needs mentioning?

Let's say, for example, we can measure <math>u,\ v,\ w,\ V_{tot}</math>, where <math>u,\ v,\ w</math> represent the speed in 3 dimensions and <math>V_{tot}</math> is the total speed - i.e. <math>V_{tot}=\sqrt{u^2+v^2+w^2}</math>

I guess the rates of change of these variables are also observable. How could the Kalman filter be used here (where functions to make predictions of the next step are unknown)...

Something else that would really be terrific would be a worked example of an EKF with a nonlinear system - many people seem to have difficulty understanding this (myself included, and I've been reading about them for over a year now!). --Ultimâ 20:28, 16 September 2006 (UTC)
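On the V_tot question above: since V_tot is a nonlinear function of the state, a plain Kalman filter can't use it directly, but an EKF can, by linearizing the measurement function at the current estimate. Here is a hedged sketch of a single EKF measurement update for that case (the state layout, noise values, and the helper name `ekf_update_speed` are all my own illustrative choices, not from the article; a real filter would also need a process model):

```python
import numpy as np

def ekf_update_speed(x, P, z, R):
    """One EKF measurement update for state x = [u, v, w].

    Measurement z = [u, v, w, V_tot] with V_tot = sqrt(u^2 + v^2 + w^2).
    The first three rows of H are the identity (direct observations);
    the last row is the Jacobian of V_tot, i.e. [u, v, w] / V_tot,
    evaluated at the current estimate.
    """
    u, v, w = x
    vtot = np.sqrt(u * u + v * v + w * w)
    h = np.array([u, v, w, vtot])               # predicted measurement h(x)
    H = np.vstack([np.eye(3), [[u / vtot, v / vtot, w / vtot]]])
    S = H @ P @ H.T + R                         # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    x_new = x + K @ (z - h)
    P_new = (np.eye(3) - K @ H) @ P
    return x_new, P_new
```

The redundant V_tot measurement still helps: even when the innovation is zero, the update shrinks the covariance, reflecting the extra information from the fourth sensor.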

Relationship between Digital and Kalman Filters

The Wikipedia article for Digital Filter has a reference to the Kalman Filter article. Neither has a discussion about the relationship between them, and I know from experience that it would be very helpful to include a brief discussion about the relationship. I have added such, and hope it is acceptable.

paul.wilfong at ngc.com

equation typo

I changed the (innovation (or residual) covariance) equation as it looked as if the R's and P's got switched. If someone who is better versed in state space modeling could double check that this is a correct change I would appreciate it. I'm still a newbie at this state space stuff. Much appreciated.

--(Reply)-- Perhaps some prankster came along and switched the variables. Equation

<math>\textbf{S}_{k} = \textbf{H}_{k}\textbf{P}_{k|k-1} \textbf{H}_{k}^{T}+\textbf{R}_{k}</math>

looks the same as it was last March. Ultimâ 11:57, 14 November 2006 (UTC)
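For anyone double-checking which form is correct: the equation as it stands follows in one line from the definition of the innovation. Writing the innovation in terms of the predicted state error,

```latex
\tilde{\textbf{y}}_k
  = \textbf{z}_k - \textbf{H}_k\hat{\textbf{x}}_{k|k-1}
  = \textbf{H}_k\left(\textbf{x}_k - \hat{\textbf{x}}_{k|k-1}\right) + \textbf{v}_k
```

and using the independence of the state error (covariance P_{k|k-1}) and the measurement noise v_k (covariance R_k),

```latex
\textbf{S}_k
  = \operatorname{cov}\!\left(\tilde{\textbf{y}}_k\right)
  = \textbf{H}_k \textbf{P}_{k|k-1} \textbf{H}_k^T + \textbf{R}_k ,
```

so P belongs inside the sandwich with H and R is the added term, exactly as the equation above has it.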