Talk:Kalman filter

This article is within the scope of the following WikiProjects:

WikiProject Mathematics (Rated B-Class)

Mathematics Portal

This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.

Mathematics rating:

B Class

Mid Priority

Field: Applied mathematics

One of the 500 most frequently viewed mathematics articles.

Please update this rating as the article progresses, or if the rating is inaccurate.
Please add to or update the comments to suggest improvements to the article.
--Cronholm¹⁴⁴ 21:14, 15 June 2007 (UTC)

WikiProject Robotics (Rated B-Class)
	Kalman filter is within the scope of WikiProject Robotics, an attempt to standardise coverage of Robotics. If you would like to participate, you can edit the article attached to this notice, or visit the project page, where you can join the project or contribute to the discussion.
B	This article has been rated as B-Class on the Project's quality scale. See comments
High	This article has been rated as high-importance on the importance scale.

WikiProject Systems (Rated B-Class)

Systems science Portal

This article is within the scope of WikiProject Systems, which collaborates on articles related to Systems science.

Systems rating:

B Class

High importance

Field: Control theory

This article has an assessment summary page.

Archived talk content: Talk:Kalman filter/Archive 1

1 Suggestion on notation between deterministic and stochastic variables
2 Suggestion on new sub-topic for Kalman Filter
3 On the Unscented Kalman Filter
4 Excellent page
5 Typo
6 What does HMM mean?
7 Relationship to recursive Bayesian estimation
8 Underlying dynamical system
9 Inferring backwards?
10 Peter Swerling
11 recursive link
12 Data Fusion - Combining Information from Related Observations
13 Relationship between Digital and Kalman Filters
14 equation typo
15 Kalman filter implementation
16 Unscented Kalman Filter Questions
17 Equation Fix Suggestion

Suggestion on notation between deterministic and stochastic variables

Not all variables are stochastic. An estimate is a deterministic value, but its estimator is often written with the same notation, and it is a stochastic variable, right? The input is usually treated as deterministic, as are the process and observation matrices (hence they can be taken outside the expected value). But what about the state vector? The state is well defined (deterministic, I mean), no? But at the same time it is governed by a stochastic evolution in time (process noise). I'm confused. —Preceding unsigned comment added by 77.54.101.212 (talk) 14:52, 5 June 2008 (UTC)

Suggestion on new sub-topic for Kalman Filter

The only mention that the KF gets in the area of econometrics is a note in the reference section at the bottom of the page saying that it is used there. Can we make a new topic on the page, stating how it is used, and why? There is not too much in the literature about this, due to how secretive the work is by the companies that are using it (I guess), but there are some good papers (GROENEWOLD & FRASER). The usage seems to center around the fact that CAPM Beta is unstable across time, and that use of the KF can lead to improved stability - pretty important if you are taking your trading signals from Mod Beta. Any experts on this topic here?
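To make the idea of a time-varying beta concrete, here is a minimal sketch of a scalar Kalman filter that tracks beta as a random walk. The model, the function name `track_beta` and the noise variances `q` and `r` are assumptions chosen for illustration; they are not taken from the papers mentioned above.

```python
def track_beta(asset_returns, market_returns, q=1e-4, r=1e-4, beta0=1.0, p0=1.0):
    """Scalar Kalman filter for a time-varying beta.

    Assumed (hypothetical) model, not taken from any cited paper:
        beta_t  = beta_{t-1} + w_t,          w_t ~ N(0, q)   (random walk)
        asset_t = beta_t * market_t + v_t,   v_t ~ N(0, r)
    """
    beta, p = beta0, p0
    estimates = []
    for y, m in zip(asset_returns, market_returns):
        # Predict: a random walk leaves the mean unchanged and inflates the variance.
        p = p + q
        # Update: the market return m plays the role of the observation matrix H_t.
        s = m * p * m + r              # innovation variance
        k = p * m / s                  # Kalman gain
        beta = beta + k * (y - m * beta)
        p = (1.0 - k * m) * p
        estimates.append(beta)
    return estimates
```

Calling `track_beta(asset, market)` with two equally long lists of returns gives one beta estimate per period. A larger `q` lets the estimate follow changes in beta more quickly at the cost of a noisier path.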

On the Unscented Kalman Filter

In the UKF, what is the significance of the term "unscented"? If not embarrassing, it should be included in the description...

Nice KF page overall - it gives enough on each point for more to investigate further, or if you just need the equations in a hurry.

I have some comments on UKF though. Personally, I think it needs a separate page - if nothing else but for the number of different unscented transforms that are described by both Julier and van der Merwe etc.

The other thing that isn't really clear is how to calculate the cross-covariance matrix (it isn't a cross-correlation matrix; sorry, I made that change to covariance there). The problem is that often the state vector and the observation vector are not of the same length (e.g. GPS/INS integration), and therefore, over which indices are you summing, and what are the corresponding weights to use - the Wc as calculated from the sigma points from the state, the sigma points as calculated from the observation, some combination, or do you need to explicitly augment the state **and** observation noises into the augmented state vector - something that I didn't think was absolutely necessary (though most papers tend to do it that way anyway). Damien d 12:04, 30 April 2007 (UTC)
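For reference, here is a sketch of the cross-covariance in the additive-noise form of the UKF. The notation is assumed rather than confirmed against the article: $\chi^{i}$ are the predicted sigma points, $\gamma^{i} = h(\chi^{i})$ are their images under the observation function, $W_c^{i}$ are the covariance weights, $L$ is the state dimension, and $M$ (a symbol introduced here) is the observation dimension.

$\textbf{P}_{xz} = \sum_{i=0}^{2L} W_c^{i} \, (\chi_{k|k-1}^{i} - \hat{\textbf{x}}_{k|k-1}) (\gamma_{k}^{i} - \hat{\textbf{z}}_{k})^{T}$

The sum runs over the sigma-point index $i$, not over vector components, so each term is an $L \times M$ outer product and the differing lengths of the state and observation vectors cause no conflict. There is a single set of sigma points, drawn from the state distribution, and a single set of weights; the observation side reuses both. In this form the measurement noise covariance is added to the innovation covariance rather than augmented into the state, which is one common choice and not the only one.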

Excellent page

This is a quite lucid explanation of a (potentially) challenging topic/set of concepts. As such it deserves recognition ... and inclusion, maybe, in the CD version of wikipedia? 132.239.215.69 17:52, 9 August 2006 (UTC)

Typo

In the UKF section, there are four or five occurrences of sums from 1 to N to reconstruct the estimate and its covariance from the samples. These sums should run from 0 to 2L (see the original paper). I'm not really aware of wikipedia protocol etc. or I'd fix it myself. Best wishes, Nathaniel

It seems like you are indeed right in that the summing indices are wrong (or at least unclear). I've attempted to correct the sums to run from 0 to 2L now. Please review my changes to make sure that they have been successfully corrected.

Please feel free to make your own edits in wikipedia in the future. It's easy, just read Wikipedia:How to edit a page to learn about wiki-formatting. But it would be preferable if you registered a username and logged in prior to making edits, since that makes it easier to trace article history and users' contributions. --Fredrik Orderud 18:32, 25 May 2006 (UTC)

What does HMM mean?

In section Relationship to recursive Bayesian estimation: "Using these assumptions the probability distribution over all states of the HMM can be written simply as: ..." Can somebody explain (and update in the article)?

HMM is Hidden Markov model. The Kalman filter model can be considered as an HMM, since it is both hidden and Markov. It is hidden because the state (x-vector) is only indirectly observable through the observation model; and it is Markov because the current state only depends on the previous state and is therefore conditionally independent of any state before the previous state. --Fredrik Orderud 01:52, 20 November 2005 (UTC)

This is my first time contributing to a Wikipedia page, so I don't want to tread on any toes. But I think it's a little confusing to say that the Kalman Filter is a type of HMM. Both are latent variable models, for sure, in that the state variables are Hidden or latent. But the normal definition of the HMM is that the latent variables are discrete, and the time update of the latent variables is governed by a transition probability matrix. With the Kalman Filter, the latent variables are continuous, and the time update is governed by the state matrix, which represents a linear dynamical system. In principle, an HMM that modelled the Kalman filter to arbitrary accuracy could be calculated by discretizing the state variables, and then constructing a transition matrix that modelled the dynamics and the noise term. However, in practice this would need a huge number of states for a Kalman filter where the dimension of state space is much greater than one due to the Curse of Dimensionality. Perhaps it would be more clear to give a more expanded discussion on the relationship between Kalman filters and the HMM. I haven't yet figured out how to put a reference into a Wiki discussion page, but a very good summary paper is A unifying review of Linear Gaussian Models by Roweis and Ghahramani (Neural Computation Vol 11 No 2, 1999). Possibly a more accurate summary sentence would be to say that the Kalman filter is analogous to the Hidden Markov Model, where the state variables and distributions are continuous valued, rather than discrete. --Alan1507 09:07, 9 May 2006 (UTC)

I've edited the section on "Underlying Dynamical System" to attempt to clarify the relationship between Kalman Filters and HMMs. However, I am still not happy with the section on Relationship to recursive Bayesian estimation, particularly the sentence that states that the measurements are the observed states of the Hidden Markov Model. The observations in a Hidden Markov Model are used to infer the values of the Hidden States, which, by definition are not directly observed. So I do not understand this statement at all - the states of the Hidden Markov Model are hidden, or latent variables, and these are analogous to the system variables in the Kalman Filter. But it's possible the author had some other meaning in mind - perhaps this could be clarified? --Alan1507 20:42, 9 May 2006 (UTC)

The Relationship to recursive Bayesian estimation section was written by User:Chrislloyd in February 2005, and has remained pretty much untouched since. The section is important, since it (attempts to) relate Kalman filtering to the bigger picture of sequential state estimation, which it is a part of, but it could probably be formulated more clearly. Any help in improving the section is therefore greatly appreciated. --Fredrik Orderud 00:23, 10 May 2006 (UTC)

I'll have a go when I've time - need to think it through carefully ;-) --Alan1507 07:51, 10 May 2006 (UTC)

Relationship to recursive Bayesian estimation

User:Chrislloyd, you added this section (which was then titled "Derivation") around February 2/10. Now that we have a derivation section contributed by User:Orderud, is it still necessary? What does it add? Thanks! — ciphergoth 10:18, 2005 Apr 28 (UTC)

I think the section is still important, since it relates Kalman filtering to the "bigger picture" of recursive Bayesian estimation (which Kalman filtering is a part of). --Fredrik Orderud 20:08, 28 Apr 2005 (UTC)

In that case I think it needs substantial work to make its point clear, since I've tried very hard to understand it and come up with very little. Is p(X) the probability density function of X? It doesn't link to probability density function and the latter doesn't mention p(X) having that meaning. How is PDF defined for vectors and joint distributions? I think I can guess, but it's not discussed in probability density function making it a bit demanding to infer. Even Lebesgue integration only defines integration from reals to reals, leaving one to infer how integration of functions such as p: (R x R) -> R is defined (though it seems straightforward to extend it to any real function whose domain is measurable). What does "The probability distribution of updated" mean? What is the denominator unimportant to? What do the probability density functions given at the end mean? How does it all tie together to say something cohesive and substantial? — ciphergoth 22:21, 2005 Apr 28 (UTC)

You're probably right in that it's poorly written (I hadn't read it until now myself), but it's still very important. The variable p(x) is, as you thought, the probability distribution of the state x. The Kalman filter replaces p(x) with a Gaussian distribution parametrized by a state estimate and a covariance. IEEE SignalProc. had a quite straightforward tutorial in 2002, containing the derivation of Kalman filter from a recursive Bayesian estimator. It is absolutely worth a read. --Fredrik Orderud 22:39, 28 Apr 2005 (UTC)

Thanks, that helps a lot! — ciphergoth 23:05, 2005 Apr 28 (UTC)

At a glance it looks like that paper is the basis of this section. I can follow the paper much better, since I can see what it's trying to get at. Unfortunately, it doesn't AFAICT actually prove its assertions about the Kalman filter at all - it just states "if you do this, you get the correct conditional PDFs". If we're going to do the same, we should make it explicit that we're stating without proof that the Kalman filter gives the correct PDFs. I think I can see how to do this. (Also, it's a pity the equations are bitmaps rather than scalable fonts in the PDF of the paper!) — ciphergoth 21:46, 2005 Apr 29 (UTC)

This section is not there to prove the optimality of the Kalman filter. The "proof" section already does that. Its main intent is to demonstrate how recursive Bayesian estimation can be simplified into tractable linear equations with Gaussian PDFs when dealing with linear state-space models subject to Gaussian noise. The derivations are pretty standard, and found in many Kalman textbooks. You can also find the paper on IEEE Xplore in much higher quality, but this requires a subscription. --Fredrik Orderud 11:03, 30 Apr 2005 (UTC)

OK, but the paper makes it look as if our proof is insufficiently precise, because it talks about expected values, covariance and so forth without talking about what they're conditioned on. Is it

$\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k})$

or

$\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k}|\textbf{z}_{1 \ldots k})$

? It feels as if there are big gaps in our proof that the Kalman filter is valid... — ciphergoth 17:38, 2005 Apr 30 (UTC)

I'm pretty sure $\textbf{P}_{k|k} = \textrm{cov}(\textbf{x}_k - \hat{\textbf{x}}_{k|k}|\textbf{z}_{1 \ldots k})$, since the Kalman filter is a causal recursive estimator which incorporates the latest measurements available into its estimates. --Fredrik Orderud 11:43, 1 May 2005 (UTC)
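The conditional and unconditional expressions can be related by the law of total covariance. A short sketch, assuming the linear-Gaussian model and writing $\textbf{e}_k = \textbf{x}_k - \hat{\textbf{x}}_{k|k}$ (a symbol introduced here) with $\hat{\textbf{x}}_{k|k} = \textrm{E}[\textbf{x}_k|\textbf{z}_{1 \ldots k}]$:

$\textrm{cov}(\textbf{e}_k) = \textrm{E}[\textrm{cov}(\textbf{e}_k|\textbf{z}_{1 \ldots k})] + \textrm{cov}(\textrm{E}[\textbf{e}_k|\textbf{z}_{1 \ldots k}]) = \textrm{E}[\textbf{P}_{k|k}] + \textbf{0} = \textbf{P}_{k|k}$

The second term vanishes because the conditional mean of the error is zero. The last step uses the fact that, in the linear-Gaussian case, the conditional covariance comes from a recursion that never involves the measured values, so it is the same matrix for every realization of $\textbf{z}_{1 \ldots k}$. Under these assumptions the two expressions coincide; for nonlinear or non-Gaussian models they generally do not.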

Underlying dynamical system

I removed the reference to a Markov Chain, and replaced it with Probabilistic Graphical Model, as I think this introduces less confusion - although Markov Chains can be defined on continuous variables, it seems the most widely understood definition is as a Finite state machine, as, for example at http://www.nist.gov/dads/HTML/markovchain.html . Hidden Markov Models and Kalman Filters are derived from the same Probabilistic Graphical Model. When time permits, I might write a section illustrating the duality between HMM and the Kalman Filter.

I disagree.

Probabilistic graphical model is a rarely used "nonsense" term, that does not say anything about the specific Bayesian network encountered in Kalman filtering. The process- & observation-models yield a Bayesian network on a special sequential/recursive form, consisting of 1st order Markov chains for state propagation and 0th order Markov chains for the measurements. This form is better known as a Hidden Markov Model with continuous state.

Please go change the Markov chain article first if you think that Markov chains are somewhat restricted to systems with discrete state. (no pun intended)

Alternatively, you can use the term "Markov process" instead [1], which undoubtedly covers systems with continuous state. --Fredrik Orderud 19:15, 10 May 2006 (UTC)

I agree with Fredrik Orderud - just because the state modelled by a Kalman filter is continuous, doesn't mean it's not a Markov model. I know that it's more usual to use Markov model to refer to things with discrete state but it's not the only application. Please stop removing these references from the article! — ciphergoth 19:40, 10 May 2006 (UTC)

Inferring backwards?

A Kalman model will use today's observation to estimate today's state. What do you use when you want to use today's observation to improve your estimate of yesterday's state? — ciphergoth 09:59, 30 July 2006 (UTC)

Well one could simply run the Kalman model in reverse, though if the dynamics aren't reversible you might have to modify your choice of state space and dynamics so you can invert the matrix that evolves the system forward in time (I think this should always be doable by adding 'dummy' state variables). Then you can just apply the Kalman model starting from the last time point and evolving it backwards in time.

Now if you mean to ask something more involved, namely how you can use both past and future information together (using all the info together) to improve estimates, the question is much harder. One might try to use the Kalman method both in forward and reverse together, for instance instead of the real measurements use the output of the forward Kalman method as input into running it in reverse. However, I think it very likely this sort of technique will not work but I don't really know. In fact my motivation for answering this question was curiosity about whether this sort of use is possible. Logicnazi 22:36, 10 August 2006 (UTC)

There are definitely well-understood techniques that use both past and future information together. I don't think they're based on the sort of Kalman variants you suggest. I just don't know how to look for them because I don't know what they're called. — ciphergoth 10:49, 11 August 2006 (UTC)

Estimation using both past and future information is "interpolation". There is a 1960 paper by Kalman at http://www.elo.utfsm.cl/~ipd481/Papers%20varios/kalman1960.pdf that has some references. Jrvz 20:56, 31 August 2006 (UTC)

"Smoothing" has been utilized extensively in actual applications, and software is currently in use at DoD test ranges based on the technique. I see that smoothing is mentioned in the "examples" paragraph -- it might be enough to include references from the 70's and 80's that developed the algorithms involved. The ones I am aware of came from Bierman, who used the "information filter" formulation for the forward pass, and then performed a backward pass for the smoothing. I'll try to locate the references for possible inclusion.

It occurs to me that the other thing done back then by Bierman and others was to apply factorization techniques to reduce the ranges of numerical values being manipulated (since manipulation was of square roots of quantities instead of the quantities themselves). It might be worthwhile to also include a brief discussion, and references, related to this. paul.wilfong at ngc.com
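The forward-pass and backward-pass structure of smoothing can be illustrated with a small scalar sketch. Note that this uses the Rauch-Tung-Striebel recursion on a covariance-form filter, not Bierman's square-root information formulation, and the model (`x_k = a * x_{k-1} + w_k`, `z_k = x_k + v_k`), the function name `rts_smooth` and all default parameters are assumptions made for illustration.

```python
def rts_smooth(zs, a=1.0, q=0.01, r=1.0, x0=0.0, p0=10.0):
    """Scalar Kalman filter (forward) followed by a Rauch-Tung-Striebel pass (backward).

    Assumed model: x_k = a * x_{k-1} + w_k, w_k ~ N(0, q); z_k = x_k + v_k, v_k ~ N(0, r).
    Returns filtered means/variances and smoothed means/variances.
    """
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for z in zs:
        # Forward pass: predict, then update with the measurement z (H = 1).
        xp, pp = a * x, a * p * a + q
        k = pp / (pp + r)
        x, p = xp + k * (z - xp), (1.0 - k) * pp
        x_pred.append(xp)
        p_pred.append(pp)
        x_filt.append(x)
        p_filt.append(p)

    # Backward pass: the last smoothed estimate equals the last filtered one;
    # earlier estimates are corrected using information from later measurements.
    x_s, p_s = x_filt[:], p_filt[:]
    for i in range(len(zs) - 2, -1, -1):
        c = p_filt[i] * a / p_pred[i + 1]          # smoother gain
        x_s[i] = x_filt[i] + c * (x_s[i + 1] - x_pred[i + 1])
        p_s[i] = p_filt[i] + c * c * (p_s[i + 1] - p_pred[i + 1])
    return x_filt, p_filt, x_s, p_s
```

In this sketch the smoothed variance at every step except the last is smaller than the filtered one, which is the sense in which today's observation improves the estimate of yesterday's state.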

Peter Swerling

Peter Swerling was a radar engineer who is most famous for the "Swerling Models" and had many contributions to the field of electrical and electronic engineering. The fact that he discovered the Kalman filter (and published it) before Kalman is simply an interesting side note in his life. Why "Peter Swerling" gets re-directed to this page on the Kalman filter is beyond me. He should have a page of his own with a biography and so on.

recursive link

The link http://en.wikipedia.org/wiki/Peter_Swerling entitled "Peter Swerling" leads back to this Kalman Filter Page! Carrionluggage 21:42, 1 August 2006 (UTC)

Data Fusion - Combining Information from Related Observations

I've been asked to design a Kalman Filter where we can observe several states of the process (some of which have relationships) and to use the Kalman filter to combine related observations to get a better estimate of each.

Some texts I've been reading seem to indicate that, instead of making a prediction and a measurement and using these to form the best estimate, two measurements are combined to form the best estimate (of one of the measurements). In examples, the two measurements seem to be usually a state and its derivative.

I find all of this quite confusing - but if it's a technique used in Kalman filtering, perhaps it needs mentioning?

Let's say, for example, we can measure $u,\ v,\ w,\ V_{tot}$ where $u$, $v$ and $w$ represent the speed in 3 dimensions and $V_{tot}$ is the total velocity - i.e. $V_{tot}=\sqrt{(u^2+v^2+w^2)}$.

I guess the rates of change of these variables are also observable. How could the Kalman filter be used here (where functions to make predictions of the next step are unknown)...

Something else that would really be terrific would be a worked example of an EKF with a nonlinear system - many people seem to have difficulty understanding this (myself included, and I've been reading about them for over a year now!). --Ultimâ 20:28, 16 September 2006 (UTC)
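As a small illustration of the nonlinear case raised above, here is a sketch of a single EKF measurement update that fuses a total-speed reading with a velocity estimate `[u, v, w]`. Only the update step is shown, since the prediction model is stated to be unknown. The function name `ekf_speed_update`, the scalar noise variance `r` and the requirement that `P` be symmetric are assumptions for illustration.

```python
import math


def ekf_speed_update(x, P, z, r):
    """One EKF update of x = [u, v, w] with a scalar total-speed measurement z.

    Measurement model (from the discussion): h(x) = sqrt(u^2 + v^2 + w^2).
    P is the state covariance (assumed symmetric), r the measurement noise variance.
    """
    n = len(x)
    speed = math.sqrt(sum(xi * xi for xi in x))   # h(x); must be nonzero
    H = [xi / speed for xi in x]                  # Jacobian row dh/dx at the estimate
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    S = sum(H[i] * PHt[i] for i in range(n)) + r  # scalar innovation covariance
    K = [PHt[i] / S for i in range(n)]            # Kalman gain (column vector)
    innovation = z - speed
    x_new = [x[i] + K[i] * innovation for i in range(n)]
    # P_new = (I - K H) P; for symmetric P the row vector H P equals (P H^T)^T.
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(n)] for i in range(n)]
    return x_new, P_new
```

Because the measurement is scalar, no matrix inversion is needed. The direct component measurements of $u$, $v$ and $w$ are linear and could be processed with the ordinary Kalman update before or after this step.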

Relationship between Digital and Kalman Filters

The Wikipedia article for Digital Filter has a reference to the Kalman Filter article. Neither has a discussion about the relationship between them, and I know from experience that it would be very helpful to include a brief discussion about the relationship. I have added such, and hope it is acceptable.

paul.wilong at ngc.com

equation typo

I changed the (innovation (or residual) covariance) equation as it looked as if the R's and P's got switched. If someone who is better versed in state space modeling could double check that this is a correct change I would appreciate it. I'm still a newbie at this state space stuff. Much appreciated.

--(Reply)-- Perhaps some prankster came along and switched the variables. Equation

$\textbf{S}_{k} = \textbf{H}_{k}\textbf{P}_{k|k-1} \textbf{H}_{k}^{T}+\textbf{R}_{k}$

looks the same as it was last March. Ultimâ 11:57, 14 November 2006 (UTC)

Kalman filter implementation

It would be a good improvement to add an implementation section to the article. It would be useful to have some indication of the right practical algorithm for an implementation of the KF or EKF that is robust against ill-conditioning; in particular, which matrix decomposition is best suited to that. Thanks Mauro Oliverone 19:19, 3 April 2007 (UTC)oliverone.
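One widely used guard against numerical trouble that needs no decomposition at all is the Joseph form of the covariance update, which stays symmetric and positive semidefinite even when the gain is slightly off. The sketch below handles a scalar measurement in plain Python; the function name `joseph_update` and its argument layout are assumptions for illustration. Square-root and UD-factorized filters, as mentioned earlier on this page in connection with Bierman, are the decomposition-based alternative and are not shown here.

```python
def joseph_update(P, H, K, r):
    """Joseph-form covariance update for a scalar measurement.

    Computes P_new = (I - K H) P (I - K H)^T + K r K^T, where
    P is an n x n covariance (list of lists), H a length-n measurement row,
    K a length-n gain column and r the scalar measurement noise variance.
    """
    n = len(P)
    # A = I - K H
    A = [[(1.0 if i == j else 0.0) - K[i] * H[j] for j in range(n)] for i in range(n)]
    # AP = A P
    AP = [[sum(A[i][k] * P[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    # (A P) A^T + K r K^T
    return [[sum(AP[i][k] * A[j][k] for k in range(n)) + K[i] * r * K[j]
             for j in range(n)] for i in range(n)]
```

With the optimal gain this gives the same result as the shorter update $(\textbf{I} - \textbf{K}\textbf{H})\textbf{P}$; the extra cost buys tolerance to rounding errors and to suboptimal gains.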

Unscented Kalman Filter Questions

I'm new to UKF. I read the section on it, but I'm having a hard time understanding the notation. Take, for example, this step in the UKF prediction:

$\chi_{k|k-1}^{i} = f(\chi_{k-1|k-1}^{i}) \quad i = 0..2L$

Here's where I'm stuck. $f$ is a function of $x$, the state vector. Let's call its dimension $N$. On the other hand, the sigma points, $\chi^{i}$, have dimension $2N$ (if I understand everything correctly). So I don't quite understand how to evaluate $f(\chi^{i})$. Is it really $f(\chi^{i}(1:N))$ (to borrow MATLAB notation)?

The same thing applies when evaluating $h(\chi^{i})$ in the UKF update.
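A small sketch may help with the dimensions. In the non-augmented form there are $2L+1$ sigma points, and each one is a vector of the same length $L$ as the state, so $f$ can be applied to each point directly. In the augmented form, $L$ is the length of the augmented vector (state plus noise terms) and $f$ takes the state part and the noise part of each point. The code below covers only the non-augmented case; the helper names and the scaling parameter `lam` are assumptions for illustration.

```python
import math


def cholesky(A):
    """Lower-triangular factor Lo with Lo * Lo^T = A (A symmetric positive definite)."""
    n = len(A)
    Lo = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(Lo[i][k] * Lo[j][k] for k in range(j))
            if i == j:
                Lo[i][j] = math.sqrt(A[i][i] - s)
            else:
                Lo[i][j] = (A[i][j] - s) / Lo[j][j]
    return Lo


def sigma_points(x, P, lam=1.0):
    """Return the 2L+1 sigma points of an L-dimensional state; each point has length L."""
    L = len(x)
    root = cholesky([[(L + lam) * P[i][j] for j in range(L)] for i in range(L)])
    columns = [[root[i][j] for i in range(L)] for j in range(L)]
    points = [list(x)]                                            # i = 0
    points += [[x[i] + c[i] for i in range(L)] for c in columns]  # i = 1 .. L
    points += [[x[i] - c[i] for i in range(L)] for c in columns]  # i = L+1 .. 2L
    return points


def propagate(f, points):
    """Apply the state-transition function to every sigma point, one at a time."""
    return [f(p) for p in points]
```

For a two-dimensional state, `sigma_points` returns five points of length two, and `propagate(f, points)` evaluates `f` once per point, which matches the index range `i = 0..2L` in the prediction step quoted above.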

Equation Fix Suggestion

I think unnecessary confusion is caused by using F in both the linear and non-linear Kalman filter equations. 'A' should be used for the first instance as the state transition matrix (this also means amending the diagram). In the second instance, F is the Jacobian of the state transition matrix, the state matrix and the output matrix, i.e. it should not be referred to as the state transition matrix. --Ultimâ (talk) 10:57, 19 May 2008 (UTC)