Talk:Expectation-maximization algorithm
Hello. I have a comment about the recent edit, which credits the popularity of EM methods to the convergence proof instead of the ease of formulation. Maybe ease of formulation is not quite on the mark, but it seems that the convergence proof isn't either -- after all, the other methods (gradient ascent, etc.) will also converge within some neighborhood of a stationary point, and that's all that's guaranteed for EM methods, right? -- FWIW, my point about ease of formulation was this: suppose you have a method that computes maximum likelihood estimates in the complete-data case. When you consider the case of incomplete data, the previous method can often be adapted in a very straightforward way; the canonical example would be Gaussian mixture density estimation, in which the computations of the mean and variance are simply replaced by a weighted mean and a weighted variance. I guess ultimately the reason for popularity is an empirical question; we would have to look at a lot of papers and see what reasons, if any, are given. Happy editing, Wile E. Heresiarch 15:37, 29 Apr 2004 (UTC)
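(To make the weighted-mean/weighted-variance point concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture, in Python. The function name and defaults are illustrative, not from the article or any cited paper; the point is simply that the M-step is the complete-data mean and variance formulas with the E-step responsibilities as weights.)

    import numpy as np

    def em_gmm_1d(x, n_components=2, n_iters=50, seed=0):
        # Fit a 1-D Gaussian mixture by EM. The M-step reuses the
        # complete-data formulas for the mean and variance, except that
        # each point is weighted by its E-step responsibility.
        rng = np.random.default_rng(seed)
        n = len(x)
        pi = np.full(n_components, 1.0 / n_components)        # mixing weights
        mu = rng.choice(x, size=n_components, replace=False)  # initial means
        var = np.full(n_components, np.var(x))                # initial variances
        for _ in range(n_iters):
            # E-step: responsibility r[i, k] of component k for point x[i].
            dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
            r = pi * dens
            r /= r.sum(axis=1, keepdims=True)
            # M-step: weighted analogues of the complete-data estimates.
            nk = r.sum(axis=0)
            pi = nk / n
            mu = (r * x[:, None]).sum(axis=0) / nk                # weighted mean
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk   # weighted variance
        return pi, mu, var

    # Example: recover two well-separated components.
    x = np.concatenate([np.random.normal(0, 1, 500), np.random.normal(5, 1, 500)])
    print(em_gmm_1d(x))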
- Given that it is hard to pin down people's motivations for liking or disliking an algorithm, perhaps we should drop the sentence from the article. But, based on what I heard at machine learning conferences, EM was preferred because of its convergence properties. I believe that the paper [2] triggered the shift from gradient descent (as used in neural networks) to EM. That paper doesn't have a nice simple M-step like the one for mixtures of Gaussians (see equations 27-29, and following), so the popularity wasn't based on simplicity, I think. -- hike395 07:34, 1 May 2004 (UTC)
- Well, I agree that it's hard to divine motivation. I dropped the sentence that referred to a reason for popularity. Wile E. Heresiarch 22:45, 1 May 2004 (UTC)
- You note that EM is only guaranteed to maximize the likelihood locally. This point should be made clear from the start of the article. And yes, other methods such as gradient ascent or conjugate gradient will work just as well. (In fact, EM is closely related to gradient ascent and can be regarded as a "self-scaling" version of it, i.e., one that picks its own step size. In many instances of EM, like the forward-backward and inside-outside algorithms, the E step is actually computing the gradient of the log probability.) 70.16.28.121 01:48, 13 August 2006 (UTC)
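(A side note to make the gradient connection above precise, using generic EM notation rather than anything from the article: write the E-step objective as the expected complete-data log-likelihood; then Fisher's identity says that its gradient at the current iterate equals the gradient of the observed-data log-likelihood, which is the sense in which the E step "computes the gradient".)

<math>Q(\theta \mid \theta^{(t)}) = \operatorname{E}_{z \sim p(z \mid x, \theta^{(t)})}\left[ \log p(x, z \mid \theta) \right]</math>

<math>\nabla_\theta Q(\theta \mid \theta^{(t)}) \Big|_{\theta = \theta^{(t)}} = \nabla_\theta \log p(x \mid \theta) \Big|_{\theta = \theta^{(t)}}</math>

Gradient ascent would move <math>\theta</math> along this gradient with an externally chosen step size; the EM M-step instead maximizes <math>Q</math> outright, which is the "self-scaling" behavior described above.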
Consistency of symbol font
At present, the symbols that are inline are sometimes in a bigger font than the text, sometimes the same font size (look at subscripted thetas for example). The difference seems to depend on whether there's a "\," in the text string between the math tags. They should at least be consistent - I thought maybe I'd just change them to one or the other, but perhaps it's worth asking first whether there's a preference amongst the original authors, or whether there's a general convention.
Animated picture
On 3 January 2007 someone removed this animated image:
I thought the image was quite nice, but maybe it shouldn't be at the top of the article as in [3] but nearer the bottom. So I thought that if someone (with a bit more knowledge about this topic than me) could write a little piece about the application in question, we could move it to the "Examples" section. Skaakt 09:56, 9 February 2007 (UTC)
- This image doesn't really show anything about expectation maximization other than that it yields incremental improvements at each step. It may make sense in the context of an application, though. Sancho 06:18, 11 February 2008 (UTC)
Simple example?
Can someone contribute a very simple toy example to show how EM works? The one currently there (a mixture of Gaussians) is too long and is highly mathematical. My feeling is that very few people are going to wade through it.
My usual comment on such matters: Wikipedia is supposed to be an encyclopedia, not a maths textbook! 99.9% of readers of an article will not be experts, but will be trying to understand a new subject. Therefore articles should concentrate on building readers' insight, and not on being pedantically correct and complete. That comes later. Now, this may come as a bit of a shock to some of you, but most people regard complex mathematical formulae as obstacles to be overcome (or ignored).
"There's nothing so complicated that it can't be explained simply" (or something to that effect) -- A. Einstein
--84.9.77.220 (talk) 23:23, 21 February 2008 (UTC)
While repetitive, I second the above opinion. And, while I have significant background in science, math, and engineering, a quick read through the EM article does not yield useful insights. I ask that the authors make an effort to write clearly and succinctly, educating the reader about the significance of the topic as well as helping the reader clearly understand "how it works". The current EM article does neither, even though I believe the authors are quite competent and expert in their subject. Will you please take the time to educate those of us less knowledgeable in your area? —Preceding unsigned comment added by 214.3.40.3 (talk) 19:48, 11 March 2008 (UTC)
See the new section EM in a nutshell. Comments are appreciated. I'd like to dress up the formatting once I'm more experienced with wiki editing.
Aanthony1243 (talk) 14:15, 22 May 2008 (UTC)