Talk:Mutual information


I didn't initiate the notice, but the guidelines state that this notice is internal to Wikipedia and not really for the casual reader's consumption. Any attention that a qualified contributor can give is welcome. Ancheta Wis 23:55, 23 Oct 2004 (UTC)

Noting Category:Pages needing attention, I would say that, while someone may have thought that a good guideline, it is de facto incorrect (and not policy). I, for one, do not agree with that guideline, because hiding the notice hides the fact that the article needs attention from all those who can edit it, and a visible notice tells newbies that we know the article isn't as good as it could be. — 131.230.133.185 5 July 2005 19:23 (UTC)

[This article is] poorly explained. --Eequor 03:39, 22 Aug 2004 (UTC)

[edit] Simplify eq?

why not just say:

I(X,Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}

instead of all the confusing talk about what f and g are? Please elaborate if there is a specific reason why it is done this way. -- BAxelrod 02:08, 19 October 2005 (UTC)

The definitions given in the article are correct. They just happen to be highly formal. Less formal definitions are given in the article on information theory (recently added by me, though I called it transinformation). Whether this level of formality is appropriate for this article is a matter for debate. I tend to think not, because someone working at that level of formality is generally not going to be looking in Wikipedia for a definition. On the other hand, the formality "simplifies" matters, because one definition then suffices for both the discrete and continuous cases. (Integration with respect to the counting measure is simply ordinary discrete summation.) -- 130.94.162.64 22:53, 2 December 2005 (UTC)
O.K. Simplified the formula. -- 130.94.162.64 05:24, 3 December 2005 (UTC)
Another note: I(X,Y) is incorrect; I(X;Y) is the accepted usage. Use a semicolon. -- 130.94.162.64 11:35, 4 December 2005 (UTC)
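
For what it's worth, here is a minimal sketch of the simplified discrete formula above, in Python. (The function name mutual_information and the use of numpy are illustrative choices on my part, not anything from the article.)

import numpy as np

def mutual_information(p_xy):
    # Discrete I(X;Y) in bits from a joint pmf given as a 2-D array
    # with p_xy[i, j] = P(X = i, Y = j) and entries summing to 1.
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal P(X = i)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal P(Y = j)
    nz = p_xy > 0                           # treat 0 * log 0 as 0
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x * p_y)[nz])))

# Two perfectly correlated fair bits share exactly one bit of information.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # 1.0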

[edit] Mutual information between m random variables

How about adding the mutual information among multiple scalar random variables:

I(y_1;\ldots; y_m)=\sum^m_{i=1}H(y_i)-H(\mathbf{y})

(In reply to unsigned comment above:) Apparently there isn't a single well-defined mutual information for three or more random variables. It is sometimes defined recursively:
I(Y_1; Y_2) = H(Y_1) - H(Y_1 | Y_2),
I(Y_1; \ldots ; Y_m) = I(Y_1; \ldots ; Y_{m-1}) - I(Y_1; \ldots ; Y_{m-1} | Y_m), \quad m \geq 3,
where I(Y_1; \ldots ; Y_{m-1} | Y_m) = \mathbb{E}_{Y_m}\{ I((Y_1 | y_m); \ldots ; (Y_{m-1} | y_m)) \}.
This definition fits the interpretation of mutual information as the measure of an intersection of sets, but for three or more random variables it can be negative as well as positive (in contrast to the quantity in the comment above, which is always non-negative).
--130.94.162.64 23:15, 19 May 2006 (UTC)
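
To make the sign issue concrete: for three variables the recursive definition above is equivalent to the alternating sum of joint entropies, I(Y_1;Y_2;Y_3) = H(Y_1)+H(Y_2)+H(Y_3) - H(Y_1,Y_2) - H(Y_1,Y_3) - H(Y_2,Y_3) + H(Y_1,Y_2,Y_3), and the classic XOR example makes it negative. A small sketch (the function names and the numpy representation of the joint pmf are my own illustrative assumptions):

import numpy as np
from itertools import combinations

def entropy(p):
    # Shannon entropy in bits of a pmf given as an array of probabilities.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def interaction_information(p_xyz):
    # I(Y1;Y2;Y3) = I(Y1;Y2) - I(Y1;Y2|Y3), computed here via the equivalent
    # alternating sum of joint entropies over all non-empty subsets of variables.
    def H(kept):
        dropped = tuple(a for a in range(3) if a not in kept)
        marg = p_xyz.sum(axis=dropped) if dropped else p_xyz
        return entropy(marg)
    singles = sum(H((i,)) for i in range(3))
    pairs = sum(H(pair) for pair in combinations(range(3), 2))
    return singles - pairs + H((0, 1, 2))

# X and Y are independent fair bits and Z = X xor Y: every pairwise mutual
# information is 0, yet I(X;Y|Z) = 1, so the three-way quantity is -1 bit.
p = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p[x, y, x ^ y] = 0.25
print(interaction_information(p))  # -1.0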

[edit] Source

The formula is from Shannon (1948). This should be stated in the article.
Who coined the term "mutual information"? --Henri de Solages 18:41, 7 November 2005 (UTC)