Talk:Cross entropy
This article uses the notation KL(p, q) in some places and D_KL(p || m) in others when discussing the Kullback–Leibler divergence. Are these two notations for the same quantity? If so, the article should state the equivalence explicitly.
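For reference, the standard identity relating cross-entropy and KL divergence (assuming D_KL here denotes the usual Kullback–Leibler divergence between p and q, and not something involving a mixture distribution m) is:

H(p, q) = H(p) + D_{\mathrm{KL}}(p \,\|\, q) = -\sum_x p(x) \log q(x)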
The log-likelihood of the training data for a multinomial model is, up to sign, the cross-entropy of the data (Elements of Statistical Learning, page 32):
L(\theta) = \sum_{k=1}^{K} I(G = k) \log \Pr(G = k \mid X = x)
I take it that the indicator I(G = k) plays the role of p and Pr(G = k | X = x) the role of q in the cross-entropy H(p, q) = -\sum_k p_k \log q_k, so maximizing the log-likelihood is equivalent to minimizing the cross-entropy.
Could somebody in the know please add this? Thanks!
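As a quick sanity check, here is a minimal sketch (not from the ESL text; the class probabilities are made up for illustration) showing that, for a one-hot target, the log-likelihood above is the negative of the cross-entropy:

import numpy as np

# One-hot indicator I(G = k) for the true class; plays the role of p.
p = np.array([0.0, 1.0, 0.0])

# Model probabilities Pr(G = k | X = x); play the role of q.
# (Illustrative values only.)
q = np.array([0.2, 0.7, 0.1])

# Cross-entropy H(p, q) = -sum_k p_k log q_k
cross_entropy = -np.sum(p * np.log(q))

# Log-likelihood L(theta) = sum_k I(G = k) log Pr(G = k | X = x)
log_likelihood = np.sum(p * np.log(q))

print(cross_entropy)   # ~0.357
print(log_likelihood)  # ~-0.357, i.e. the negative of the cross-entropy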
WikiProject class rating
This article was automatically assessed because at least one WikiProject had rated the article as stub, and the rating on other projects was brought up to Stub class. BetacommandBot 09:46, 10 November 2007 (UTC)