Talk:Cross entropy


WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: Stub-Class, Low-Priority. Field: Probability and statistics.

WikiProject Physics
This article is within the scope of WikiProject Physics, which collaborates on articles related to physics.
This article has been rated as Stub-Class on the assessment scale.
This article has not yet received an importance rating within physics.


This article has been automatically assessed as Stub-Class by WikiProject Physics because it uses a stub template.
  • If you agree with the assessment, please remove {{Physics}}'s auto=yes parameter from this talk page.
  • If you disagree with the assessment, please change it by editing the class parameter of the {{Physics}} template, removing {{Physics}}'s auto=yes parameter from this talk page, and removing the stub template from the article.

This article uses both the notation KL(p, q) and the notation D<sub>KL</sub>(p || m) when discussing the Kullback–Leibler divergence. Are these two notations for the same quantity? If so, the article should state the equivalence explicitly.
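For what it's worth, assuming the two notations refer to the divergence between the same pair of distributions, they are just two common ways of writing the Kullback–Leibler divergence, which is also what connects it to the cross-entropy:

<math>\mathrm{KL}(p, q) = D_{\mathrm{KL}}(p \parallel q) = \sum_x p(x) \log \frac{p(x)}{q(x)}, \qquad H(p, q) = H(p) + D_{\mathrm{KL}}(p \parallel q).</math>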


The log-likelihood of the training data for a multinomial model is the same, up to sign, as the cross-entropy of the data (Elements of Statistical Learning, page 32):

<math>L(\theta) = \sum_{k=1}^{K} I(G = k) \, \log \Pr(G = k \mid X = x)</math>

I guess I(G = k) plays the role of p and Pr(G = k | X = x) the role of q here.

Could somebody in the know please add this? Thanks!
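In case it is useful to whoever adds this, here is a minimal numerical sketch (NumPy; the class probabilities below are made up purely for illustration) of how the one-observation log-likelihood above relates to the cross-entropy H(p, q):

<syntaxhighlight lang="python">
import numpy as np

# One observation with K = 3 classes (illustrative values only).
# p is the one-hot "empirical" distribution I(G = k); q is the model's
# predicted distribution Pr(G = k | X = x).
p = np.array([0.0, 1.0, 0.0])   # the true class is k = 2
q = np.array([0.2, 0.7, 0.1])   # model-assigned class probabilities

log_likelihood = np.sum(p * np.log(q))    # sum_k I(G=k) log Pr(G=k | X=x)
cross_entropy = -np.sum(p * np.log(q))    # H(p, q) with natural logarithms

print(log_likelihood, cross_entropy)      # -0.356..., 0.356...: equal up to sign
</syntaxhighlight>

So the cross-entropy is the negative log-likelihood, which is presumably what the passage in Elements of Statistical Learning is getting at.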

WikiProject class rating

This article was automatically assessed because at least one WikiProject had rated the article as stub, and the rating on other projects was brought up to Stub class. BetacommandBot 09:46, 10 November 2007 (UTC)