Pseudolikelihood

From Wikipedia, the free encyclopedia

Pseudolikelihood is a measure in statistics that approximates the joint probability distribution of a collection of random variables. Given a set of random variables X = X1, X2, ..., Xn and a set E of dependencies between these random variables, where \lbrace X_i,X_j \rbrace \notin E implies Xi is conditionally independent of Xj given Xi's neighbors, the pseudolikelihood of X = x = (x1, x2, ..., xn) is the product on the right-hand side of

\Pr(X = x) \approx \prod_i \Pr(X_i = x_i \mid X_j = x_j\ \mathrm{for\ all}\ \lbrace X_i,X_j \rbrace \in E)

Here X is a vector of variables and x is a vector of values. The expression X = x means that each variable Xi in X takes the corresponding value xi in x, and Pr(X = x) is the probability that this happens. Because many situations can be described by state variables that range over a set of possible values, Pr(X = x) can represent the probability of one particular state among all the states that the variables allow.
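As an illustration of the product of local conditionals, the following minimal sketch evaluates the pseudolikelihood of one assignment in a small pairwise Markov network. The three-variable chain, the value set {-1, +1}, the coupling weights, and the names EDGES, neighbors, conditional and pseudolikelihood are all assumptions made for this example; they do not come from the definition itself.

```python
import math

# Hypothetical 3-variable chain X0 - X1 - X2 with values in {-1, +1}.
# EDGES plays the role of E; the coupling weights are made up for illustration.
EDGES = {(0, 1): 0.5, (1, 2): -0.3}

def neighbors(i):
    """Yield (j, weight) for every edge {X_i, X_j} in E."""
    for (a, b), w in EDGES.items():
        if a == i:
            yield b, w
        elif b == i:
            yield a, w

def conditional(i, x):
    """Pr(X_i = x_i | X_j = x_j for all neighbors j) under the assumed model."""
    field = sum(w * x[j] for j, w in neighbors(i))
    return math.exp(x[i] * field) / (math.exp(field) + math.exp(-field))

def pseudolikelihood(x):
    """Product over i of the local conditional probabilities."""
    return math.prod(conditional(i, x) for i in range(len(x)))

if __name__ == "__main__":
    print(pseudolikelihood((1, 1, -1)))
```

Each factor only looks at the neighbors of one variable, so no sum over all joint states is needed.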

The pseudo-log-likelihood is the logarithm of the above expression, which turns the product into a sum:

\log \Pr(X = x) \approx \sum_i \log \Pr(X_i = x_i \mid X_j = x_j\ \mathrm{for\ all}\ \lbrace X_i,X_j \rbrace \in E)
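The sum of log-conditionals can be computed directly in log space, which avoids multiplying many small probabilities. The sketch below assumes a hypothetical pairwise model on values {-1, +1}; the weights and the names WEIGHTS, local_field, log_conditional and pseudo_log_likelihood are illustrative only.

```python
import math

# Hypothetical pairwise model on values {-1, +1}; weights are illustrative only.
WEIGHTS = {(0, 1): 0.5, (1, 2): -0.3, (0, 2): 0.2}

def local_field(i, x):
    """Sum of w_ij * x_j over the neighbors j of variable i."""
    total = 0.0
    for (a, b), w in WEIGHTS.items():
        if i == a:
            total += w * x[b]
        elif i == b:
            total += w * x[a]
    return total

def log_conditional(i, x):
    """log Pr(X_i = x_i | neighbors) under the assumed model."""
    f = local_field(i, x)
    # log(exp(f) + exp(-f)) = |f| + log(1 + exp(-2|f|)), which cannot overflow.
    log_norm = abs(f) + math.log1p(math.exp(-2.0 * abs(f)))
    return x[i] * f - log_norm

def pseudo_log_likelihood(x):
    """Sum over i of the local log-conditionals."""
    return sum(log_conditional(i, x) for i in range(len(x)))

if __name__ == "__main__":
    print(pseudo_log_likelihood((1, -1, 1)))
```

Exponentiating the result recovers the pseudolikelihood, but working with the sum is usually the safer choice numerically when there are many variables.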

One use of the pseudolikelihood is as an approximation for inference in a Markov network or Bayesian network: the pseudolikelihood of an assignment to X can often be computed more efficiently than the likelihood, particularly when the latter requires marginalization over a large number of variables.
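The difference in cost can be made concrete with a small sketch. It assumes a hypothetical pairwise model on values {-1, +1} with made-up weights; the function names are illustrative. The exact likelihood needs a normalizer that sums over all 2**n joint states, whereas each pseudolikelihood factor normalizes over only the two values of a single variable.

```python
import itertools
import math

def energy(x, edges):
    """Unnormalized log-probability: sum of w_ij * x_i * x_j over the edges."""
    return sum(w * x[i] * x[j] for (i, j), w in edges.items())

def exact_likelihood(x, edges):
    """Joint probability; the normalizer sums over all 2**n states."""
    n = len(x)
    z = sum(math.exp(energy(s, edges))
            for s in itertools.product((-1, 1), repeat=n))
    return math.exp(energy(x, edges)) / z

def pseudolikelihood(x, edges):
    """Product of local conditionals; each normalizer has only two terms."""
    result = 1.0
    e_same = energy(x, edges)
    for i in range(len(x)):
        # Flip only variable i; terms not involving i cancel in the ratio.
        flipped = x[:i] + (-x[i],) + x[i + 1:]
        e_flip = energy(flipped, edges)
        result *= math.exp(e_same) / (math.exp(e_same) + math.exp(e_flip))
    return result

if __name__ == "__main__":
    edges = {(0, 1): 0.5, (1, 2): -0.3}  # hypothetical weights
    x = (1, 1, -1)  # assignments are tuples so they can be sliced and rebuilt
    print(exact_likelihood(x, edges), pseudolikelihood(x, edges))
```

The two values generally differ, since the pseudolikelihood is an approximation. In this assumed model they coincide when there are no edges, because the variables are then independent.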

Citations

  • Besag, J. (1975). Statistical Analysis of Non-Lattice Data. The Statistician, 24(3): 179-195.

See also