In probability theory and statistics, the chi-square distribution (also chi-squared or \chi^2 distribution) is one of the most widely used theoretical probability distributions in inferential statistics, e.g., in statistical significance tests.[1][2][3][4] It is useful because, under reasonable assumptions, easily calculated quantities can be shown to have distributions that approximate the chi-square distribution when the null hypothesis is true.

The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and for the independence of two criteria of classification of qualitative data. Many other statistical tests also lead to this distribution, such as Friedman's analysis of variance by ranks.



If X_i are k independent, normally distributed random variables with mean 0 and variance 1, then the random variable

Q = \sum_{i=1}^k X_i^2

is distributed according to the chi-square distribution with k degrees of freedom. This is usually written

Q \sim \chi^2_k

The chi-square distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom (i.e. the number of X_i).
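
The definition can be checked by simulation. The following is a minimal sketch, not a reference implementation: it draws Q as a sum of k squared standard normal values and reports the sample mean and variance, which for a chi-square distribution with k degrees of freedom should be close to k and 2k. The function names, the seed and the number of draws are illustrative assumptions.

```python
import random
import statistics


def sample_chi_square(k, rng):
    """Draw one value of Q = X_1^2 + ... + X_k^2 with independent X_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate(k, n_draws=20000, seed=0):
    """Return (sample mean, sample variance) of n_draws simulated values of Q."""
    rng = random.Random(seed)  # fixed seed so that repeated runs agree
    draws = [sample_chi_square(k, rng) for _ in range(n_draws)]
    return statistics.mean(draws), statistics.variance(draws)


if __name__ == "__main__":
    for k in (1, 3, 10):
        mean, var = simulate(k)
        # Theory: mean k and variance 2k for k degrees of freedom.
        print(f"k={k}: sample mean {mean:.3f}, sample variance {var:.3f}")
```

The sample values only approximate k and 2k; the agreement improves as n_draws grows.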

The chi-square distribution is a special case of the gamma distribution: \chi^2_k is the gamma distribution with shape parameter k/2 and scale parameter 2.


Probability density function

A probability density function of the chi-square distribution is

f(x;k) = \begin{cases}
\frac{1}{2^{k/2}\Gamma(k/2)}\,x^{(k/2) - 1} e^{-x/2} & \text{for } x>0,\\
0 & \text{for } x\le0,
\end{cases}

where \Gamma denotes the Gamma function, which has closed-form values at the half-integers.
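
As an illustration, the density can be evaluated with the Python standard library alone. This is a sketch under the assumption that working in log space (via math.lgamma) is acceptable; the function name chi2_pdf is chosen here for illustration.

```python
import math


def chi2_pdf(x, k):
    """Density f(x; k) of the chi-square distribution with k degrees of freedom."""
    if k <= 0:
        raise ValueError("k must be positive")
    if x <= 0:
        return 0.0  # the density is defined as 0 for x <= 0
    # log f = (k/2 - 1) ln x - x/2 - (k/2) ln 2 - ln Gamma(k/2);
    # the log form avoids overflow of Gamma(k/2) for large k.
    log_pdf = (
        (k / 2 - 1) * math.log(x)
        - x / 2
        - (k / 2) * math.log(2)
        - math.lgamma(k / 2)
    )
    return math.exp(log_pdf)


if __name__ == "__main__":
    # Gamma at a half-integer has a closed form: Gamma(1/2) = sqrt(pi).
    print(math.gamma(0.5), math.sqrt(math.pi))
    print(chi2_pdf(2.0, 2))  # for k = 2 the density reduces to exp(-x/2) / 2
```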

Cumulative distribution function

Its cumulative distribution function is:

F(x;k)=\frac{\gamma(k/2,x/2)}{\Gamma(k/2)} = P(k/2, x/2)

where \gamma(s,z) is the lower incomplete gamma function and P(s,z) is the regularized lower incomplete gamma function.
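
One way to evaluate this function without a statistics package is the power series \gamma(s,x) = x^s e^{-x} \sum_{n\ge0} x^n / (s(s+1)\cdots(s+n)). The sketch below assumes this series is adequate, which holds for moderate x; for large x a continued-fraction form is usually preferred. The function names and the tolerance are illustrative choices.

```python
import math


def regularized_lower_gamma(s, x, tol=1e-15, max_terms=10000):
    """P(s, x) = gamma(s, x) / Gamma(s), summed from the power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / s  # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= x / (s + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    # Prefactor x^s e^(-x) / Gamma(s), computed in log space.
    log_prefactor = s * math.log(x) - x - math.lgamma(s)
    return min(1.0, total * math.exp(log_prefactor))


def chi2_cdf(x, k):
    """F(x; k) = P(k/2, x/2)."""
    return regularized_lower_gamma(k / 2, x / 2)


if __name__ == "__main__":
    # For k = 2 the distribution function reduces to 1 - exp(-x/2).
    print(chi2_cdf(3.0, 2), 1 - math.exp(-1.5))
```

For k = 1 the result can be compared with erf(sqrt(x/2)), and for k = 2 with 1 - exp(-x/2).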

Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets and statistical packages.

Characteristic function

The characteristic function of the chi-square distribution is

\chi(t;k) = (1-2it)^{-k/2}

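The standard closed form of the characteristic function, (1-2it)^{-k/2}, can be compared with a direct numerical evaluation of E[e^{itX}]. The sketch below uses a plain trapezoidal rule and is meant only as a sanity check for k > 2, where the density vanishes at 0; the function names, the integration range and the step count are assumptions made for illustration.

```python
import cmath
import math


def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return math.exp(
        (k / 2 - 1) * math.log(x) - x / 2 - (k / 2) * math.log(2) - math.lgamma(k / 2)
    )


def char_fn_closed_form(t, k):
    """(1 - 2it)^(-k/2), using the principal branch of the complex power."""
    return (1 - 2j * t) ** (-k / 2)


def char_fn_numeric(t, k, upper=80.0, steps=100000):
    """Trapezoidal approximation of the integral of exp(itx) f(x; k) over (0, upper)."""
    h = upper / steps
    total = 0j
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # endpoints get half weight
        total += weight * cmath.exp(1j * t * x) * chi2_pdf(x, k)
    return total * h


if __name__ == "__main__":
    print(char_fn_closed_form(0.3, 4))
    print(char_fn_numeric(0.3, 4))
```
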
Expected value and variance

If X\sim\chi^2_k then

\operatorname{E}(X) = k, \qquad \operatorname{Var}(X) = 2k


The median of X\sim\chi^2_k is given approximately by

k\left(1-\frac{2}{9k}\right)^3 = k - \frac{2}{3} + \frac{4}{27k} - \frac{8}{729k^2}

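For a sense of how close a median approximation is, the case k = 2 is convenient because the exact median is available in closed form: the distribution function is 1 - e^{-x/2}, so the median is 2 ln 2. The sketch below assumes the Wilson-Hilferty-style approximation k(1 - 2/(9k))^3; the function names are illustrative.

```python
import math


def chi2_median_approx(k):
    """Approximate median k (1 - 2/(9k))^3 (an assumed, commonly quoted form)."""
    return k * (1 - 2 / (9 * k)) ** 3


def chi2_median_exact_k2():
    """Exact median for k = 2, solving 1 - exp(-x/2) = 1/2."""
    return 2 * math.log(2)


if __name__ == "__main__":
    approx = chi2_median_approx(2)
    exact = chi2_median_exact_k2()
    print(f"approximate: {approx:.5f}, exact: {exact:.5f}, error: {approx - exact:.5f}")
```
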
Information entropy

The information entropy is given by

H = -\int_0^\infty f(x;k)\ln(f(x;k))\,dx = \frac{k}{2} + \ln\left(2\Gamma\left(\frac{k}{2}\right)\right) + \left(1 - \frac{k}{2}\right)\psi\left(\frac{k}{2}\right)

where \psi(x) is the Digamma function.
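
The entropy can be evaluated numerically once a digamma routine is available. The Python standard library does not provide one, so the sketch below assumes a simple implementation based on the recurrence \psi(x) = \psi(x+1) - 1/x and the usual asymptotic series, and it assumes the closed form k/2 + \ln(2\Gamma(k/2)) + (1 - k/2)\psi(k/2) for the entropy. The function names are illustrative.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def chi2_entropy(k):
    """k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2), in nats."""
    half = k / 2
    return half + math.log(2) + math.lgamma(half) + (1 - half) * digamma(half)


if __name__ == "__main__":
    # For k = 2 the digamma term vanishes and the entropy is 1 + ln 2.
    print(chi2_entropy(2), 1 + math.log(2))
```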

Derivation of the pdf for one degree of freedom

Let Y = X^2 where X \sim N(0,1), and let F_X and f_X denote the cumulative distribution function and the density of X. Then, for y > 0,

P(Y<y) = P(X^2<y) = P(|X|<\sqrt{y}) = F_X(\sqrt{y})-F_X(-\sqrt{y})

Differentiating with respect to y gives

    f_Y(y)    = f_X(\sqrt{y})\frac{d(\sqrt{y})}{dy}-f_X(-\sqrt{y})\frac{d(-\sqrt{y})}{dy}
              = \frac{1}{\sqrt{2\pi}}e^{-y/2}\frac{1}{2y^{1/2}} + \frac{1}{\sqrt{2\pi}}e^{-y/2}\frac{1}{2y^{1/2}}
              = \frac{1}{2^{1/2} \Gamma(1/2)}y^{1/2 - 1}e^{-y/2}

where the last step uses \Gamma(1/2) = \sqrt{\pi}. This is the chi-square density with k = 1, so Y = X^2 \sim \chi^2_1.
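
The derivation can be checked numerically by differentiating P(Y<y) = F_X(\sqrt{y}) - F_X(-\sqrt{y}) with a finite difference and comparing the result with the closed-form density for one degree of freedom. The following is a small sketch using statistics.NormalDist from the Python standard library; the function names and the step size h are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def cdf_of_square(y):
    """P(X^2 < y) = F_X(sqrt(y)) - F_X(-sqrt(y)) for X ~ N(0, 1)."""
    if y <= 0:
        return 0.0
    root = math.sqrt(y)
    return STD_NORMAL.cdf(root) - STD_NORMAL.cdf(-root)


def derived_density(y):
    """Closed form y^(1/2 - 1) e^(-y/2) / (2^(1/2) Gamma(1/2)) for y > 0."""
    return y ** (-0.5) * math.exp(-y / 2) / (math.sqrt(2) * math.gamma(0.5))


def numeric_density(y, h=1e-5):
    """Central finite difference of the distribution function of Y = X^2."""
    return (cdf_of_square(y + h) - cdf_of_square(y - h)) / (2 * h)


if __name__ == "__main__":
    for y in (0.5, 1.0, 2.5):
        print(y, numeric_density(y), derived_density(y))
```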

Related distributions and properties

The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-squared random variables, each divided by its own degrees of freedom.

Name                                  Statistic
chi-square distribution               \sum_{i=1}^k \frac{\left(X_i-\mu_i\right)^2}{\sigma_i^2}
noncentral chi-square distribution    \sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2
chi distribution                      \sqrt{\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2}
noncentral chi distribution           \sqrt{\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2}
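
The four statistics in the table differ only in whether the means are subtracted and whether a square root is taken. A minimal sketch follows, assuming the X_i, \mu_i and \sigma_i are supplied as equal-length sequences; the function names and the sample numbers are hypothetical.

```python
import math


def chi_square_statistic(xs, mus, sigmas):
    """Sum of squared standardized deviations (X_i - mu_i) / sigma_i."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(xs, mus, sigmas))


def noncentral_chi_square_statistic(xs, sigmas):
    """Sum of squared scaled values X_i / sigma_i, without centering."""
    return sum((x / s) ** 2 for x, s in zip(xs, sigmas))


def chi_statistic(xs, mus, sigmas):
    """Square root of the chi-square statistic."""
    return math.sqrt(chi_square_statistic(xs, mus, sigmas))


def noncentral_chi_statistic(xs, sigmas):
    """Square root of the noncentral chi-square statistic."""
    return math.sqrt(noncentral_chi_square_statistic(xs, sigmas))


if __name__ == "__main__":
    xs, mus, sigmas = [1.0, 4.0, 2.0], [0.0, 2.0, 2.0], [1.0, 2.0, 1.0]
    print(chi_square_statistic(xs, mus, sigmas))
    print(noncentral_chi_square_statistic(xs, sigmas))
```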

