Linear model

In statistics, the linear model is given by

Y = X \beta + \varepsilon

where Y is an n×1 column vector of random variables, X is an n×p matrix of "known" (i.e. observable and non-random) quantities whose rows correspond to statistical units, β is a p×1 vector of (unobservable) parameters, and ε is an n×1 vector of "errors", which are uncorrelated random variables each with expected value 0 and variance σ².

Much of the theory of linear models is concerned with inferring the values of the parameters β and σ². Typically this is done by the method of maximum likelihood, which in the case of normally distributed errors coincides with the method of least squares; by the Gauss-Markov theorem, the resulting least-squares estimator is also the best linear unbiased estimator of β.
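As a concrete illustration, the following Python (NumPy) sketch simulates data from the model and recovers β by least squares; the sample size, design matrix and parameter values are arbitrary choices made only for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    n, p = 100, 3                                # sample size and number of parameters (arbitrary)
    X = np.column_stack([np.ones(n),             # intercept column
                         rng.normal(size=(n, p - 1))])
    beta_true = np.array([1.0, 2.0, -0.5])       # the (in practice unobservable) parameters
    sigma = 0.3

    eps = rng.normal(scale=sigma, size=n)        # uncorrelated errors, mean 0, variance sigma^2
    Y = X @ beta_true + eps                      # the linear model Y = X beta + eps

    # Least-squares fit, which for normal errors is also the maximum likelihood estimate
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(beta_hat)                              # close to beta_true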

Assumptions

Multivariate normal errors

Often one takes the components of the vector of errors to be independent and normally distributed, giving Y a multivariate normal distribution with mean Xβ and covariance matrix σ²I, where I is the identity matrix. Having observed the values of X and Y, the statistician must estimate β and σ².

Rank of X

We usually assume that X has full rank p, which allows us to invert the p × p matrix X^{\top} X. The essence of this assumption is that the columns of X are not linearly dependent, so that no parameter is a redundant combination of the others; this also ensures that the model is identifiable.
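The rank condition can be checked numerically; in the sketch below (NumPy, with made-up design matrices) the second design has one column equal to a multiple of another, so X^{\top} X is singular and β is not identifiable.

    import numpy as np

    # Full-rank design: rank equals p, so X^T X can be inverted
    X = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])
    print(np.linalg.matrix_rank(X))        # 2 == p

    # Rank-deficient design: the second column is twice the first,
    # so the two parameters cannot be separated and X^T X is singular
    X_bad = np.array([[1.0, 2.0],
                      [1.0, 2.0],
                      [3.0, 6.0]])
    print(np.linalg.matrix_rank(X_bad))    # 1 < p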

Methods of inference

Maximum likelihood

β

The log-likelihood function (for εi independent and normally distributed) is

l(\beta, \sigma^2; Y) = -\frac{n}{2} \log (2 \pi \sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n \left(Y_i - x_i^{\top} \beta \right)^2

where x_i^{\top} is the ith row of X. Differentiating with respect to βj, we get

\frac{\partial l}{\partial \beta_j} = \frac{1}{\sigma^2} \sum_{i=1}^n x_{ij} \left( Y_i - x_i^{\top} \beta \right)

Setting this set of p equations to zero and solving for β gives the normal equations

X^{\top} X \hat{\beta} =  X^{\top} Y.

Now, using the assumption that X has full rank p, we can invert the matrix on the left-hand side to obtain the maximum likelihood estimate of β:

\hat{\beta} =  (X^{\top} X)^{-1} X^{\top} Y.

We can check that this is a maximum by examining the Hessian matrix of the log-likelihood: with respect to β it equals −X^{\top}X/σ², which is negative definite when X has full rank.
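In code, the normal equations can be solved directly; the sketch below (NumPy, with arbitrary simulated data) does this and also uses a least-squares solver, which avoids forming X^{\top} X explicitly and is numerically preferable. Both routes give the same estimate.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    Y = X @ np.array([0.5, 1.5]) + rng.normal(scale=0.2, size=n)

    # Solve the normal equations X^T X beta_hat = X^T Y
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

    # Equivalent estimate via an orthogonal-decomposition least-squares solver
    beta_hat_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

    print(beta_hat, beta_hat_ls)           # the two agree up to rounding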

σ²

By setting the right hand side of

\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2 \sigma^4} \sum_{i=1}^n \left(Y_i - x_i^{\top} \beta \right)^2

to zero and solving for σ², we find that

\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \left(Y_i - x_i^{\top} \hat{\beta} \right)^2 = \frac{1}{n} \| Y - X \hat{\beta} \|^2.
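A sketch of this computation (NumPy, arbitrary simulated data); the last line also shows the common variant that divides by n − p rather than n.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 80, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    Y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.7, size=n)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    resid = Y - X @ beta_hat                     # residuals Y - X beta_hat

    sigma2_mle = resid @ resid / n               # maximum likelihood estimate (divides by n)
    sigma2_unbiased = resid @ resid / (n - p)    # unbiased variant (divides by n - p)
    print(sigma2_mle, sigma2_unbiased)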

Accuracy of maximum likelihood estimation

Since Y follows a multivariate normal distribution with mean Xβ and covariance matrix σ²I, we can deduce the distribution of the MLE of β:

\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y \sim N_p (\beta, (X^{\top}X)^{-1} \sigma^2 ).

So this estimator is unbiased for β, and its variance can be shown to attain the Cramér-Rao lower bound.

A more complicated argument[1] shows that

\hat{\sigma}^2 \sim \frac{\sigma^2}{n} \chi^2_{n-p};

since a chi-squared distribution with n − p degrees of freedom has mean n − p, this estimator is only asymptotically unbiased; dividing by n − p instead of n gives the usual unbiased estimator of σ².
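These distributional results can be checked by simulation; the sketch below (NumPy, with an arbitrary fixed design and true parameter values) repeatedly simulates the model and compares the empirical mean and covariance of the estimate of β, and the empirical mean of the estimate of σ², with the theoretical values above.

    import numpy as np

    rng = np.random.default_rng(3)
    n, p, sigma = 40, 2, 1.0
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    beta = np.array([2.0, -1.0])

    B = 5000                                     # number of simulated data sets
    beta_hats = np.empty((B, p))
    sigma2_hats = np.empty(B)
    for b in range(B):
        Y = X @ beta + rng.normal(scale=sigma, size=n)
        bh = np.linalg.solve(X.T @ X, X.T @ Y)
        beta_hats[b] = bh
        sigma2_hats[b] = np.sum((Y - X @ bh) ** 2) / n

    print(beta_hats.mean(axis=0), beta)                   # beta_hat is unbiased
    print(np.cov(beta_hats.T))                            # close to sigma^2 (X^T X)^{-1}
    print(sigma**2 * np.linalg.inv(X.T @ X))
    print(sigma2_hats.mean(), sigma**2 * (n - p) / n)     # sigma^2-hat has mean sigma^2 (n - p)/n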

Generalizations

Generalized least squares

If, rather than taking the variance of ε to be σ²I, where I is the n×n identity matrix, one assumes the variance is σ²M, where M is a known matrix other than the identity matrix, then one estimates β by the method of "generalized least squares", in which, instead of minimizing the sum of squares of the residuals, one minimizes a different quadratic form in the residuals, namely the quadratic form given by the matrix M^{-1}:

\min_{\beta}\left(Y-X\beta\right)^{\top}M^{-1}\left(Y-X\beta\right)

This has the effect of "de-correlating" normal errors, and leads to the estimator

\hat{\beta}=\left(X^{\top}M^{-1}X\right)^{-1}X^{\top}M^{-1}Y

which is the best linear unbiased estimator for β. If all of the off-diagonal entries in the matrix M are 0, then one normally estimates β by the method of weighted least squares, with weights proportional to the reciprocals of the diagonal entries.
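A sketch of generalized least squares (NumPy), using an AR(1)-style correlation matrix M chosen purely for illustration; instead of inverting M it whitens the data with a Cholesky factor, which yields the same estimator.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 60
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    beta = np.array([1.0, 0.8])

    # A known correlation structure M (AR(1)-style; chosen only for the example)
    rho = 0.6
    M = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

    # Errors with covariance sigma^2 M, generated via the Cholesky factor L (M = L L^T)
    L = np.linalg.cholesky(M)
    Y = X @ beta + 0.5 * (L @ rng.normal(size=n))

    # GLS: whitening with L^{-1} turns the problem into ordinary least squares,
    # and the result equals (X^T M^{-1} X)^{-1} X^T M^{-1} Y
    Xw = np.linalg.solve(L, X)
    Yw = np.linalg.solve(L, Y)
    beta_gls = np.linalg.solve(Xw.T @ Xw, Xw.T @ Yw)
    print(beta_gls)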

Generalized linear models

Generalized linear models extend the linear model: rather than

E(Y) = Xβ,

one has

g(E(Y)) = Xβ,

where g is the "link function". The distribution of Y is also no longer restricted to the normal family.

An example is the Poisson regression model, which states that

Y_i has a Poisson distribution with expected value e^{\gamma + \delta x_i}.

The link function is the natural logarithm function. Having observed xi and Yi for i = 1, ..., n, one can estimate γ and δ by the method of maximum likelihood.
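The maximization has no closed form, but can be carried out with a few Newton-Raphson (equivalently, iteratively reweighted least squares) steps; the sketch below (NumPy, with made-up values of γ, δ and the xi) fits the Poisson regression model this way.

    import numpy as np

    rng = np.random.default_rng(5)
    n = 200
    x = rng.uniform(-1, 1, size=n)
    gamma_true, delta_true = 0.3, 1.2            # values chosen only for the example
    Y = rng.poisson(np.exp(gamma_true + delta_true * x))

    Z = np.column_stack([np.ones(n), x])         # design matrix for (gamma, delta)
    theta = np.zeros(2)                          # starting values

    # Newton-Raphson for the Poisson log-likelihood with log link:
    # mean mu_i = exp(gamma + delta x_i), gradient Z^T (Y - mu), Hessian -Z^T diag(mu) Z
    for _ in range(25):
        mu = np.exp(Z @ theta)
        grad = Z.T @ (Y - mu)
        hess = Z.T @ (Z * mu[:, None])
        theta = theta + np.linalg.solve(hess, grad)

    print(theta)                                 # close to (gamma_true, delta_true)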

General linear model

The general linear model (or multivariate regression model) is a linear model with multiple measurements per object; the measurements for each object are collected into a vector, so that the response Y becomes a matrix with one row per object.
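In the sketch below (NumPy, arbitrary dimensions and coefficients), Y is an n×q matrix with one row per object and one column per measurement, and the normal equations fit all q columns at once.

    import numpy as np

    rng = np.random.default_rng(6)
    n, q = 100, 3                              # n objects, q measurements per object (arbitrary)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    B = np.array([[1.0, 0.5, -0.3],
                  [2.0, -1.0, 0.7]])           # p x q matrix of coefficients
    Y = X @ B + rng.normal(scale=0.2, size=(n, q))   # n x q response matrix

    # The same normal equations apply column by column, so one solve fits all q responses
    B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    print(B_hat)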

See also

  • ANOVA, or analysis of variance, is historically a precursor to the development of linear models. Here the model parameters themselves are not estimated directly; instead, the contributions of the columns of X and their significance are assessed by comparing the between-group variance to the error (within-group) variance with an F test.
  • Linear regression
  • Robust regression