Talk:Degrees of freedom (statistics)


[edit] Confusion

As a beginner in statistics, I still don't understand what this means. Can anyone explain it in more layman's terms? -- —The preceding unsigned comment was added by 219.79.235.199 (talkcontribs).

I still don't get it.. --Blakeops 19:57, 4 June 2006 (UTC)


The way the concept is explained seems like Latin, German, and French to a layman trying to understand what a statistical degree of freedom actually is. To be comprehensible, the page needs to be put in simple language ... someone please explain.

Maybe a better definition could be: The number of values in a data set that are free to vary when restrictions are imposed on the set. 149.169.118.217 20:53, 18 February 2007 (UTC) Rare.hero

[edit] Cleanup

This page is a really horrible mess. I'll be back ... next week, I hope. Michael Hardy 17:51, 8 August 2005 (UTC)

Still confusing; how is that?

This page is a mess and isn't very helpful. Could someone who knows something about statistics take the time to clean this up and make it useful? 67.170.10.225 06:24, 1 January 2006 (UTC)

[edit] Plagiarism

I just removed content blatantly lifted from http://seamonkey.ed.asu.edu/~alex/computer/sas/df.html by a user at 70.162.179.53. This kind of crap is what makes Wikipedia look terrible. 12 April 2006 -- —The preceding unsigned comment was added by 69.232.193.222 (talkcontribs).

[edit] Link description

Umm.. Isn't the description behind that link incorrect? It says that degrees of freedom are given by the number of (independent) data points minus the number of model parameters, which is correct, but then in the first example "fits" a line to a single data point and says that df = 1-1 = 0. Last time I checked you needed two parameters to determine a line, not one, so shouldn't this be df = 1-2 = -1? The diagram with perfect fitting is correct, but df is said to be 1 whereas I'd expect 0. The third example is supposedly of overfitting, which the text correctly describes as having more model parameters than data points, but the diagram then has a line (still two parameters) fitted to three data points - isn't this just the other way around?!

[1] makes much more sense to me. AFAICS (in a model fitting context) it's about how many degrees of freedom the data has, in principle, available to vary from the best fit the model can produce, and the more it has, the more significant it is if the fit is good nevertheless. I'm not entirely sure that this is a good measure for the significance though. What if all of your models have only one parameter because you use a Space-filling curve to encode the true parameters in one real value? How do you prevent a "model" that encodes all data points into one "parameter" using such a construct from being considered a very good model of all data in the world? It's a pathological case for sure, but it seems to me that it makes the df concept ill-defined, unless you can somehow restrict this using the concept of Hausdorff dimension or..? I can't see how the restriction could be defined, though. Maybe this is all nonsense, could someone enlighten me (and edit the article to a more respectable shape while you're at it? ;-) 82.103.198.180 22:07, 22 July 2006 (UTC)

The model fitting a straight line can have one parameter or two: if you fit a model with intercept and slope, there are two terms (β0, β1); if you fit a sub-model without an intercept term, there is one parameter estimated (β1). See Linear model for more information.
--Zven 01:53, 25 July 2006 (UTC)
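To make the one-parameter versus two-parameter distinction above concrete, here is a minimal least-squares sketch. The data values are made up for illustration; only the closed-form simple-regression formulas are used.

```python
# Sketch of the point above: the same data can be fit with a two-parameter
# model (intercept b0 and slope b1) or a one-parameter sub-model through
# the origin (slope b1 only).  Data values are hypothetical.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.9, 4.1, 6.0, 8.2]

# One-parameter model y = b1 * x (no intercept): one parameter estimated.
b1_origin = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Two-parameter model y = b0 + b1 * x: two parameters estimated.
xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum(
    (x - xbar) ** 2 for x in xs
)
b0 = ybar - b1 * xbar

print(round(b1_origin, 3))          # slope of the no-intercept fit
print(round(b1, 3), round(b0, 3))   # slope and intercept of the full fit
```

Each estimated parameter uses up one degree of freedom, so the residual degrees of freedom differ by one between the two fits.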

[edit] Degrees of Freedom confusion

Some of the confusion here is due to the term "Degrees of freedom" being used in different ways, with nuanced differences in meaning, in statistical theory and application.

The link [2] makes sense in that it alludes to this issue. Degrees of freedom is used to indicate the number of statistically independent pieces of information available with which to make inference. In statistical terms this is often stated as follows: Let Yi be independent and identically distributed random variables, i = 1, 2, ..., N; arising from a Normal distribution with mean 0 and variance 1 (the Standard Normal distribution).

In this special case, the sum of the squares of the Yi can be shown to have a Chi-square distribution with 'degrees of freedom' equal to N (see e.g. [3]). Thus, historically, in the early days of the development of statistical theory, 'degrees of freedom' was used to refer to the dimension of the space containing the data of interest.
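The chi-square fact above can be checked numerically. The following is a simulation sketch (not from the original comments): it draws N standard normals per trial, sums their squares, and confirms the average of those sums is close to N, the mean of a chi-square distribution with N degrees of freedom.

```python
import random

random.seed(42)
N = 5            # number of independent standard normal variables
TRIALS = 20000

# For each trial, draw N standard normals and take the sum of their
# squares; that sum has a chi-square distribution with N degrees of freedom.
sums = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(N)) for _ in range(TRIALS)]

# The chi-square distribution with N degrees of freedom has mean N.
mean_of_sums = sum(sums) / TRIALS
print(mean_of_sums)  # should be close to N = 5
```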

Hopefully an expert in information theory can chime in here and solidify this use of the term 'degrees of freedom'.

I offer the following to start clarifying the different ways in which 'degrees of freedom' is used. I'll return and re-edit this over time, it's a bit complex to do in one sitting.

I would thus propose that the opening sentence be modified to read something like:

In statistics the term degrees of freedom (df) is used in three ways: (1) to indicate the amount of information available with which to make inference about a characteristic of interest, (2) to indicate the complexity of a mathematical model used in the inference process, and (3) to refer to a characteristic of various probability distributions.

(1) Information content: 'Degrees of freedom' is a measure of the number of independent pieces of information with which to make inference about a characteristic of interest. Typically the pieces of information are called random variables and the characteristic of interest is the distribution of those random variables, or some mathematical quantity associated with that distribution. More specifically, the degrees of freedom of a set of random variables is the dimension of the space containing the set.

One of the most important distributions is the Normal distribution (the so-called Bell Curve), and its mean (the point about which the random variables cluster) and standard deviation (the degree to which the random variables deviate from the mean) are typically the mathematical quantities of interest.

(2) Model complexity: 'Degrees of freedom' refers to the number of parameters needed to completely specify a mathematical model of interest. For example, if the random variables at hand represent some measured quantity of people such as height or weight, the model of interest might be the Bell Curve that best represents the measured quantities, and two parameters are needed to completely specify the location and spread of that Bell Curve; namely the mean and the standard deviation. More specifically, the mean and standard deviation are a pair of real numbers; thus they can be represented by a point in two-dimensional space. The dimension of the space that can represent the parameters that specify a model of interest is called the 'degrees of freedom' of that model.
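A minimal sketch of sense (2), using made-up height measurements: fitting a Normal ("Bell Curve") model means estimating exactly two parameters, so the model's degrees of freedom in this sense is two.

```python
import statistics

# Hypothetical sample of measured heights (cm); the values are made up
# purely for illustration.
heights = [172.0, 168.5, 181.2, 175.4, 169.9, 178.3]

# Fitting a Normal ("Bell Curve") model means estimating exactly two
# parameters, the mean and the standard deviation; in sense (2), the
# model therefore has two degrees of freedom.
mu = statistics.mean(heights)
sigma = statistics.stdev(heights)
params = (mu, sigma)

print(len(params))  # 2: the model is fully specified by two numbers
```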

(3) Probability distributions: 'Degrees of freedom' refers to a defining characteristic of a probability distribution, such as the parameter of the Chi-square distribution (which also equals its mean). Historically this naming is related to definition (1) above, for if Y1, Y2, ..., Yn are a collection of independent random variables each having a Standard Normal distribution, then the sum of the squares of the random variables has a Chi-square distribution with n 'degrees of freedom'.

In fitting a statistical model to data in N dimensions, the vector of residuals is constrained to lie in a subspace of N-P dimensions, where P is the dimension of the space spanned by the parameters of the statistical model. The total degrees of freedom is N, the model degrees of freedom is P, and the residual degrees of freedom is N-P. The ratio of the model degrees of freedom to the residual degrees of freedom is inversely related to the information content of a model: as the amount of information increases, this ratio tends to zero. This condition is important in statistical distribution theory, and is a necessary condition in many theorems concerning the convergence of parameter estimates to the true underlying parameter value.
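The N-P decomposition can be illustrated with a straight-line fit. This is a sketch with made-up data: a line (P = 2 parameters) is fit to N = 5 points, and the normal equations impose P linear constraints on the residuals, leaving them in an (N - P)-dimensional subspace.

```python
# Sketch: straight-line least-squares fit with P = 2 parameters
# (intercept and slope) to N data points.  The residual vector then lies
# in an (N - P)-dimensional subspace.  Data values are hypothetical.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
N, P = len(xs), 2

xbar = sum(xs) / N
ybar = sum(ys) / N
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum(
    (x - xbar) ** 2 for x in xs
)
intercept = ybar - slope * xbar
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# The least-squares normal equations impose P linear constraints on the
# residuals: they sum to zero and are orthogonal to the x values.
print(N - P)                                                  # residual df: 3
print(abs(sum(residuals)) < 1e-9)                             # True
print(abs(sum(r * x for r, x in zip(residuals, xs))) < 1e-9)  # True
```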


A rambling potential example, perhaps more understandable to laypersons:

Statistical models are typically chosen to summarize or simplify the description of a complex set of information. Simplified descriptions are often extremely useful and allow decisions to be made that otherwise might be too complex. Summarizing a collection of weights of hundreds or thousands of people by modeling those weights as a Bell Curve allows many decisions to be made based on two numbers: the mean and standard deviation of the Bell Curve that best fits those hundreds or thousands of weights.

For example, airplane manufacturers need to model the amount of weight that airline passengers will add to an airplane. The manufacturer can either keep a database containing the weights of all potential passengers, a very complex set of information, or keep two numbers: the mean and standard deviation of typical travelers. The 'degrees of freedom' of the weights of passengers is large, and complex to assess and maintain. The 'degrees of freedom' of the Bell Curve model is small and manageable: two degrees of freedom representing the mean and standard deviation of the Bell Curve.

In order to estimate the mean and standard deviation, measurements of the weights of some subset of potential travelers must be collected. No one would put much faith in a mean estimated from the weight of one potential traveler. Collecting the weights of several hundred or several thousand potential travelers will give a better idea of what an airplane manufacturer should plan for. Thus the more data that goes into an estimate of a distribution's parameters, the more 'degrees of freedom', or information content, the estimate contains.

I offer the above to help clarify the confusion around 'degrees of freedom', but clearly much editing and refining is needed.

Smckinney2718 18:52, 3 November 2006 (UTC)

[edit] Fractional values?

Who's written that fractional degrees of freedom are possible?? Why add confusion to an already confused article? --Gak 20:59, 8 February 2007 (UTC)

I don't know who added that, but I just added a bit on what they're used for. Michael Hardy 00:13, 9 February 2007 (UTC)