Talk:Estimator

This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: Start Class Mid Priority  Field: Probability and statistics

UNNEEDED PHRASE: "FOR ALL THETA"

θ̂ is an unbiased estimator of θ iff B(θ̂) = 0 for all θ

I'm not sure why we need "for all θ". I thought it was implied that there was only one parameter θ. Perhaps the discussion should be framed in terms of multiple parameters θ1, θ2, etc., or in terms of a θ vector. But it seems that, as it is currently framed, θ is just one parameter, say μ, the population mean. So why do we need to "for all" over a set of one? --Ryguasu 14:33 Dec 10, 2002 (UTC)

I mean for all values of theta. B depends on the estimator (function of data) but also on the theta we estimate. Patrick 14:39 Dec 10, 2002 (UTC)
Oh. I thought the θ in the expression for B was the true value of the population parameter, not an estimate thereof. Is this incorrect? It seems like it could be useful defined this way, if you happened to know what the population was. --Ryguasu 15:05 Dec 10, 2002 (UTC)
Yes, θ is the true value of the population parameter, but you don't know it, otherwise you don't have to estimate it. Without knowing it you design a procedure (the estimator) to compute an estimate from the data; for a fixed θ the data depend on chance, hence also the resulting estimate. If the expected value of this estimate is the actual value, and this holds for all θ, the estimator is unbiased. Patrick 20:31 Dec 10, 2002 (UTC)
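The point about the estimate depending on chance for each fixed θ can be checked numerically. The following is only a rough sketch under assumptions that are not part of this discussion: the data are taken to be normal with unit variance, the estimator is the sample mean, and the names `sample_mean` and `average_estimate` are made up for illustration. For each of several true values of θ it averages the estimate over many simulated samples; for an unbiased estimator that average should sit close to θ at every one of them.

```python
import random

def sample_mean(xs):
    """The estimator: a function of the data only, not of theta."""
    return sum(xs) / len(xs)

def average_estimate(theta, n=10, reps=20000, seed=0):
    """Monte Carlo approximation of E[estimate] when the true value is theta.

    Assumption for illustration: data are N(theta, 1), n draws per sample.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        data = [rng.gauss(theta, 1.0) for _ in range(n)]
        total += sample_mean(data)
    return total / reps

# The same procedure is evaluated at several possible true values of theta.
thetas = [-2.0, 0.0, 3.5]
averages = {theta: average_estimate(theta) for theta in thetas}

for theta, avg in averages.items():
    print(f"theta={theta:5.1f}  average estimate={avg:.4f}")
```

The estimator never sees θ; only the simulated data change as θ changes, which is why the check has to be repeated over the parameter space rather than at a single value.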

"For all θ" is absolutely necessary. The point is that you must be able to know that the expected value of the estimator is θ without knowing the value of θ. Michael Hardy 19:48 Feb 12, 2003 (UTC)

"For all θ" is absolutely UNNECESSARY.

I have removed it. We start by picking one and only one population parameter theta that we want to estimate. In this context, talking about other things we might want to estimate makes no sense to me. For example, if the parameter is "average height of all people," then it makes no sense to also look at "average age of all people." Once we fix the theta, fix the thing we are estimating, then we can have an (uncountably) infinite number of estimators to estimate theta. An estimator is a function from a sample-space to a set of estimates. Suppose we pick one (and only one) estimator of theta, and call it theta-hat. For this particular theta, and this particular theta-hat, there is a collection of estimates theta-hat(s), where s is a sample. The MSE, variance, and bias (of the estimator theta-hat) are "functors." A functor is just a technical name for a "function of a function". That is, we never speak of the "bias of an estimate," we speak of the "bias of an estimator." The MSE, variance, and bias depend only on the parameter theta, the probabilities of selection for each sample, and the estimate for each sample. In all computations, the parameter theta is constant. So speaking of "for all theta" makes no sense at all.

Also, I don't know why above somebody says the expected value of theta-hat has to be theta. That only occurs if the estimator is unbiased. Most estimators used in real world applications are biased, some grossly biased (it amazes me what people who publish surveys get away with; that is, when you see a published estimate, always ask about survey errors. Always. Don't trust publishers to screen their estimators adequately for reliability. Demand this info!)

-- Submitted by a mathematician actively employed in survey design research, on April 18, 2007


The above anonymous alleged "mathematician actively employed in survey design research" is horribly confused. "For all θ" does not mean "for all parameters that we might want to estimate"; it means for all values of this one parameter θ in the parameter space. It is necessary that the expected value remain equal to θ as the value of θ changes in order for the estimator to be unbiased. Michael Hardy 22:55, 20 May 2007 (UTC)
A simple example: Suppose
X = \begin{cases} 1 & \mbox{with probability }p^2, \\ 0 & \mbox{with probability }1 - p^2. \end{cases}
Based on the data X we want to estimate p. The estimator will have to be a function of the data X, whose only possible values are 0 and 1. Call the estimator g(X). Now if p happens to be equal to either 0 or 1, then the function g defined by g(x) = x satisfies the identity
 E(g(X)) = p.\qquad\qquad (1)\,
But if p is ANY OTHER NUMBER between 0 and 1, then (1) is not satisfied. It is ONLY if (1) is satisfied FOR ALL p between 0 and 1 that g(X) is an unbiased estimator of p. Consider another function, defined by g(0) = 1/4 and g(1) = 5/4. Then E(g(X)) = p if p = 1/2 (calculate it and see). Does the fact that E(g(X)) = p when p = 1/2 mean g(X) is an unbiased estimator of p? No, it does not, unless the parameter space contains only the single number, 1/2. And in that case, there's no point in estimating p based on the data, since p would be KNOWN to be 1/2. If p is any other number between 0 and 1, then E(g(X)) is NOT equal to p, so g(X) is a BIASED estimator of p. It's only if E(g(X)) remains equal to p for ALL p ∈ [0, 1] that g(X) would be an unbiased estimator of p. Michael Hardy 23:24, 20 May 2007 (UTC)
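For readers who want to "calculate it and see", here is a small sketch that evaluates E(g(X)) − p exactly for the two functions g in the example above. The helper names (`expected_value`, `bias`, `identity`, `shifted`) and the grid of p values are illustrative choices, not anything from the article.

```python
from fractions import Fraction

def expected_value(g, p):
    """E[g(X)] when X = 1 with probability p**2 and X = 0 otherwise."""
    return g(1) * p**2 + g(0) * (1 - p**2)

def bias(g, p):
    """Bias of g(X) as an estimator of p, at this particular value of p."""
    return expected_value(g, p) - p

def identity(x):
    # First candidate from the example: g(x) = x.
    return Fraction(x)

def shifted(x):
    # Second candidate from the example: g(0) = 1/4, g(1) = 5/4.
    return Fraction(1, 4) if x == 0 else Fraction(5, 4)

# Evaluate the bias on a few values of p in [0, 1], using exact arithmetic.
grid = [Fraction(k, 4) for k in range(5)]
identity_bias = {p: bias(identity, p) for p in grid}
shifted_bias = {p: bias(shifted, p) for p in grid}

for p in grid:
    print(f"p={p}  bias of g(x)=x: {identity_bias[p]}  bias of shifted g: {shifted_bias[p]}")
```

Algebraically the two biases are p² − p and (p − 1/2)², so the first vanishes only at p = 0 and p = 1 and the second only at p = 1/2. Neither vanishes on all of [0, 1].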

Thanks for making your position clear. Your remarks have led me to realize a shortcoming of the article. Yet this shortcoming is not exactly what you are arguing.
I still have the same stand as before. Given a parameter theta (a fixed number), and an estimator theta-hat, the bias of the estimator is the expected value of this estimator, minus the value of theta. This statement is ALL that is needed. The problem you allude to, in some regard, is NOT with this statement.
The problem is with the definition of "estimator." Specifically, it was stated in some fashion above (I may have done this, I can't recall) that an "estimator" was a function mapping samples (or outcomes of a random variable) to numbers. For example, if the sample was s, then the estimator theta-hat mapped s to the estimate, theta-hat of s. Or, if the sample space (or set of outcomes) was two points {a,b}, and the estimator was g, defined by g(a) = 0 and g(b) = 1, then the estimator g mapped the samples s to the estimate-space {0,1}.
I now realize a flaw, which your remarks helped me see. I left out the information on the PROBABILITY distributions. An "estimator" is really a multidimensional function mapping a "sample-design" ( S , Pr ) to a set of estimates, where S is a set of outcomes {s}, and Pr is a map of S to the interval [0,1], such that the sum or integral of Pr over S is 1. So if I assume this definition of "estimator", then for your example above, there is a SEPARATE estimator g(p) for EACH value of p. So when I speak of the bias of the estimator, I'm speaking of the bias using one function g and ONE set of probabilities (defined by how you chose p).
Still, you should not say "the estimator is unbiased if its expected value is theta for all values of theta." If you were going to go this route, I'd think you would want to make some statement related to all sample designs, all outcome-probabilities, as well (such as all g and all p). If we use my tighter definition of estimator, then what you are referring to is actually a "class of estimators" (like y = x + b is a class of lines). I suppose we could define properties of such classes. But the MSE, variance, and bias are functions of (S,Prob,theta). Each p in your example yields its own parameter theta (in this case, p), expected value, bias, and variance. These latter three only relate to the given p and g. The summations do not involve other values of p, or other types of g.
Now I need to decide how to change the definition of "estimator."

—Preceding unsigned comment added by 146.142.66.80 (talk) 20:56, 10 March 2008 (UTC)

Is "The standard deviation of θ̂ is also called the standard error of θ̂" true? I would have guessed that the first was the square root of V(θ̂) and the second the square root of MSE(θ̂), which would be different if θ̂ is biased, but I am happy to be enlightened. --Henrygb 17:18, 5 Aug 2004 (UTC)

Nope, the standard error is the SD divided by the square root of N.



The definition of the MSE (MSE(θ̂) = E[(θ̂ − θ)²]) seems quite unclear: what does the second θ stand for?


I just wanted to mention that for an unbiased estimator, the MSE IS the variance. This is important and the article neglects this (though it is obvious from property 5) and indeed seems to imply the opposite in the section titled "Efficiency". Also, I don't know the protocol on one discussion referring to another, but regarding the post above this, the two thetas are the estimator and the value of the parameter. One is a statistic, the other is just a real number.
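The claim that the MSE equals the variance for an unbiased estimator follows from the standard decomposition of the MSE. A short derivation is sketched here, writing \hat\theta for the estimator (theta-hat), \theta for the fixed parameter value, V for the variance and B for the bias as elsewhere on this page, and assuming the second moment of \hat\theta exists:

```latex
\begin{align*}
\operatorname{MSE}(\hat\theta)
  &= E\big[(\hat\theta - \theta)^2\big] \\
  &= E\big[(\hat\theta - E[\hat\theta])^2\big]
     + 2\,\big(E[\hat\theta] - \theta\big)\,E\big[\hat\theta - E[\hat\theta]\big]
     + \big(E[\hat\theta] - \theta\big)^2 \\
  &= V(\hat\theta) + B(\hat\theta)^2 ,
\end{align*}
% The cross term vanishes because E[\hat\theta - E[\hat\theta]] = 0,
% and B(\hat\theta) = E[\hat\theta] - \theta is the bias.
```

So MSE and variance coincide exactly when B(\hat\theta) = 0, and differ by the squared bias otherwise.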

Cleanup is badly needed

I find it difficult to be patient with the person who thinks that being pointlessly abstract in such a way that one can understand the article only by paying close attention to details not relevant to the topic constitutes "rigor". Michael Hardy 00:26, 19 November 2005 (UTC)

Politeness & Rigor

Hi there,

One can give an intuitive definition of what an estimator is and one can work with estimators in most cases without knowing precisely what one is talking about. But I don't think that this practical approach should exclude a more complete one. It's not because you don't use the words "probability space" or "measure space" that you don't refer to them implicitly.

Statistics is both very practical and very abstract: as it deals with all sorts of real-life situations, it has to have the mathematical tools to do it well and, like it or not, these tools involve a lot of probability and measure theory. I'm not saying that one should include a whole course on measure theory in each statistics article (and I have a tendency to do that, I must admit). What I'm saying is that at least somewhere in the article (arguably not at the beginning), one should give a very clear and precise definition of the mathematical beings that we're dealing with. Your presentation is sufficient for most applications, but for someone needing "something more" (and I was such a person), it's not: I think it almost treacherous to give the illusion of simplicity: there's a reason we don't learn about estimators in high school...

Besides, you took out the sections on Bayes estimators and minimax estimators (admittedly, not written yet - but at least the name was there somewhere). You actually deleted the paragraph on the asymptotic value of an estimator, to which I refer in the article on robust statistics. Just because you don't like/understand something doesn't mean (a) that it's wrong and (b) that it doesn't exist. I totally agree that my presentation is probably not the best possible one, but simply annihilating my work is definitely not improving things and is closer to an act of vandalism than to a scientific approach. You say my way of writing is "absurd", which means it doesn't make any sense. A more constructive approach than simply hitting the delete key would be to point out the things that don't make any sense (to you): if I made a mistake, I'll be very thankful if someone (e.g. you) tells me. I don't consider saying that an estimator is a function and specifying the sets on which it operates instead of a hand-wavy explanation to be a mistake, by the way.

I understand that an encyclopedia is not the place for a full treatment of a subject, but I think it should be used as a reference and therefore have the exact definitions somewhere in its articles. My presentation was probably clumsy, but I'm confident the maths in it wasn't: why remove all my additions? I didn't remove anything you wrote... Tell me what you think.

Regards,

Deimos.

"One can give an intuitive definition of what an estimator is and one can work with estimators in most cases without knowing precisely what one is talking about."

You miss the point. It is nonsense to think that if someone does not know one particular way of formalizing something, then they don't know what they're talking about. Set theorists encode all of mathematics within set theory, but that doesn't mean a mathematician who does not know how an operator on a Hilbert space is encoded within ZFC "doesn't know precisely what he's talking about" or is not rigorous.

"one should give a very clear and precise definition of the mathematical beings that we're dealing with"

And that's exactly what you are not doing. Michael Hardy 21:44, 20 November 2005 (UTC)

PS: You are seriously deluded if you think "rigor" is what this is about. I'm changing the section heading.


Thank you for changing the title of my section: you could've also changed the message itself to set it more to the liking of Michael Hardy - oh sorry, you already did it... Politeness is also the theme so I changed the title of *my* message to what it is now... If you don't like what somebody is telling you, then don't read it: I find it highly dishonest to change the content of my message (even if it's only the title). If you feel like replying, create another message. My title was *not* "Deimos' editing style": had I wanted it to be, I think I could've found the right keys to press myself. If you think my title isn't correct, say so. If I agree, I'll change it.

You might be able to "encode" the whole of mathematics using set theory (although I don't quite see what that would look like nor how one would go about it), but usually, when dealing with an arithmetic concept, you use the formalism of arithmetic, when dealing with an algebraic one, that of algebra and so forth. What you might be trying to say is that sometimes, when you use a geometrical notion (for example) in, say, statistics, you might adapt the notations slightly to be coherent with the rest of the statistics world. But in statistics itself, we deal with measure spaces, samples, etc. and all the lecturers I have encountered use the same definitions.

I know that stuff about estimators already: you're not doing me a favour by accepting the changes, I'm doing *you* a favour in giving you the commonly accepted definition. Delete my post if you like - it's no big deal for me. Besides, it'll still be in the database somewhere anyway. I first thought you might have a point, but as I still don't see it, I'm starting to lose hope.

Deimos.

Merge with Estimation theory

See Talk:Estimation theory. Cburnett 18:28, 9 February 2006 (UTC)


Statistics versus signal processing

The big problem with Estimation theory is that it is very much focussed on Estimation Theory as it is understood in engineering, esp. Signal Processing. There is also a mathematical science called Statistics which treats Estimation (and hence Estimators), Testing (and hence Statistical Tests), and so on. In principle Statistics is applicable in medicine, biology, physics, social science, economics, ... engineering ... law, sport, consumer studies ... . The page on Estimator about which there is discussion above is an example of the topic seen from Statistics. Obviously people from engineering will hardly recognise that it's all, in principle, about the same thing, and vice versa.

The subject of Estimation Theory is: construction, design, evaluation of Estimators! So one hardly needs two different pages with those two titles. I suppose that Interval Estimation is also part of estimation theory, while presently it is only treated under Estimators and not under Estimation !!!

I think there should be a general page on Estimation Theory with subtopics on Estimation theory in engineering etc., as far as these subfields cannot identify themselves with the broad topic. So I agree there should be a merge but then there must be a subtopic on Estimation in Engineering, esp. Signal Processing. Gill110951 08:13, 10 December 2006 (UTC)