Talk:Linear discriminant analysis

This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

This article is within the scope of the Business and Economics WikiProject.
This article has not yet received a rating on the assessment scale.
This article has not yet received an importance rating on the assessment scale.

---
Note: The use of discriminant functions is not a type of generative model, as this article suggests. Probabilities do not always play a role in linear discriminant analysis, especially when using least squares. I suggest that the author of this article not refer to LDA as a generative model.
Elkhiyarih 17:57, 15 November 2007 (UTC)
---

Changed 'machine learning' to statistics -- FLD was invented and used by statisticians a long time before all that ML nonsense!

---

That's true, but the wording said that it's currently used in (rather than was developed in) the area called machine learning, so it was not an incorrect statement. Not that I'm particularly bothered by the change, but a reader looking for related techniques would be better served by being referred to machine learning than to statistics in general.

BTW: I notice two references by H. Abdi have been added by user 129.110.8.39. Looking at this user's other edits, it seems that a lot of other statistics-based articles have been edited to refer to these references, leading me to believe that this is the author trying to publicise his/her books. Is there a Wikipedia policy on this situation? My gut reaction would be to remove all of the references he added.

--Tcooke 02:30, 13 October 2006 (UTC)

---

A few questions I had while learning about this technique that could be addressed here:

  • What is the significance of the word discriminant in this technique?
  • What about this technique is linear?

The problem to be solved is the discrimination between two classes of objects/events based on a number of measurements. The discriminant is a single variable which tries to capture all of the discriminating ability of these measurements. In this case, the discriminant function is a linear combination of the measurements.

--Tcooke 12:49, 22 July 2005 (UTC)


Implementation Details

I recently implemented Fisher's linear discriminant and found that internet resources (including Wikipedia) were lacking in two respects:

  • finding the threshold value c
  • finding the sign of \vec{w}

Most of the examples that I saw assumed that the data was centered about the origin and therefore required a zero threshold.

My solution for finding c was to naively search for the best value for my training set. I'm sure that this approach does not give the best generalization - I would guess calculating the maximal margin would be better.
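A minimal sketch of such a naive search, assuming the training points have already been projected onto \vec{w} to give one score per point and that class 1 is predicted when the score exceeds c. The function name `best_threshold` and the choice of candidate thresholds (midpoints between neighbouring scores) are illustrative assumptions, not something taken from the article.

```python
def best_threshold(scores, labels):
    """Return (c, errors) minimising training errors for the rule
    'predict class 1 when score > c'. Hypothetical helper, brute force."""
    pairs = sorted(zip(scores, labels))
    values = [s for s, _ in pairs]
    # Candidate thresholds: one below all scores, the midpoints between
    # consecutive distinct scores, and one above all scores.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:]) if a != b]
    candidates.append(values[-1] + 1.0)

    best_c, best_err = None, None
    for c in candidates:
        # Count training points whose predicted class differs from the label.
        err = sum((s > c) != bool(y) for s, y in pairs)
        if best_err is None or err < best_err:
            best_c, best_err = c, err
    return best_c, best_err
```

This only minimises the error on the training set, so the caveat about generalization above still applies.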

With regard to the sign:

S=\frac{\sigma_{between}^2}{\sigma_{within}^2}= \frac{(\vec w \cdot \vec \mu_{y=1} - \vec w \cdot \vec \mu_{y=0})^2}{\vec w^T \Sigma_{y=1} \vec w + \vec w^T \Sigma_{y=0} \vec w} = \frac{(\vec w \cdot (\vec \mu_{y=1} - \vec \mu_{y=0}))^2}{\vec w^T (\Sigma_{y=0}+\Sigma_{y=1}) \vec w}

does not contain any information about the direction of the separator. What is the best way to find the direction when using this formulation?
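One possible answer, offered as a sketch rather than a definitive recipe: S is unchanged when \vec{w} is replaced by -\vec{w}, but the standard closed-form maximiser \vec{w} \propto (\Sigma_{y=0}+\Sigma_{y=1})^{-1}(\vec \mu_{y=1} - \vec \mu_{y=0}) does come with a sign. If the summed covariance is positive definite and the means differ, this \vec{w} satisfies \vec w \cdot (\vec \mu_{y=1} - \vec \mu_{y=0}) > 0, so class 1 projects higher on average. A \vec{w} obtained some other way (for example from an eigen-solver) can be flipped to meet the same convention. The function names and the use of NumPy below are assumptions made for illustration.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Return (w, mu0, mu1) with w = (S0 + S1)^{-1} (mu1 - mu0).

    X0, X1: arrays of shape (n_samples, n_features), one per class.
    Assumes at least two samples per class and a non-singular S0 + S1.
    """
    X0 = np.asarray(X0, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Sum of the two sample class covariance matrices.
    S = (np.atleast_2d(np.cov(X0, rowvar=False))
         + np.atleast_2d(np.cov(X1, rowvar=False)))
    w = np.linalg.solve(S, mu1 - mu0)
    return w, mu0, mu1

def orient(w, mu0, mu1):
    """Flip w if needed so that class 1 projects higher than class 0."""
    return w if np.dot(w, mu1 - mu0) >= 0 else -w
```

With this orientation the decision rule reads "class 1 when \vec w \cdot \vec x > c"; the opposite convention simply negates both \vec{w} and c.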

Are implementation details for algorithms relevant to Wikipedia articles? If so, I'm sure a short note on the page would add to its usefulness.

128.139.226.34 06:58, 7 June 2007 (UTC)

LDA for two classes

This is very well written. However, a little more definition of Σ and Σ^{-1} might be nice. I realize they are mentioned as the "class covariances" but a formula or a ref would be great.
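For reference, a sketch of what such a formula could look like, using the usual sample estimates. The symbols n_k (the number of training samples in class k) and \vec x_i (the i-th training sample) are introduced here for illustration, and the 1/(n_k - 1) normalisation is one common convention (1/n_k is the other); neither is taken from the article.

```latex
\vec \mu_{y=k} = \frac{1}{n_k} \sum_{i : y_i = k} \vec x_i ,
\qquad
\Sigma_{y=k} = \frac{1}{n_k - 1} \sum_{i : y_i = k}
    (\vec x_i - \vec \mu_{y=k}) (\vec x_i - \vec \mu_{y=k})^T
```

Here \Sigma^{-1} denotes the ordinary matrix inverse of \Sigma, which exists only when \Sigma is non-singular.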

Also, the problem is stated as "find a good predictor for the class y .. given only an observation x." However, then the result is an enormous formula (QDA) or the simpler LDA. It would be nice to state the decision criterion in the original problem terms.

That is, the next-to-last sentence would be (I think!) something like: a sample x is from class 1 if p(x|y=1) or w * x < c. Maybe I'm wrong, but using the language of the original problem would be good.

dfrankow (talk) 21:49, 27 February 2008 (UTC)