Talk:Stemming


This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.

This should not redirect. This should be the article. User:Jfroelich

The section on matching algorithms is unreadable. -- Preceding unsigned comment added by 129.170.66.210 (talk) 20:24, 7 September 2007 (UTC)

Agreed, the section on matching algorithms is unreadable. I'd edit it, but it isn't clear what the point of the section is.
I've cleaned it up. However, I'm not an expert in that field. I did my best to figure out what was really meant in that paragraph and rewrite it using good English, style, explanations and examples, but it would be nice if someone who really knows something about matching algorithms reviewed this section. 89.138.151.18 (talk) 10:34, 27 May 2008 (UTC)

A few other points, if someone has the time to edit:

1) There appears to be no mention of 'suffix substitution'. This is common in most stemmers that also do suffix stripping: in English, for example, the suffix -ies is replaced with -y (so that ladies becomes lady). The method can be extended to map irregular verb forms to their base form, such as ran to run.

2) The list of languages is a bit pointless; many commercial products offer stemming for dozens of languages, e.g. Verity and dtSearch.

3) Google is a rather poor commercial example, I think; many other commercial companies have been using stemming techniques for decades. Ray3055 20:27, 27 September 2007 (UTC)
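
To make point 1 concrete, here is a minimal Python sketch of suffix substitution combined with an irregular-form lookup. The rule table, the lookup dictionary and the function name `stem` are hypothetical and chosen only for illustration; they are not taken from the article or from any particular stemmer, and a real rule set would be much larger.

```python
# Hypothetical rule table: (suffix, replacement), tried in order.
# Longer suffixes come first so that "-ies" is matched before "-s".
SUFFIX_RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "y"),    # ladies   -> lady (substitution, not plain stripping)
    ("ss", "ss"),    # glass    -> glass (protects "-ss" from the "-s" rule)
    ("s", ""),       # cats     -> cat (plain suffix stripping)
]

# Hypothetical lookup table for irregular forms that no suffix rule covers.
IRREGULAR_FORMS = {
    "ran": "run",
    "went": "go",
}


def stem(word: str) -> str:
    """Return a stem using irregular lookup first, then suffix rules."""
    w = word.lower()
    if w in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[w]
    for suffix, replacement in SUFFIX_RULES:
        # Require at least one character to remain before the suffix.
        if w.endswith(suffix) and len(w) > len(suffix):
            return w[: -len(suffix)] + replacement
    return w


if __name__ == "__main__":
    for example in ["ladies", "lady", "cats", "glass", "ran"]:
        print(example, "->", stem(example))
```

Note that a rule as blunt as -ies to -y will over-stem some words, which is why real stemmers add conditions on the remaining stem.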

"On the other hand, stemmers for true isolating languages such as Vietnamese can be even simpler than those for English." I removed this sentence: Vietnamese has no verb inflection and no noun declension, so stemming does not apply. The link to the Vietnamese Wikipedia article is also confusing: it uses the phrase "...is an analytic (or isolating) language", implying that the two terms mean the same thing, but see the two separate articles. Ray3055 12:03, 14 November 2007 (UTC)

I changed the number "one" to "two" under "Hybrid approaches". This is my first edit in Wikipedia ever. I welcome any comments that may help me be a better editor. Lon of Oakdale (talk) 18:05, 14 April 2008 (UTC)