Talk:Natural language processing

From Wikipedia, the free encyclopedia

Merge

I think that this article should probably be merged with Computational linguistics, but I'm fairly new to Wikipedia, so I'm not sure.

Lambda 22:55, 22 Feb 2004 (UTC)

While they're related, they're not really the same thing. Computational linguistics tries to use computer techniques to better understand linguistics as a discipline, while NLP tries to build ways for a computer to understand language. Obviously many things overlap, but they have quite different focuses: NLP doesn't explicitly care if it's making new contributions to linguistics, and computational linguistics doesn't explicitly care if it's making it easier for computers to understand natural languages. --Delirium 22:58, Feb 22, 2004 (UTC)

My take on this (I'm a grad student studying NLP/CL) is that CL and NLP are the endpoints on a continuum, and so a lot of work in the middle is hard to classify as one or the other. They don't have separate conferences: the Association for Computational Linguistics (annual) and Computational Linguistics (biennial) are the main conferences for both NLP and CL research. 24.59.194.44 13:26, 23 June 2006 (UTC)

I agree -- we should merge. Whether you call it NLP or CL is mostly a question of what aspect you stress. In addition, my impression is that the NLP tendency is currently stronger than the CL tendency in the field. Articles in the Computational Linguistics journal, and at the Coling and ACL conferences, are judged on whether they are useful rather than on whether they give any insight into how humans process language. Kallerdis (talk) 19:35, 29 February 2008 (UTC)

Content from The Natural Language Processing

I append the content from that page, in case anyone wants to merge it in here.

Charles Matthews 09:35, 6 May 2004 (UTC)

The Natural Language Processing

Natural Language Processing (NLP) is a subfield of artificial intelligence and linguistics. It deals with the problems inherent in processing and manipulating natural language.

Some examples of the major tasks in Natural Language Processing are:

  • Text to speech
  • Speech recognition
  • Natural language generation
  • Machine translation
  • Question answering
  • Information retrieval
  • Information extraction
  • Text-proofing

Some major challenges in NLP are:

Word boundary detection

In spoken language, there are usually no gaps between words; where to place a word boundary often depends on which choice makes the most sense grammatically and in context.
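
To illustrate why word boundaries can be ambiguous, here is a minimal sketch that enumerates every way to split an unspaced string into known words. The function name `segment` and the tiny vocabulary are hypothetical, chosen only for this example; a real system would also need a grammar or a statistical model to rank the candidates, which this sketch does not attempt.

```python
# Hypothetical toy vocabulary, for illustration only.
VOCABULARY = {"no", "now", "here", "where", "nowhere"}


def segment(text, vocabulary):
    """Return every way to split `text` into words from `vocabulary`."""
    # splits[i] holds all segmentations of the prefix text[:i].
    splits = [[] for _ in range(len(text) + 1)]
    splits[0] = [[]]  # the empty prefix has exactly one (empty) segmentation
    for end in range(1, len(text) + 1):
        for start in range(end):
            word = text[start:end]
            if word in vocabulary:
                # Extend every segmentation of text[:start] with this word.
                for prefix in splits[start]:
                    splits[end].append(prefix + [word])
    return splits[len(text)]


for candidate in segment("nowhere", VOCABULARY):
    print(" ".join(candidate))
```

With this vocabulary the string "nowhere" has three candidate readings ("nowhere", "no where", "now here"); picking among them is the part that needs grammatical and contextual knowledge.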


Word sense disambiguation

Many words have more than one meaning, so we have to select the meaning that makes the most sense in the given context.
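
As a rough illustration of choosing a sense from context, the sketch below picks the sense whose dictionary gloss shares the most words with the surrounding sentence, in the spirit of a simplified Lesk-style overlap. The sense labels, glosses, stopword list, and the name `disambiguate` are all made up for this example and are not taken from any real lexicon.

```python
# Hypothetical sense inventory and stopword list, for illustration only.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river edge": "the sloping land beside a river or stream",
    }
}
STOPWORDS = {"the", "a", "an", "of", "or", "and", "that", "to", "on"}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context, senses=SENSES):
    """Return the sense label whose gloss overlaps most with the context."""
    context_set = content_words(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context_set & content_words(gloss))
        if overlap > best_overlap:  # ties keep the earlier-listed sense
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "she sat on the bank of the river"))
print(disambiguate("bank", "he went to the bank to deposit money"))
```

Exact word overlap is a very weak signal (here "deposit" does not even match "deposits"), so this only shows the idea of using context, not a practical method.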


Syntactic ambiguity

The grammar of natural languages is ambiguous: a given sentence often has more than one possible parse. Selecting the most appropriate one requires semantic and contextual information.


Speech acts and plans

Sometimes what we write doesn't mean literally what is written; for instance, a good response to "Can you give me the pencil?" is to hand over the pencil. In most contexts "Yes" is not the best answer, and when the answer is no, it is better to say "I'm afraid that I can't see it" than a literal "No".


Question edited into the article by User:129.27.236.115:

The Morphix-NLP link is not valid anymore. Does anybody know where to get Morphix-NLP?

Cadr

It is now. Yaron 22:40, May 17, 2004 (UTC)

Remove external link

Removed a spam link (several times) to a website called ivrdictionary. This is a thinly veiled attempt to put advertising on Wikipedia. Links were added by several anonymous users within a tight IP range. The website purports to list IVR terminology, but in reality it prominently displays an advertisement for Angel dot com, which is a commercial company that sells IVR-related products. The same links were added to other articles that are related to IVR technology. Calltech 16:59, 17 November 2006 (UTC)


Incorporate stemming?

I suggest adding a link to stemming in the see also, subtasks, or challenges section. I am not sure who is responsible for editing this article though, and I don't want to edit it myself without asking. Is stemming too detailed, or only a subtask of another subtask such as IR? Not sure. I thought it was a pretty popular problem. Josh Froelich 19:46, 13 December 2006 (UTC)

External links

I think everyone would agree the external links section is a complete mess and full of spam, vanity links, and other links that don't add anything to the article. I count 47 external links. I'm sure there is someone out there who supports each one, but I think we all can agree that 47 is too many and there is certainly some redundancy.

I know it can be hard to part with large chunks of an article, but I propose the following: we assume that we are going to delete all of them and anyone who wants a link kept should nominate it here on the talk page. We can then discuss whether it actually adds something unique. Please keep in mind WP:EL, also.

--Selket 22:50, 1 February 2007 (UTC)

The Implementations links seem alright. However, there are far too many R&D group links. Unfortunately, each group would want their own link up there. Also, there were a few links to blogs. Am I right in believing that those links should be deleted?

Ummonk 22:06, 4 February 2007 (UTC)

I think the Implementations links should be removed per WP:EL, WP:SPAM, and WP:NOT#LINK --Ronz 03:06, 25 October 2007 (UTC)

Maximum entropy methods

My vague understanding is that maximum entropy methods represent the state of the art in NLP these days; yet this article seems to fail to mention them. Could an expert clarify/elucidate? linas 13:17, 13 June 2007 (UTC)

If an article is lacking a notable subject, it's usually the case that nobody got around to adding it. Please be bold and add a review of maxent NLP stuff to the article as you see fit, remembering to cite your sources. –jonsafari 20:47, 14 June 2007 (UTC)

In most subareas of current NLP, machine learning is at the core of most implementations. It's true that Maxent (or logistic regression, as it's also known) and its generalizations (e.g. Conditional random fields) usually perform well for these tasks, but they are not the only methods. I'd say that margin-based methods such as Support Vector Machines are at least as popular. Anyway, it's more important to expand the section about machine learning/statistical modeling than to just add a section about Maxent. Kallerdis (talk) 19:43, 29 February 2008 (UTC)
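
For readers unfamiliar with the "Maxent is logistic regression" remark, a minimal sketch of how such a model turns feature weights into label probabilities may help. It assumes binary indicator features and already-trained weights; the feature name, labels, weight values, and the function name `maxent_probabilities` are all invented for illustration, and training is not shown.

```python
import math

# Hypothetical, hand-picked weights; a real model would learn these from data.
EXAMPLE_WEIGHTS = {
    "NOUN": {"suffix=ing": 0.0},
    "VERB": {"suffix=ing": math.log(3)},
}


def maxent_probabilities(weights, active_features):
    """Return p(label | features) proportional to exp(sum of active weights)."""
    scores = {
        label: sum(label_weights.get(f, 0.0) for f in active_features)
        for label, label_weights in weights.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exp_scores.values())
    return {label: e / total for label, e in exp_scores.items()}


print(maxent_probabilities(EXAMPLE_WEIGHTS, ["suffix=ing"]))
```

The normalized exponential here is the same form as multinomial logistic regression, which is why the two names refer to the same model family.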

Human Language Technology

Does anyone feel it necessary to distinguish between NLP and HLT? If so, please visit that article—it desperately needs work. On the other hand, perhaps it should simply redirect here to the NLP article. —johndburger 02:47, 22 June 2007 (UTC)

Papers

The following were added to the External links section. Perhaps one or more might be used as a reference someday?

  • Goutam Kumar Saha, English to Bangla Translator: The BANGANUBAD, International Journal -CPOL, Vol.18(4), pp.281-290, December 2005, WSPC, USA.
  • Goutam Kumar Saha, Parsing Bengali Text - an Intelligent Approach, ACM Ubiquity, Vol. 7 Issue 13, April, 2006. ACM Press, USA.
  • Goutam Kumar Saha, The EB-ANUBAD Translator: A Hybrid Scheme, International Journal ZUS, Vol. 6A(10), ZUS Press, 2005.
  • Goutam Kumar Saha, A Novel 3-Tier XML Schematic Approach for Web Page Translation, ACM Ubiquity, Vol. 6(43), ACM Press, 2005, USA.

--Ronz 17:36, 14 November 2007 (UTC)

Add confusion about accenting words?

I was going to add this in, but I thought it might not be a good idea. If you guys can incorporate it well and fit it in, please do: (I was going to put it after the 'I never said she stole my money' part.) Accenting words can be very helpful in giving meaning to a sentence that contains negatives, because the speaker is saying that one specific fact is not true, and usually implying that something else, left unspecified, is. Sometimes accenting words in a sentence can still lead to confusion, as in "Go over there", because "over" is being used to describe the relative position of the destination, but when taken by itself, "over" means on top of something. The accent in this case implies a literal meaning of the word...

24.250.97.223 (talk) 04:56, 14 December 2007 (UTC)