Grammar checker

From Wikipedia, the free encyclopedia

In computing, a grammar checker is a program, or a feature of a program, that attempts to verify the grammatical correctness of written text. Like a spelling checker, a grammar checker may be implemented as a stand-alone application that operates on a block of text. Most often, however, grammar checkers are implemented as a feature of a larger application, such as a word processor. At the time of writing, there are no known instances of a grammar checker in an email client, electronic dictionary or search engine.

Unlike spell checking, which works by comparing each word in a document against a list of correctly spelled words, true grammar checking requires much more complex analysis. Full-fledged grammar checkers use natural language processing, a branch of artificial intelligence.

History

The earliest “grammar checkers” were really programs that checked for punctuation and style problems, rather than finding many actual grammatical errors. The first such system, Writer’s Workbench, was a set of writing tools included with Unix systems as far back as the 1970s. The package included several separate tools, each checking for a different kind of writing problem. The ‘diction’ tool checked for wordy, trite, clichéd or misused phrases in a text; it output a list of suspect phrases and suggested improvements. The ‘style’ tool analyzed the writing style of a given text: it ran a number of readability tests, reported their results, and gave some statistical information about the sentences of the text.

Aspen Software of Albuquerque, NM, released the earliest version of a diction and style checker for personal computers, Grammatik, in 1981. Grammatik was first available for a Radio Shack TRS-80, and soon had versions for CP/M and the IBM PC. Reference Software of San Francisco, CA, acquired Grammatik in 1985. Development of Grammatik continued, and it became a true grammar checker that could detect writing errors beyond simple style checking.

Other early diction and style checking programs included Punctuation & Style and RightWriter. While all the earliest programs started out as simple diction and style checkers, all eventually added various levels of language processing, and developed some level of true grammar checking capability.

Until 1992, grammar checkers were sold as add-on programs. A large number of different word processing programs were still available at that time, with WordPerfect and Microsoft Word the top two in market share. In 1992, Microsoft decided to add grammar checking as a feature of Word. Microsoft licensed a grammar checker from a company that had not yet marketed it as a standalone product[citation needed]. WordPerfect answered Microsoft’s move by acquiring Reference Software, and the direct descendant of Grammatik is still included with WordPerfect.

Microsoft’s decision to integrate grammar checking proved disastrous for grammar checking software in general. From 1985 until 1992, several companies with grammar checking software were making great improvements in ability and accuracy from year to year. Microsoft’s move essentially put an end to further development. As late as 2006, the grammar checking capabilities of Microsoft Word and WordPerfect were not significantly different from what was available in 1992.

Because of Microsoft’s dominant position, it has been unrealistic for other companies to put resources into further development of grammar checkers. There are, however, a couple of open-source software projects developing grammar checking technology, including AbiWord and LanguageTool (associated with OpenOffice). There have been no published studies comparing these tools with commercial grammar checkers.

Technical Issues

The earliest writing style programs checked for wordy, trite, clichéd or misused phrases in a text. This process was based on simple pattern matching. The heart of the program was a list of many hundreds or thousands of phrases that many experts consider poor writing, with alternate wording for each phrase. The checking program would simply break a text into sentences, look for matches in the phrase dictionary, flag each suspect phrase, and show an alternative.
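The phrase-matching approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-entry phrase list, the naive sentence splitter, and the names `SUSPECT_PHRASES`, `split_sentences` and `check_phrases` are hypothetical and are not taken from Writer's Workbench, Grammatik or any other historical program.

```python
import re

# Hypothetical phrase dictionary: suspect phrase -> suggested alternative.
# A real style checker would carry hundreds or thousands of entries.
SUSPECT_PHRASES = {
    "at this point in time": "now",
    "in order to": "to",
    "due to the fact that": "because",
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def check_phrases(text):
    """Return (sentence, suspect phrase, suggestion) for every match found."""
    findings = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for phrase, suggestion in SUSPECT_PHRASES.items():
            # Word boundaries avoid matching inside longer words.
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                findings.append((sentence, phrase, suggestion))
    return findings

if __name__ == "__main__":
    sample = "We met in order to talk. At this point in time, nothing is decided."
    for sentence, phrase, suggestion in check_phrases(sample):
        print(f'"{phrase}" -> consider "{suggestion}" in: {sentence}')
```

Note that nothing in this sketch understands grammar: it only looks up fixed strings, which is why such tools are better described as style checkers.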

These programs could also perform some mechanical checks. For example, they would typically flag doubled words, doubled punctuation, some capitalization errors, and other simple mechanical mistakes.
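Mechanical checks of this kind need no linguistic knowledge and can be approximated with regular expressions. The following Python sketch is an assumption about how such checks might look, not a reconstruction of any particular program; the function name `mechanical_checks` and the three rules it applies are illustrative only.

```python
import re

def mechanical_checks(text):
    """Return a list of (kind, matched text) pairs for simple mechanical slips."""
    problems = []
    # Doubled words such as "the the" (compared case-insensitively).
    for m in re.finditer(r"\b(\w+)\s+\1\b", text, flags=re.IGNORECASE):
        problems.append(("doubled word", m.group(0)))
    # Doubled punctuation such as ",," or "!!". Periods are left out
    # so that an ellipsis ("...") is not flagged.
    for m in re.finditer(r"([,;:!?])\1+", text):
        problems.append(("doubled punctuation", m.group(0)))
    # A lowercase letter at the start of a sentence.
    for m in re.finditer(r"[.!?]\s+([a-z])", text):
        problems.append(("missing capital", m.group(1)))
    return problems

if __name__ == "__main__":
    for kind, found in mechanical_checks("This is the the end,, really. it was fine."):
        print(f"{kind}: {found!r}")
```

Rules like these produce false positives (an abbreviation such as "e.g." followed by a lowercase word, for instance), which is one reason real checkers present their findings as suggestions.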

True grammar checking is a much more difficult problem. While a computer programming language has a very specific syntax and grammar, this is not so for natural languages. While it is possible to write a somewhat complete formal grammar for a natural language, there are usually so many exceptions in real usage that a formal grammar is of minimal help in writing a computer program for a grammar checker.

One of the most important parts of a natural language grammar checker is a dictionary of all the words in the language, along with the part of speech of each word. The fact that a single word can belong to several different parts of speech greatly increases the complexity of any grammar checker.

A grammar checker will find each sentence in a text, look up each word in the dictionary, and then attempt to parse the sentence into a form that matches a grammar. Using various rules, the program can then detect errors in tense agreement, number agreement, word order, and so on.
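As a toy illustration of the lookup, parse, and rule-check sequence just described, the Python sketch below handles a single sentence pattern (determiner, noun, verb) with a five-word lexicon. Everything in it is an assumption made for the example: the lexicon, the tag names, the pattern and the function `check_sentence` are hypothetical, and a real checker would need a full dictionary and a far richer grammar. The word "bark" is listed as both a verb and a noun to show how part-of-speech ambiguity multiplies the readings that must be tried.

```python
from itertools import product

# Hypothetical miniature lexicon: word -> set of (part of speech, number).
LEXICON = {
    "the": {("DET", None)},
    "dog": {("NOUN", "sg")},
    "dogs": {("NOUN", "pl")},
    "barks": {("VERB", "sg")},
    "bark": {("VERB", "pl"), ("NOUN", "sg")},  # ambiguous word
}

# The only sentence shape this toy grammar accepts.
PATTERN = ("DET", "NOUN", "VERB")

def check_sentence(sentence):
    """Look up each word, try every reading, then test number agreement."""
    words = sentence.lower().rstrip(".!?").split()
    try:
        options = [LEXICON[w] for w in words]
    except KeyError as err:
        return f"unknown word: {err.args[0]}"
    # Keep only the combinations of readings that fit the grammar pattern.
    parses = [r for r in product(*options)
              if tuple(pos for pos, _ in r) == PATTERN]
    if not parses:
        return "no parse"
    for _, (_, noun_number), (_, verb_number) in parses:
        if noun_number == verb_number:
            return "ok"
    return "number disagreement"

if __name__ == "__main__":
    for s in ["The dog barks.", "The dogs barks.", "The dogs bark.", "Dog the barks."]:
        print(s, "->", check_sentence(s))
```

Trying every combination of readings is workable for three words but grows quickly with sentence length, which hints at why practical parsers are considerably more involved.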

It is also possible to detect some stylistic problems with the text. For example, heavy use of passive voice is generally not considered good writing style. After a sentence has been parsed, it is possible to detect passive voice and to present the user with a rewritten sentence as a better alternative.
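The text describes detecting passive voice from a parsed sentence. The Python sketch below deliberately substitutes a much cruder technique: a surface heuristic that flags a form of "to be" immediately followed by a likely past participle. The word lists and the function `looks_passive` are hypothetical, and the heuristic only shows the kind of signal a parser-based check would look for; it does not attempt the rewriting step.

```python
import re

# Forms of "to be" that can introduce a passive construction.
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

# Hypothetical short list of irregular past participles.
IRREGULAR_PARTICIPLES = {"written", "taken", "given", "made", "done", "seen"}

def looks_passive(sentence):
    """Heuristic: a 'be' form directly followed by a probable past participle."""
    words = re.findall(r"[a-z']+", sentence.lower())
    for previous, word in zip(words, words[1:]):
        if previous in BE_FORMS and (
            word.endswith("ed") or word in IRREGULAR_PARTICIPLES
        ):
            return True
    return False

if __name__ == "__main__":
    for s in ["The report was written by the committee.",
              "The committee wrote the report."]:
        print(s, "->", "passive?" if looks_passive(s) else "active")
```

Without a parse, this heuristic cannot tell a true passive from an adjective ("He was tired") and misses passives with an intervening adverb ("was quickly written"), which illustrates why the parsing step matters.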

The software elements required for grammar checking are closely related to some of the problems that need to be solved for voice recognition software. In voice recognition, parsing can be used to help predict which word is most likely correct based on part of speech and position in the sentence. In grammar checking, the parsing is used to detect words that fail to follow proper grammar.
