Text simplification

Text simplification is an operation used in natural language processing to modify, enhance, classify or otherwise process an existing corpus of human-readable text so that the grammar and structure of the prose are greatly simplified while the underlying meaning and information remain the same. Text simplification is an important area of research because natural human languages ordinarily contain complex compound constructions that are not easily processed automatically. To reduce linguistic diversity, semantic compression can be employed to limit and simplify the set of words used in a given text.
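
As a rough illustration of limiting the vocabulary of a text, the following minimal Python sketch replaces specific words with more general ones. The `GENERALIZATIONS` table and the `compress_vocabulary` function are hypothetical names invented for this illustration; an actual semantic compression system would typically derive such a mapping from a lexical resource and word frequencies rather than from a hand-written dictionary.

```python
import re

# Hypothetical, hand-written mapping from specific words to more general ones.
# This table is an assumption made for illustration only.
GENERALIZATIONS = {
    "sparrow": "bird",
    "oak": "tree",
    "sedan": "car",
}


def compress_vocabulary(text, mapping=GENERALIZATIONS):
    """Replace each word found in `mapping` with its more general term."""

    def replace(match):
        word = match.group(0)
        general = mapping.get(word.lower())
        if general is None:
            return word  # word is not in the table: keep it unchanged
        # Preserve a leading capital letter, e.g. at the start of a sentence.
        return general.capitalize() if word[0].isupper() else general

    # Visit every alphabetic token and leave punctuation and spacing untouched.
    return re.sub(r"[A-Za-z]+", replace, text)


print(compress_vocabulary("A sparrow sat on the oak."))
```

The sketch only shrinks the set of distinct words; it does not change sentence structure, which is the concern of the syntactic example in the next section.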

Example

Text simplification is illustrated with an example from Siddharthan (2006). The first sentence below contains two relative clauses and one conjoined verb phrase. A text simplification system aims to rewrite it as the sequence of shorter sentences shown in the second item.

“The ability to simplify means to eliminate the unnecessary so that the necessary may speak”

  • Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold.
  • Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today.
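
One step of the rewrite above, detaching the trailing "which" clause into its own sentence, can be sketched in Python. This is a toy pattern-based illustration, not the method of Siddharthan (2006): the function name `split_trailing_which_clause` is hypothetical, the referring phrase ("The Chicago report") is supplied by the caller rather than resolved automatically, and the remaining relative clause and the conjoined verb phrase are left untouched. A real system would rely on syntactic analysis instead of a regular expression.

```python
import re


def split_trailing_which_clause(sentence, referent):
    """Split 'MAIN, which CLAUSE.' into ['MAIN.', 'REFERENT CLAUSE.'].

    `referent` is the noun phrase that should stand in for "which";
    choosing it is left to the caller (an assumption of this sketch).
    """
    text = sentence.strip()
    match = re.match(r"^(?P<main>.+), which (?P<clause>.+?)\.?$", text)
    if match is None:
        return [text]  # no trailing ", which" clause: return unchanged
    return [
        match.group("main") + ".",
        f"{referent} {match.group('clause')}.",
    ]


original = (
    "Also contributing to the firmness in copper, the analyst noted, "
    "was a report by Chicago purchasing agents, which precedes the full "
    "purchasing agents report that is due out today and gives an "
    "indication of what the full report might hold."
)

for part in split_trailing_which_clause(original, "The Chicago report"):
    print(part)
```

Producing the fully simplified version would additionally require splitting the conjoined verb phrase ("precedes ... and gives ...") and the embedded clause "that is due out today", which simple pattern matching does not handle reliably.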

References

  • Advaith Siddharthan. "Syntactic Simplification and Text Cohesion". Research on Language and Computation, Volume 4, Issue 1, June 2006, pages 77–109. Springer Science, the Netherlands.
  • Siddhartha Jonnalagadda, Luis Tari, Joerg Hakenberg, Chitta Baral and Graciela Gonzalez. "Towards Effective Sentence Simplification for Automatic Processing of Biomedical Text". In Proceedings of NAACL-HLT 2009, Boulder, USA, June 2009.
This article is issued from Wikipedia. The text is available under the Creative Commons Attribution-ShareAlike license; additional terms may apply to the media files.