Example-based machine translation
From Wikipedia, the free encyclopedia
Example-based machine translation (EBMT) is an approach to machine translation characterized by its use, at run time, of a bilingual corpus of parallel texts as its main knowledge base. It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach to machine learning.
At the foundation of example-based machine translation is the idea of translation by analogy. Applied to human translation, this idea rejects the view that people translate sentences by performing deep linguistic analysis. Instead, it rests on the belief that people translate by first decomposing a sentence into phrases, then translating these phrases, and finally composing the translated fragments into one sentence. The phrases themselves are translated by analogy to previous translations. Example-based machine translation encodes this principle through the example translations used to train the system.
English | Japanese |
---|---|
How much is that red umbrella? | Ano akai kasa wa ikura desu ka. |
How much is that small camera? | Ano chiisai kamera wa ikura desu ka. |
Example-based machine translation systems are trained from bilingual parallel corpora, which contain sentence pairs like those shown in the table. Each pair consists of a sentence in one language and its translation into another. This particular example is a minimal pair: the two sentences differ in just one element. Such pairs make it simple to learn translations of subsentential units. From the table, an example-based machine translation system would learn three units of translation:
- How much is that X ? corresponds to Ano X wa ikura desu ka.
- red umbrella corresponds to akai kasa
- small camera corresponds to chiisai kamera
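The way a minimal pair yields these three units can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not how any particular EBMT system works: it assumes the two sentence pairs differ in exactly one contiguous span on each side, and that the differing source span translates to the differing target span. The names `tokenize`, `split_on_difference` and `learn_units` are hypothetical helpers introduced for this sketch.

```python
import re


def tokenize(sentence):
    # Split into words and standalone punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)


def split_on_difference(a, b):
    """Return (prefix, middle_a, middle_b, suffix) for two token lists.

    The prefix and suffix are the tokens shared at the start and end;
    the middles are whatever differs in between.
    """
    limit = min(len(a), len(b))
    p = 0
    while p < limit and a[p] == b[p]:
        p += 1
    s = 0
    while s < limit - p and a[-1 - s] == b[-1 - s]:
        s += 1
    return a[:p], a[p:len(a) - s], b[p:len(b) - s], a[len(a) - s:]


def learn_units(pair_1, pair_2, slot="X"):
    """Derive one template and two fragment pairs from a minimal pair.

    Assumption: the single differing span on the source side corresponds
    to the single differing span on the target side.
    """
    (src_1, tgt_1), (src_2, tgt_2) = pair_1, pair_2
    s_pre, s_mid_1, s_mid_2, s_suf = split_on_difference(
        tokenize(src_1), tokenize(src_2))
    t_pre, t_mid_1, t_mid_2, t_suf = split_on_difference(
        tokenize(tgt_1), tokenize(tgt_2))
    template = (" ".join(s_pre + [slot] + s_suf),
                " ".join(t_pre + [slot] + t_suf))
    fragments = [(" ".join(s_mid_1), " ".join(t_mid_1)),
                 (" ".join(s_mid_2), " ".join(t_mid_2))]
    return template, fragments


template, fragments = learn_units(
    ("How much is that red umbrella?", "Ano akai kasa wa ikura desu ka."),
    ("How much is that small camera?", "Ano chiisai kamera wa ikura desu ka."),
)
print(template)
print(fragments)
```

Because the sketch joins tokens with spaces, punctuation appears as a separate token in the template. Real corpora rarely consist of clean minimal pairs, so practical systems need a more robust alignment step than this prefix and suffix comparison.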
These units can then be composed to produce novel translations. For example, suppose the system has been trained on text containing these two sentences:
- President Kennedy was shot dead during the parade.
- The convict escaped on July 15th.
It could then translate the sentence "The convict was shot dead during the parade." by substituting the appropriate parts of the two examples.
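The matching step behind this substitution can be sketched as follows. The sketch covers only the source side, because the example gives no translations for the two training sentences; a full system would then look up the stored translation of each matched fragment and recombine them. It uses a simple greedy longest-match strategy, which is an assumption made for illustration and is not guaranteed to find the best decomposition. The function names `contains` and `cover` are hypothetical.

```python
def contains(example, fragment):
    """Return True if fragment occurs as a contiguous run inside example."""
    n = len(fragment)
    return any(example[k:k + n] == fragment
               for k in range(len(example) - n + 1))


def cover(sentence, examples):
    """Greedily cover the input, left to right, with the longest fragments
    found in the example sentences.

    Returns a list of (fragment, example_index) tuples.
    Whitespace tokenization and exact, case-sensitive matching are
    simplifying assumptions of this sketch.
    """
    tokens = sentence.split()
    example_tokens = [e.split() for e in examples]
    result = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest remaining span first, then shrink it.
        for j in range(len(tokens), i, -1):
            fragment = tokens[i:j]
            for index, example in enumerate(example_tokens):
                if contains(example, fragment):
                    match = (fragment, index)
                    break
            if match:
                break
        if match is None:
            raise ValueError(f"no example covers the token {tokens[i]!r}")
        result.append((" ".join(match[0]), match[1]))
        i += len(match[0])
    return result


examples = [
    "President Kennedy was shot dead during the parade.",
    "The convict escaped on July 15th.",
]
pieces = cover("The convict was shot dead during the parade.", examples)
print(pieces)
```

Here the input is covered by "The convict" from the second example and "was shot dead during the parade." from the first, which are the parts that would be substituted.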
Other approaches to machine translation, including statistical machine translation, also use bilingual corpora to learn the process of translation.
Example-based machine translation was first suggested by Makoto Nagao in 1984.[1] It soon attracted the attention of scientists in the field of natural language processing.
- [1] Makoto Nagao (1984). "A framework of a mechanical translation between Japanese and English by analogy principle". In A. Elithorn and R. Banerji (eds.), Artificial and Human Intelligence. Elsevier Science Publishers.