Computer-assisted translation
Computer-assisted translation, computer-aided translation, or CAT is a form of translation wherein a human translator translates texts using computer software designed to support and facilitate the translation process.
Computer-assisted translation is sometimes called machine-assisted, or machine-aided, translation.
Computer-assisted translation vs. machine translation
Although the two concepts are similar, computer-assisted translation should not be confused with machine translation (MT).
In computer-assisted translation, the computer program supports the translator, who translates the text and makes all the essential decisions involved. In machine translation, the roles are reversed: the computer program translates the text, which is then edited by a translator or not edited at all. Difficulties with such unedited output are described at machine translation.
Overview
Computer-assisted translation is a broad and imprecise term covering a range of tools, from the fairly simple to the more complicated. These can include:
- Spell checkers, either built into word processing software, or add-on programs;
- Grammar checkers, again either built into word processing software, or add-on programs;
- Terminology managers, which allow translators to manage their own terminology bank in electronic form. This can range from a simple table created in the translator's word processing software or spreadsheet, to a database created in a program such as FileMaker Pro or Alpha Five, to specialized software packages such as LogiTerm, MultiTerm or Termex for more robust (and more expensive) solutions.
- Dictionaries on CD-ROM, either unilingual or bilingual
- Terminology databases, either on CD-ROM or accessible through the Internet (such as The Open Terminology Forum, TERMIUM or the Grand dictionnaire terminologique from the Office québécois de la langue française)
- Full-text search tools (or indexers), which allow the user to query already translated texts or reference documents of various kinds. In the translation industry one finds such indexers as Naturel, ISYS Search Software and dtSearch.
- Concordancers, which are programs that retrieve instances of a word or an expression in a monolingual, bilingual or multilingual corpus.
- Bitexts, a fairly recent development, which result from merging a source text and its translation and can then be consulted using a full-text search tool.
- Translation memory managers (TMM), tools consisting of a database of text segments in a source language and their translations in one or more target languages.
- Systems that are nearly automatic as in machine translation, but allow user decisions for ambiguous cases. These are sometimes called human-aided machine translation.
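As an illustration of what a concordancer does, the following minimal Python sketch lists each occurrence of a word together with its surrounding context (a keyword-in-context view). The function name `concordance`, the `width` parameter and the sample corpus are assumptions made for this example and do not describe any particular product.

```python
import re


def concordance(corpus, term, width=30):
    """Return one keyword-in-context line per occurrence of `term`."""
    lines = []
    # Match the term as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    for text in corpus:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the hits line up in a column.
            lines.append(f"{left:>{width}}[{match.group()}]{right}")
    return lines


# Hypothetical two-sentence corpus, for illustration only.
corpus = [
    "The translator opens the translation memory.",
    "Each segment is stored in the memory after translation.",
]
for line in concordance(corpus, "memory"):
    print(line)
```

The same idea extends to a bilingual corpus by returning the aligned target sentence alongside each hit.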
Translation memory software
Translation memory (TM) programs store previously translated source texts and their equivalent target texts in a database and retrieve related segments during the translation of new texts.
Such programs split the source text into manageable units known as "segments." A source-text sentence or sentence-like unit (a heading, a title or an element in a list) may be considered a segment, or texts may be segmented into larger units such as paragraphs or smaller ones such as clauses. As the translator works through a document, the software displays each source segment in turn and, if it finds a matching source segment in its database, offers the previous translation for re-use. If it does not, the program allows the translator to enter a translation for the new segment. Once the translation of a segment is completed, the program stores the new translation and moves on to the next segment. The translation memory is, in principle, a simple database whose fields contain the source-language segment, the translation of the segment, and other information such as the segment creation date, last access date and translator name.
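The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design of any actual product: the names `Entry`, `TranslationMemory` and `split_segments`, the 0.75 similarity threshold, and the use of the standard-library `difflib.SequenceMatcher` for approximate matching are all choices made for this example.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from difflib import SequenceMatcher


@dataclass
class Entry:
    """One record: source segment, its translation, and some metadata."""
    source: str
    target: str
    translator: str = ""
    created: datetime = field(default_factory=datetime.now)


def split_segments(text):
    """Split text into sentence-like segments after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


class TranslationMemory:
    def __init__(self):
        self.entries = {}  # source segment -> Entry

    def store(self, source, target, translator=""):
        self.entries[source] = Entry(source, target, translator)

    def lookup(self, source, threshold=0.75):
        """Return (entry, score) for the most similar stored segment, or None."""
        best = None
        for entry in self.entries.values():
            score = SequenceMatcher(None, source, entry.source).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (entry, score)
        return best


tm = TranslationMemory()
tm.store("Press the red button.", "Appuyez sur le bouton rouge.", translator="A. N. Other")

text = "Press the red button. Press the green button. Close the lid."
for segment in split_segments(text):
    match = tm.lookup(segment)
    if match is None:
        print(f"{segment!r}: no match, translate from scratch")
    else:
        entry, score = match
        print(f"{segment!r}: {score:.0%} match, suggestion {entry.target!r}")
```

In this sketch an identical segment scores 1.0, a near-identical one is offered as a partial match for the translator to adapt, and an unrelated one returns nothing.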
Some translation memory programs function as standalone environments, while others function as an add-on or macro to commercially available word-processing or other business software programs. Add-on programs allow source documents from other formats, such as desktop publishing files, spreadsheets, or HTML code, to be handled using the TM program.
Terminology management software
Terminology management software gives the translator a means of automatically searching a given terminology database for terms appearing in a document, either by displaying matching terms automatically in the translation memory software's interface window or by letting the translator call up the entry in the terminology database with a hotkey. Some programs offer further hotkey combinations that allow the translator to add new terminology pairs to the terminology database on the fly during translation.
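A minimal Python sketch of automatic term lookup follows. The term base is modelled as a plain dictionary, and the names `find_terms`, `add_term` and the sample entries are assumptions for illustration; real tools typically also handle inflected forms, which this sketch does not.

```python
import re


def find_terms(segment, termbase):
    """Return (source term, target term) pairs for entries found in segment."""
    hits = []
    # Check longer terms first so multi-word entries are listed before shorter ones.
    for term in sorted(termbase, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", segment, re.IGNORECASE):
            hits.append((term, termbase[term]))
    return hits


def add_term(termbase, source_term, target_term):
    """Add a new terminology pair during translation."""
    termbase[source_term] = target_term


# Hypothetical English-French term base.
termbase = {
    "translation memory": "mémoire de traduction",
    "segment": "segment",
}
add_term(termbase, "concordancer", "concordancier")

for source_term, target_term in find_terms(
    "The translation memory stores each segment.", termbase
):
    print(f"{source_term} -> {target_term}")
```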
Alignment software
Alignment programs take completed translations, divide both source and target texts into segments, and attempt to determine which segments belong together in order to build a translation memory database with the content. The resulting TM can then be used for future translations.
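To make the idea concrete, here is a deliberately simplified Python sketch that pairs already-segmented source and target texts by comparing segment lengths. It is a heuristic written for this example, not the method used by any particular alignment tool: the function name `align`, the restriction to 1-1, 1-2 and 2-1 pairings, and the cost function (difference in character count, plus a small penalty for merged segments) are all assumptions.

```python
def align(source_segments, target_segments):
    """Pair source and target segments with a length-based dynamic program.

    Allowed steps are 1-1, 1-2 and 2-1 (one segment matched to two merged ones).
    """
    n, m = len(source_segments), len(target_segments)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in ((1, 1), (1, 2), (2, 1)):
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src = " ".join(source_segments[i:ni])
                tgt = " ".join(target_segments[j:nj])
                # Penalise merges slightly so that 1-1 pairings win ties.
                step = abs(len(src) - len(tgt)) + (0 if (di, dj) == (1, 1) else 1)
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    if cost[n][m] == inf:
        raise ValueError("segments cannot be aligned with 1-1, 1-2 and 2-1 steps")
    # Walk back from the end to recover the chosen pairings.
    pairs, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((" ".join(source_segments[pi:i]), " ".join(target_segments[pj:j])))
        i, j = pi, pj
    return pairs[::-1]


source = ["Open the door.", "It is heavy.", "Close it."]
target = ["Ouvrez la porte.", "Elle est lourde.", "Fermez-la."]
for src, tgt in align(source, target):
    print(f"{src} | {tgt}")
```

The resulting pairs could then be stored as translation memory entries; in practice a translator would usually review them first, since length alone is a weak signal.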
Comparison of different CAT tools
(Alphabetical order, free software first, proprietary solutions second.)
Tool | Supported File Formats | OS | Price | License |
---|---|---|---|---|
ForeignDesk | HTML, C Source Files, Java Source Code, Microsoft WinHelp File Sources (HPJ, HHC, HHK, CNT), Trados | Windows | | BSD License, IBM Public License |
Okapi Framework | PO, Windows RC, TMX, Wordfast, Trados, Java Properties, Regular-expression-based text, Illustrator, INX, ResX, Table-type files, XML | Windows (.NET) | | LGPL |
OmegaT® | HTML, XHTML, DocBook, Plain Text, PO, JavaHelp, Java Resource Bundles, OpenDocument (ODF), OpenOffice.org, StarOffice, HTML Help Compiler (HCC), INI files | Multiplatform (Java) | | GPL |
OmegaT+ | HTML, XHTML, Plain Text, Java Resource Bundles, OpenDocument (ODF), OpenOffice.org, StarOffice. OmegaT+ is a fork of an obsolete version of OmegaT® and its developer has refused to give up the use of the OmegaT registered name. | Multiplatform (Java) | | GPL |
Transolution | HTML, StarOffice/OpenOffice.org, XLIFF, DocBook | Multiplatform (Python) | | GPL |
AidTransStudio | OpenOffice.org, MS Word XML, HTML, Plain Text, XML, Trados TTX, Custom Format (config based on Regular Expressions) | Windows (.NET) | Basic Edition: Free; Pro and Ent: see price list | Proprietary |
Cafetran | HTML, XML, OpenOffice.org, AbiWord, KWord, MS Word | Multiplatform (Java) | 180 Euro | |
Déjà Vu (DVX) | XML, Plain Text, OpenOffice.org, Adobe FrameMaker, Adobe PageMaker, ASP, Interleaf/Quicksilver, InDesign, Help Content, SGML, MS Access, MS Excel, MS PowerPoint, MS Word, QuarkXPress, RTF, Resource files, C/C++/Java source files, Java Properties, JavaScript, VBScript, GNU gettext | Windows | 490 Euro | Proprietary |
Heartsome Translation Suite | HTML/XHTML, XML, Plain Text, OpenOffice.org, StarOffice, AbiWord, PO/POT (GNU Gettext), SVG, Adobe FrameMaker (MIF), Adobe InDesign, DocBook, DITA, Java Properties, JavaScript, RTF, Tagged RTF, Trados TTX, MS Office 2003 XML, ResX (Windows .NET Resources), RC (Windows C/C++ Resources), MS Office 2007 (beta) | Multiplatform (Java) | See price list. | Proprietary |
LogiTerm | ? | ? | ? | ? |
MemoQ | HTML, plain text, MS Word (plus RTF and other Word documents), MS Excel, MS PowerPoint, Trados TTX, Adobe FrameMaker, proprietary bilingual format, XML (only for translation memories) | Windows | 4Free: Freeware; Translator Pro: 390 Euro; LSP 5: 1490 Euro | Proprietary |
MetaTexis | HTML, XML, Resource files, MS Word (all kinds of text files that can be imported by MS Word), MS Excel, MS PowerPoint, Adobe FrameMaker, Adobe PageMaker, QuarkXPress | Windows | Lite: 29 Euro; Pro: 79 Euro; .NET/Office: 109 Euro | Proprietary |
MLTS | ? | ? | ? | |
MultiCorpora MultiTrans | ? | ? | ? | |
MultiLing Fortis Translation Suite | ? | ? | ? | |
Pootle | Gettext PO, XLIFF, OpenOffice GSI files (.sdf), TMX, TBX, Java Properties, DTD, CSV, HTML, XHTML, Plain Text | Multiplatform (Python) | | GPL |
ppp.helper | MS PowerPoint | Windows | 39 Euro | Proprietary |
Rainbow | HTML, XHTML, Scripts, Photoshop, etc. | Windows (.NET) | Freeware | Proprietary |
SDLX | ? | ? | ? | |
STAR Transit | Text ANSI / ASCII / Unicode for Windows, Text for Apple Macintosh, Corel WordPerfect, HTML, XML (ASP.NET, ASP, JSP, XSL), SGML, SVG (Scalable Vector Graphics), MS Word for Windows, MS Excel, MS PowerPoint, RTF and RTF for WinHelp, RC, QuarkXPress, Adobe FrameMaker, Adobe PageMaker, Interleaf/Quicksilver, Adobe InDesign, XGate for QuarkXPress, AutoCAD | Windows | | Proprietary |
SDL Trados 2006 | Installs a template into MS Word automatically. Additional filters for translating with TagEditor are available: Word, Excel, PowerPoint, OpenOffice, InDesign, QuarkXPress, PageMaker, Interleaf, FrameMaker, HTML, SGML, XML, SVG, ... | Windows | New Freelance version, approx. 400 - 800 EUR | |
Tr-aid | ? | ? | ? | |
Wordfast | MS Word | Microsoft Office Word macro | 180 Euro | Proprietary |
WordFisher | MS Word | WordBasic / MS Office Word macro | Free Licence | |
Similis | HTML, PDF, Word, Trados | Windows | 295 Euro (single-user) | Proprietary |
Open Language Tools | HTML/XHTML, XML, DocBook SGML, ASCII, StarOffice/OpenOffice/ODF, .po (gettext), .properties, .java (ResourceBundle), .msg/.tmsg (catgets) | Multiplatform (Java) | Free | CDDL |
POedit software | | Multiplatform | Free | |
See also
External links
CAT Discussion groups
Software localization tools
Translation memory packages
- AidTrans Studio
- Cafetran
- Déjà Vu
- Heartsome Translation Suite
- LogiTerm
- MetaTexis
- MLTS
- MultiCorpora MultiTrans
- MultiLing Fortis Translation Suite
- OmegaT®
- OmegaT+
- ppt.helper
- SDLX
- STAR Transit
- Trados Translator's Workbench
- Tr-aid
- Wordfast
- WordFisher
- Similis second generation translation memory