Speech recognition in Linux
From Wikipedia, the free encyclopedia
There is currently no open-source equivalent of proprietary speech recognition software (e.g. Nuance's Dragon NaturallySpeaking) for Linux. However, several incomplete open-source projects and solutions can be used to obtain some elements of speech recognition on the free operating system.
History
In the late 1990s, IBM made a Linux version of its ViaVoice software available to users at no charge. However, the developer withdrew the free SDK in 2002.
Current development status
It is possible to run programs such as Dragon NaturallySpeaking 9 on Linux using Wine (a Windows compatibility layer for Linux), though some problems will arise. WinDictator also allows Windows dictation software (running on a real or virtual Windows machine) to be used with Linux, but installing it may require advanced skills. Dragon NaturallySpeaking can also be run in a virtual machine on Linux,[1] although problems (such as sound input errors) may occur.
Recently, there has been a push to develop a high-quality native Linux speech recognition engine. As a result, numerous projects dedicated to creating Linux speech recognition solutions equivalent to current Windows solutions have been established. One major hurdle is compiling a speech corpus from which acoustic models can be produced. In response, VoxForge was set up to collect transcribed speech, released under the GPL, for use with free and open-source speech recognition engines.
Ubuntu is currently gathering ideas for implementing speech recognition.[2]
Solutions
The following is a list of current projects dedicated to implementing speech recognition in Linux, as well as major (though mostly incomplete) native solutions that are available as of March 2008:
- VoxForge
- Julius
- CMU Sphinx
- HTK (copyrighted by Microsoft, though source code is available for personal use)
- Xvoice (requires ViaVoice to function)
- Open Mind Speech
- Flite (a C version of Festival; this is a speech synthesis engine, not a speech recognition engine)
- GnomeVoiceControl
- Simon (This project aims at helping blind people; requires Julius)
It is possible (though complicated) for advanced developers to create their own Linux speech recognition solutions using existing packages derived from open-source projects.