Speech recognition in Linux

From Wikipedia, the free encyclopedia

There is currently no open-source equivalent of proprietary speech recognition software (e.g. Nuance's Dragon NaturallySpeaking) for Linux. However, several incomplete open-source projects and solutions can be used to achieve some elements of speech recognition on Linux.

1 History
2 Current development status
- 2.1 Solutions
3 See also
4 References
5 External link

History

In the late 1990s, IBM made a Linux version of its ViaVoice software available to users at no charge. In 2002, however, IBM withdrew the free SDK.

Current development status

It is possible to run programs such as Dragon NaturallySpeaking 9 on Linux using Wine (a Windows compatibility layer for Linux), though some problems can be expected. WinDictator also allows Windows dictation software (running on a real or virtual Windows machine) to be used with Linux, but installing it may require advanced skills. Dragon NaturallySpeaking can also be run on Linux inside a virtual machine^[1], although problems such as sound input errors may occur.

Recently, there has been a push to develop a high-quality native speech recognition engine for Linux. As a result, numerous projects have been established to create Linux speech recognition solutions equivalent to current Windows solutions. One major hurdle is compiling a speech corpus from which acoustic models can be produced. In response, VoxForge was set up to collect transcribed speech, released under the GPL, for use with free and open-source speech recognition engines.

Ubuntu is currently gathering ideas for implementing speech recognition.^[2]

Solutions

The following is a list of current projects dedicated to implementing speech recognition in Linux, as well as major (though mostly incomplete) native solutions that are available as of March 2008:

VoxForge
Julius
CMU Sphinx
HTK (copyrighted by Microsoft, though source code is available for personal use)
Xvoice (requires ViaVoice to function)
Open Mind Speech
Flite (a C version of Festival; a speech synthesis engine, not a speech recognition engine)
GnomeVoiceControl
Simon (aims to help blind people; requires Julius)

It is possible (though complicated) for advanced developers to create their own Linux speech recognition solutions using existing packages derived from open-source projects.