List of speech recognition software

Open source acoustic models and speech corpus (compilation)

The following list presents notable speech recognition software engines with a brief synopsis of characteristics.

Application name | Description | Open Source | License | Operating System | Programming Language | Supported Language/Note
CMU Sphinx | HMM | Yes | BSD style | Multi-platform | Java | English
Mozilla DeepSpeech | Deep neural net. | Yes | Mozilla Public License 2.0 | Multi-platform | Python | English
HTK | HMM | No | HTK Specific License | Multi-platform | C | English. Version 3.5 released December 2015.
Julius | HMM trigrams | Yes | BSD-like | Multi-platform | C | Japanese, English (non-commercial)[1]
Kaldi | Deep neural net. | Yes | Apache | Multi-platform | C++ | English
iATROS | LDA (Latent Dirichlet) | Yes | miss | Linux | C | English. Currently inactive (last update 2009)
RWTH ASR | RWTH Aachen University | No | RWTH ASR License | Linux, Mac OS X | ? | English. Non-commercial use only
Agnitio | Windows-based speech recognition program | No | Freeware License | Windows | VB.NET | English

The following table lists open-source applications that provide user interfaces for the engines above.

Application name | Description | Open Source | License | Operating System | Programming Language | Supported Language/Note
Simon | Supports Sphinx, HTK, Julius | Yes | GPLv2 | Multi-platform | C++ | English
Jasper project | Raspberry Pi front-end for CMU Sphinx or Julius | Yes | MIT License | Linux | Python | English

Macintosh

Application name | Description | Open Source | License | Price | Note
Dragon Dictate | Mac OS (by Nuance) | No | Proprietary
MacSpeech Dictate Medical | Medical dictation product
MacSpeech Dictate Legal | Legal-focused dictation
MacSpeech Scribe | Transcription from recorded text
iListen | PowerPC Macintosh
Speakable items | Included with Mac OS
ViaVoice | IBM product. Purchased by Nuance.
Voice Navigator | Original GUI voice control (1989)
Power Secretary[1]
Vestec Inc. | ASR, NLU, TTS, VSLIC

Cross-platform web apps based on Chrome

The following list presents notable speech recognition software that operates in the Chrome browser as web apps. These apps use the HTML5 Web Speech API.[2]

Application name | Description | Open Source | License | Price | Note
Voice Notebook | Free dictation, voice typing to clipboard and to web text fields. Windows and Linux integration | No | Commercial | Free
SpeechTexter[3] | Online speech recognition and editor | No | Commercial | Free
Speechnotes | Dictation notepad - professional speech recognizing text editor web app | No | Commercial | Free
Trint | Convert audio/video to text, search and verify in online editor that glues audio to text. | No | Commercial | From 17¢/minute
Go-Transcribe.com | Cloud based transcription service | No | Commercial | Subscription based
Happy Scribe | Speech to text recognition software. | No | Commercial | From 0.09¢/minute

Mobile devices and smartphones

Many cell-phone handsets have basic dial-by-voice features built in. Smartphones such as iPhones and BlackBerrys also support this. A number of third-party apps have implemented natural-language speech-recognition support, including:

Application name Description Open Source License Price Note
Assistant.ai Assistant for Android, iOS and Windows Phone Free
Indigo Virtual Assistant for Android, iOS, and WP, by Artificial Solutions No[4] Commercial Free
Textshark Cloud-/API-based speech-to-text transcription No Commercial $100–$1000/mo
VoiceBoost SIVA Embedded speech recognition for wake-up and phrase-spotting No Commercial
VoiceBoost SDVA Embedded Speaker Verification/Identification No Commercial
VoiceBoost ECC Embedded speech command and control; small-vocabulary ASR engine No Commercial
TrulyHandsfree Embedded speech recognition for wakeup and command and control No Commercial
Vocollect Embedded speech recognition for wakeup, task, command and control No Commercial
TrulyNatural Embedded large vocabulary speech recognition for natural language No Commercial
Sonic Cloud Online Speech
S-voice Samsung Galaxy's Voice based personal assistant No Commercial
Verbio CSR Continuous speech recognition on premises. Multi-channel, ready for 8 kHz, 16 kHz and 32 kHz. Includes deep neural networks and natural language capabilities. English, Spanish, Japanese, French, Portuguese and Catalan
Dragon Dictation No Free
Google Now Android voice search No Free
Google Voice Search No Free
Microsoft Cortana Microsoft voice search No Free
GoVivace Cloud-based speech recognition Commercial
Siri Personal Assistant Apple's virtual personal assistant No Free
Alexa - Amazon Echo Amazon's Personal Assistant No Commercial
MeMeMe Mobile Cloud-based speech recognition
SILVIA Android and iOS No
Shoutout
Vlingo
Jeannie Android
Ziri Android
Microsoft Tellme Windows Phone 7/8 Free
Ask Ziggy Windows Phone 7
fcGlobal Terminal
Vocre iOS
Utter! Voice to speech personal assistant No Free
Vestec ASR, NLU, TTS embedded No

Windows

Windows built-in speech recognition

Windows Speech Recognition version 8.0 by Microsoft is built into Windows Vista, Windows 7 and Windows 8. It is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows: the speech recognition engine for one language cannot be used on a version of Windows installed in another language. Windows 7 Ultimate and Windows 8 Pro allow the system language to be changed, which in turn changes which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant found in Windows 10.

Add-ons for Windows 7 speech recognition

  • VoiceAttack – used primarily by the gaming community for hands-free keyboard and mouse input in Windows 10, Windows 8, Windows 7, Windows Vista and Windows XP. Its popularity lies mainly in its ease of use and extended feature set, which includes the ability to create multi-threaded macros.[5]
  • Livrot Mic Command – desktop and gaming speech recognition with full SRGS grammar support and programmable multi-threaded macros with an independent data system. Windows 7, 8 and 10.[6]
  • VAC – Voice Activated Commands is a feature-rich speech recognition solution for games. It works with Windows 8, Windows 7, Windows Vista and Windows XP.
  • Voice Finger – software for Windows Vista and Windows 7 that improves the Windows speech recognition system by adding several extensions to accelerate and improve the mouse and keyboard control.
  • www.trigramtech.com – adds medical vocabularies (language models) for medical users. Can be licensed for individuals, groups or OEM integration into speech applications that use Windows Speech Recognition.
  • Vocola – a macro language
  • Vocaya[7] – software that binds keys to voice commands. It can be used with games and other software. Community oriented, with the ability to share and edit knowledge.

Windows 7/8/10 third-party speech recognition

  • Auditory Sciences[8] – transcription software for captioning whatever someone says.
  • Braina – Dictate into third-party software and websites.[9]
  • Dragon NaturallySpeaking from Nuance Communications – Successor to the older DragonDictate product. Focus on dictation. 64-bit Windows support since version 10.1.
  • Freesr Speech Recognition Software – Create voice interfaces for any application, window in an application, or website/webpage. Works with Windows Speech Recognition or as add-on to NaturallySpeaking.
  • SpeechGear Interact - combines speech recognition with language translation.[10]
  • Sonic Extractor from Digital Syphon – Supports 22 languages. Focus on broadcasting and telephony.
  • SpeechMagic – formerly owned by Philips; acquired by Nuance Communications. Medical industry focus according to Frost & Sullivan. Standalone or embedded.[11]
  • Tazti – Create speech command profiles to play PC games and control applications and programs. Create speech commands to open files, folders, webpages and applications. Windows 7, Windows 8 and Windows 8.1 versions.[12]
  • VoxCommando – Voice command utility for Windows Vista or later. It interfaces with various programs and devices to allow control of multimedia, communication, and home automation. Supports 20 world languages/dialects because users can choose to use Microsoft's desktop speech engine or Speech Platform 11. It does not use any cloud-based speech recognition platforms. Supports SAPI5 TTS voices.
  • Vestec Inc. – Specializes in natural language understanding and speech recognition solutions. ASR, NLU and TTS engines support 17 languages in server, embedded (on low-cost chip) or cloud-based environments.

Windows XP or 2000 only

  • Microsoft Speech API – Speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but since that is aimed at developers building speech applications, the pure SDK form lacks any user interface, and thus is unsuitable for end users.
  • e-Speaking – software for Windows XP that facilitates use of the Microsoft Speech API by adding the ability to create commands that perform custom actions.
  • Vestec Inc. – Specializes in natural language understanding and speech recognition solutions. ASR, NLU and TTS engines support 17 languages in server, embedded (on low-cost chip) or cloud-based environments.

Built-in software

Interactive voice response

The following are interactive voice response (IVR) systems:

Unix-like x86 and x86_64 speech transcription software

Discontinued software

  • Game Commander 2 by Mindmaker. Gaming-oriented voice recognition. Voice commands can be assigned to issue keystrokes and key combinations.
  • IBM ViaVoice – Embedded version still maintained by IBM.[17] No longer supported for versions above Windows Vista.[18] Untested above Mac OS X 10.4 or on Macintoshes with an Intel chipset.[19]
  • Quack.com (acquired by AOL). The name has since been reused for an iPad search app.
  • SpeechWorks from Nuance Communications.
  • Yap Speech Cloud - Speech-to-text platform acquired by Amazon.com.

References

  1. "PowerSecretary Announcement".
  2. https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
  3. "SpeechTexter - Online Speech to Text Converter and Editor".
  4. http://www.hello-indigo.com/terms-of-use/
  5. http://www.voiceattack.com
  6. http://www.livrot.com
  7. http://www.vocaya.com
  8. http://www.massmatch.org/aboutus/listserv/2010/2010-03-31.html
  9. Braina Speech Recognition Software
  10. http://www.techradar.com/news/software/business-software/Speech-recognition-software-top-six-on-the-market/articleshow/44842011.cms
  11. Philips SpeechMagic named European Technology Leader by Frost & Sullivan
  12. O'Neill, Mark (2013-11-06). "Control your PC with these 5 speech recognition programs". PC World. Retrieved 2013-12-30.
  13. "Interactive Voice Response". Genesys.
  14. http://isl.ira.uka.de/downloads/asru_hagen.ps
  15. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=599557
  16. http://www.vocapia.com/voxsigma-speech-to-text.html
  17. http://www-01.ibm.com/software/pervasive/embedded_viavoice/
  18. http://nuance.custhelp.com/app/answers/detail/a_id/5775/p/31/c/980/r_id/100023
  19. http://nuance.custhelp.com/app/answers/detail/a_id/4987/related/1/p/31/c/980/r_id/100023
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.