List of speech recognition software

Open source acoustic models and speech corpora

The following list presents notable speech recognition software engines with a brief synopsis of characteristics.

Application name	Description	Website	Open Source	License	Operating System	Programming Language	Supported Language/Note
CMU Sphinx	HMM	CMU: Sourceforge	Yes	BSD style	Linux	Java	English
HTK	HMM	HTK Web Site	Yes	HTK Specific License	Multi-platform	C	English. Version 3.5 released December 2015.
Julius	HMM trigrams	Julius Home page	Yes	BSD-like	Multi-platform	C	English
Kaldi	Deep neural net.	Kaldi Web Site	Yes	Apache	Multi-platform	C++	English
iATROS	LDA (Latent Dirichlet)	iATROS	Yes	miss	Linux	C	English. Currently inactive (last update 2009)
RWTH ASR	RWTH Aachen University	RWTH ASR	No	RWTH ASR License	Linux, Mac OS X	?	English. Non-commercial use only

The following lists open-source applications that provide convenient user interfaces for the above.

Application name	Description	Website	Open Source	License	Operating System	Programming Language	Supported Language/Note
Simon	Supports Sphinx, HTK, Julius	Simon	Yes	GPLv2	Multi-platform	C++	English
Jasper project	Raspberry Pi front-end for CMU Sphinx or Julius	Jasper Project	Yes	MIT License	Linux	Python	English

Macintosh

Application name	Description	Open Source	License
Dragon Dictate	Mac OS	No	Proprietary
MacSpeech Dictate Medical	Medical dictation product
MacSpeech Dictate Legal	Legal-focused dictation
MacSpeech Scribe	Transcription of recorded speech
iListen	PowerPC Macintosh
Speakable Items	Included with Mac OS
ViaVoice	IBM Product. Support ended 2007.
Voice Navigator	Original GUI voice control (1989)
Power Secretary^[1]
Vestec Inc.	ASR, NLU, TTS, VSLIC

Cross-platform web apps based on Chrome

The following list presents notable speech recognition software that operates in the Chrome browser as a web app. These apps use the HTML5 Web Speech API.^[2]

Application name	Description	Website	Open Source	License	Price	Note
Speechnotes	Dictation notepad: professional speech-recognizing text editor web app	Speechnotes Web App	No	Commercial	Free

Mobile devices and smartphones

Many cell-phone handsets have basic dial-by-voice features built in. Smartphones such as iPhones and BlackBerrys also support this. A number of third-party apps have implemented natural-language speech-recognition support, including:

Application name	Description	Website	Open Source	License	Price
Assistant.ai	Assistant for Android, iOS and Windows Phone	Assistant.ai			Free
Indigo	Virtual Assistant for Android, iOS, and WP, by Artificial Solutions	App Website	No^[3]	Commercial	Free
Textshark	Cloud-/API-based speech-to-text transcription	Speech to Text Transcription	No	Commercial	$100–$1000/mo
VoiceBoost SIVA	Embedded speech recognition for wake-up and phrase-spotting	Malaspina-Labs	No	Commercial
VoiceBoost SDVA	Embedded Speaker Verification/Identification	Malaspina-Labs	No	Commercial
VoiceBoost ECC	Embedded speech command and control (small-vocabulary ASR engine)	Malaspina-Labs	No	Commercial
TrulyHandsfree	Embedded speech recognition for wake-up and command and control	Sensory	No	Commercial
TrulyNatural	Embedded large vocabulary speech recognition for natural language	Sensory	No	Commercial
Sonic Cloud Online Speech
S Voice	Samsung Galaxy's voice-based personal assistant		No	Commercial
Verbio ASR embedded	Embedded and Cloud speech recognition for natural language	Embedded Speech Recognition
Dragon Dictation			No		Free
Google Now	Android voice search		No		Free
Google Voice Search			No		Free
Microsoft Cortana	Microsoft voice search		No		Free
GoVivace	Cloud-based speech recognition	Automatic Speech Recognition		Commercial
Siri Personal Assistant	Apple's virtual personal assistant		No		Free
MeMeMe Mobile	Cloud-based speech recognition
SILVIA	Android and iOS		No
Shoutout
Vlingo
Jeannie	Android
Ziri	Android
Microsoft Tellme	Windows Phone 7/8				Free
Ask Ziggy	Windows Phone 7
fcGlobal	Terminal
Vocre	iOS
Utter!	Voice to speech personal assistant	Utter! Commands Beta	No		Free
Vestec ASR, NLU, TTS embedded			No

Windows

Windows built-in speech recognition

Windows Speech Recognition is the speech recognition system that Microsoft builds into Windows Vista, Windows 7, and Windows 8, all of which include version 8.0 of the Microsoft speech recognition engine. Speech recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows. For example, the French speech recognition engine cannot be used on an English version of Windows. Windows 7 Ultimate and Windows 8 Pro allow the system language to be changed, which in turn changes which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant found in Windows 10.
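The language rule described above can be illustrated with a short sketch. This is only a model of the rule as stated in this article, not a call into any Windows API: the function name `can_use_engine`, the constant names, and the plain-string language and edition labels are all hypothetical.

```python
# Illustrative model of the language-availability rule described in the text.
# All names here are hypothetical; nothing below queries Windows itself.

# Languages the article lists as supported by Windows Speech Recognition.
SUPPORTED_LANGUAGES = frozenset({
    "English", "French", "Spanish", "German",
    "Japanese", "Simplified Chinese", "Traditional Chinese",
})

# Editions the article says allow the system language to be changed.
LANGUAGE_SWITCHING_EDITIONS = frozenset({"Windows 7 Ultimate", "Windows 8 Pro"})


def can_use_engine(desired: str, system_language: str, edition: str) -> bool:
    """Return True if the desired speech engine is usable, either directly
    or after changing the system language on an edition that permits it."""
    if desired not in SUPPORTED_LANGUAGES:
        return False  # no engine exists for this language
    if desired == system_language:
        return True   # engine matches the installed language version
    # Otherwise the system language must be switched first, which the
    # article says only certain editions allow.
    return edition in LANGUAGE_SWITCHING_EDITIONS
```

For instance, under these assumptions `can_use_engine("French", "English", "Windows 7 Home Premium")` is false, while the same request on "Windows 7 Ultimate" is true because the system language can be changed there.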

Add-ons for Windows 7 speech recognition

VoiceAttack – used primarily by the gaming community to allow hands-free keyboard and mouse input in Windows 8, Windows 7, Windows Vista and Windows XP. Its popularity lies mainly in its ease of use and extended feature set, which includes the ability to create multi-threaded macros.^[4]
VAC (Voice Activated Commands) – a feature-rich speech recognition solution for games. It works with Windows 8, Windows 7, Windows Vista and Windows XP.
Voice Finger – software for Windows Vista and Windows 7 that improves the Windows speech recognition system by adding several extensions to accelerate and improve the mouse and keyboard control.
WSRToolkit – adds dictionaries, macros and other features similar to Dragon
www.trigramtech.com – adds medical vocabularies (language models) for medical users. Can be licensed for individuals, groups, or OEM integration into speech applications using Windows Speech.
Vocola – a macro language

Windows 7/8/10 third-party speech recognition

Auditory Sciences^[5] – transcription software for captioning whatever someone says.
Braina – dictates into third-party software and websites.^[6]
Dragon NaturallySpeaking from Nuance Communications – Successor to the older DragonDictate product. Focus on dictation. 64-bit Windows support since version 10.1.
Freesr Speech Recognition Software – Create voice interfaces for any application, window in an application, or website/webpage. Works with Windows Speech Recognition or as add-on to NaturallySpeaking.
SpeechGear Interact – combines speech recognition with language translation.^[7]
Sonic Extractor from Digital Syphon – Supports 22 languages. Focus on broadcasting and telephony.
SpeechMagic – formerly owned by Philips; acquired by Nuance Communications. Medical industry focus, according to Frost & Sullivan. Standalone or embedded.^[8]^[9]
Tazti – creates speech command profiles to play PC games and control applications and programs. Speech commands can also open files, folders, webpages, and applications. Available for Windows 7, Windows 8, and Windows 8.1.^[10]
VoxCommando – Voice command utility for Windows Vista or later. It interfaces with various programs and devices to allow control of multimedia, communication, and home automation.
Vestec Inc. – specializes in natural language understanding and speech recognition solutions. Its ASR, NLU and TTS engines support 17 languages in server, embedded (on a low-cost chip), or cloud-based environments.

Windows XP or 2000 only

e-Speaking – software for Windows XP that facilitates use of the Microsoft Speech API by adding the ability to create commands that perform custom actions.
Microsoft Speech API – Speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but since that is aimed at developers building speech applications, the pure SDK form lacks any user interface, and thus is unsuitable for end users.

Built-in software

Microsoft Kinect includes built-in software that allows speech recognition of commands.
Older generations of Nokia phones, such as the Nokia N Series (before the adoption of Windows Phone 7), used speech recognition for names from the contact list and a few other commands.
Siri – Apple's personal assistant for iOS, originally implemented in the iPhone 4S, which uses technology from Nuance Communications.
Cortana – Microsoft's personal assistant built into Windows Phone and Windows 10.

Interactive voice response

The following are interactive voice response (IVR) systems:

AT&T Watson
CSLU Toolkit
Convergys Interactive Voice Portal IVR
Genesys^[11]
HTK – copyrighted by Microsoft, but altering the software for the Licensee's internal use is allowed.
Freesr Speech Recognition Software
Verbio ASR & TTS
LumenVox ASR
MIRSK ASR
Telisma ASR
Nuance Recognizer ASR
Rubidium Ltd. ASR
Proteus Conversational Interface
Simmortel Voice
Tellme Networks (acquired by Microsoft)
Parlance nameConnector
Verbyx Inc
Vestec Inc

Unix-like x86 and x86_64 speech transcription software

Discontinued software

Game Commander 2 by Mindmaker. Gaming oriented voice recognition. Voice commands can be assigned to issue keystrokes and key combinations.
IBM ViaVoice – Embedded version still maintained by IBM.^[15] No longer supported for versions above Windows Vista.^[16] Untested above Mac OS X 10.4 or on Macintoshes with an Intel chipset.^[17]
Quack.com (acquired by AOL). The name has since been reused for an iPad search app.
SpeechWorks from Nuance Communications.
Yap Speech Cloud – speech-to-text platform acquired by Amazon.com.

References

This article is issued from Wikipedia - version of the Monday, February 08, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.