PlainTalk
From Wikipedia, the free encyclopedia
PlainTalk is the collective name for several speech synthesis (MacInTalk) and speech recognition technologies, developed by Apple Computer.
In 1990, Apple invested heavily in speech recognition technology, hiring many respected researchers in the field. The result was PlainTalk, released with the AV Quadras in 1993. It became a standard system component in System 7.1.2 and has since shipped on all PowerPC Macintoshes and even some 68k models.
Software
Speech synthesis
Technology
Apple's text-to-speech uses diphones. Compared to other methods of synthesizing speech, diphone synthesis is not very resource-intensive, but it limits how natural the output can sound. See the speech synthesis article for details. American English and Spanish versions have been available, but the current version supports only American English.
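As a rough illustration of the diphone idea, the sketch below splits a phoneme sequence into the overlapping phoneme-to-phoneme units a diphone synthesizer would look up and concatenate. The phoneme symbols, the silence marker `_` and the function name `to_diphones` are assumptions made for this example; they are not Apple's actual phoneme inventory or API.

```python
def to_diphones(phonemes, silence="_"):
    """Return the diphone units needed to speak a phoneme sequence.

    Each diphone covers the transition from one phoneme to the next.
    The utterance is padded with silence (a hypothetical "_" symbol)
    so that the first and last phonemes also get a transition.
    """
    padded = [silence, *phonemes, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


# Illustrative phoneme symbols for "hello" (not an official notation).
print(to_diphones(["h", "eh", "l", "ow"]))
# → ['_-h', 'h-eh', 'eh-l', 'l-ow', 'ow-_']
```

Because a unit is only needed for each ordered pair of phonemes, the recorded inventory stays bounded (at most the square of the number of phonemes), which is consistent with the modest resource needs described above.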
An application programming interface known as the Speech Manager enables third-party developers to use speech synthesis in their applications. Various control sequences can be used to fine-tune the intonation and rhythm, and the volume, pitch and rate of the speech can be configured as well.
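The following sketch shows how control sequences might be embedded in the text handed to the synthesizer. It assumes the `[[command value]]` delimiter syntax and the command names `rate`, `pbas` (pitch baseline), `volm` (volume) and `slnc` (silence, in milliseconds) from Apple's embedded speech commands; these should be checked against the Speech Manager documentation before use. The helper functions `embed`, `with_settings` and `pause` are hypothetical and only build a string.

```python
def embed(command, value):
    """Wrap one command in the assumed [[ ]] delimiters."""
    return f"[[{command} {value}]]"


def with_settings(text, rate=None, pitch=None, volume=None):
    """Prefix text with commands for the settings that were given."""
    prefix = []
    if rate is not None:
        prefix.append(embed("rate", rate))    # assumed: speaking rate
    if pitch is not None:
        prefix.append(embed("pbas", pitch))   # assumed: pitch baseline
    if volume is not None:
        prefix.append(embed("volm", volume))  # assumed: volume
    return " ".join(prefix + [text])


def pause(ms):
    """Insert a silence of the given length (assumed unit: milliseconds)."""
    return embed("slnc", ms)


phrase = " ".join([
    with_settings("Hello.", rate=180, pitch=50, volume=0.8),
    pause(400),
    "I am Macintosh.",
])
print(phrase)
# → [[rate 180]] [[pbas 50]] [[volm 0.8]] Hello. [[slnc 400]] I am Macintosh.
```

The resulting string would be passed to the synthesizer in place of plain text; nothing in this sketch calls the Speech Manager itself.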
The original MacInTalk
The first component of PlainTalk was a system extension called MacInTalk, which provided text-to-speech conversion. Apple used it at the introduction of the Macintosh in 1984 to let the personal computer introduce itself to the world, but never officially supported it. The original MacInTalk was the Macintosh version of Software Automatic Mouth (S.A.M.), which had been developed for the Apple II and other early personal computers by Joseph Katz and Mark Barton for Don't Ask Software, a company founded by Randy Simon to sell an ELIZA-type program called "Abuse", which conversed with the user in an insulting manner. Katz and Barton's current company, SoftVoice, markets the present-day version of this software as SoftVoice TTS. [1]
MacInTalk 2
Eventually, Apple released an officially supported speech synthesis system called MacInTalk 2. It ran on any Macintosh with System Software 6.0.7 or later, and it remained the recommended version for slower machines even after the release of MacInTalk 3 and Pro.
MacInTalk 3, Pro
With the increase in computing power that the DSPs in AV Macs and the PowerPC-based Macintoshes provided, Apple could afford to increase the quality of the synthesis. MacInTalk 3 required a 33 MHz 68030 processor, and MacInTalk Pro required a 68040 or better and at least 1 MB of RAM. Each synthesizer supported a different set of voices.
Text-to-speech in Mac OS X
Text-to-speech has been a part of every Mac OS X version. The Victoria voice was enhanced significantly in Mac OS X v10.3 and was dubbed Vicki. The new voice was almost 20 times larger because of the higher-quality diphone samples used.
A new, much more natural-sounding voice called "Alex" is slated to be added to the Mac text-to-speech roster with the release of Mac OS X 10.5 "Leopard". [2]
In Music
MacInTalk speech synthesis can be heard in a few songs, including "Satisfaction" and other songs by Benny Benassi, and "Tobys Mac" by Toby Mac. It is also used in Radiohead's "Fitter Happier" from the album OK Computer, and is featured in the background of "Paranoid Android" from the same album.
Speech recognition
Apple hired many speech recognition researchers in 1990. After about a year, they demonstrated a technology codenamed Casper, which was released as part of the PlainTalk package in 1993. Although available for all PowerPC Macintoshes and some 68k machines, it was not part of the default system install prior to Mac OS X; the user had to perform a custom installation of the OS to get speech recognition capabilities.
Apple's speech recognition is oriented toward voice commands and is not intended for dictation. It can be configured to listen for commands only while a hot key is pressed, or after being addressed with an activation phrase such as "Computer" or "Macintosh". A graphical status monitor, often in the form of an animated character, provides visual and textual feedback about listening status, available commands and actions taken. It can also respond to the user using speech synthesis.
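The two listening modes can be pictured as a gate applied to each recognized utterance. The sketch below works on already-recognized text rather than audio, and the function name `extract_command` and its parameters are hypothetical; it illustrates the behavior described above and is not Apple's implementation.

```python
def extract_command(utterance, activation_phrase="Computer", hot_key_down=False):
    """Return the command to act on, or None if the utterance is ignored.

    With the hot key held, the whole utterance is treated as a command.
    Otherwise the utterance must start with the activation phrase.
    """
    words = utterance.strip()
    if hot_key_down:
        return words or None

    if not words.lower().startswith(activation_phrase.lower()):
        return None

    rest = words[len(activation_phrase):]
    # Require a word boundary so that "Computers ..." does not match.
    if rest and rest[0] not in " ,":
        return None
    return rest.lstrip(" ,") or None


print(extract_command("Computer, open the trash"))            # → open the trash
print(extract_command("open the trash"))                      # → None
print(extract_command("open the trash", hot_key_down=True))   # → open the trash
```

Requiring an activation phrase or a held key keeps ordinary conversation near the microphone from being interpreted as commands.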
Early versions of the speech recognition provided full access to the menus. This support was later removed, since it required too many resources and made recognition less reliable, only to be re-added in Mac OS X 10.3 as a "universal access technology" called spoken user interface.
The user can launch items located in a special folder, called "Speakable Items", simply by speaking their name (while the system is in listening mode). Apple shipped a number of AppleScripts in this folder, but aliases, documents and folders can be opened in the same way.
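A minimal sketch of the Speakable Items idea follows: the names of the items in a folder become the spoken vocabulary, and a recognized phrase is matched against them. The function names, the temporary folder and the item names are all made up for this example, and the sketch only looks up the matching item rather than launching it.

```python
import tempfile
from pathlib import Path


def build_vocabulary(folder):
    """Map each item's spoken name (file name without extension) to its path."""
    return {
        p.stem.lower(): p
        for p in Path(folder).iterdir()
        if not p.name.startswith(".")  # skip hidden files
    }


def find_item(vocabulary, spoken):
    """Return the path of the item whose name was spoken, or None."""
    normalized = " ".join(spoken.lower().split())
    return vocabulary.get(normalized)


# Demonstration with a throwaway folder and made-up item names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Empty the Trash.scpt", "Open Mail"):
        (Path(tmp) / name).touch()
    vocab = build_vocabulary(tmp)
    match = find_item(vocab, "empty the  trash")
    print(match.name if match else None)  # → Empty the Trash.scpt
```

Because the vocabulary is derived from the folder's contents, adding a script, alias or document to the folder is enough to make its name speakable in this model.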
Additional functionality is provided by individual applications. An application programming interface lets programs define and modify an available vocabulary. For example, the Finder provides a vocabulary for manipulating files and windows.
Hardware
Apple produced a microphone called the Apple PlainTalk Microphone. It was introduced alongside the AV-enabled Quadras in 1993 but was also sold separately. Its connector was longer than usual, and the tip was used to supply the microphone with extra power. It was designed to be positioned on top of the screen and to be sensitive to sound from the front.