Speech signal processing

From Wikipedia, the free encyclopedia

Speech signal processing refers to the acquisition, manipulation, storage, transfer and output of human utterances by a computer. The main goals are the recognition, synthesis and compression of human speech:

  • Speech recognition (also called voice recognition) focuses on capturing the human voice as a digital sound wave and converting it into a computer-readable format, typically text.
  • Speech synthesis is the reverse process of speech recognition: it generates audible speech from computer-readable input such as text. Advances in this area improve the usability of computers for visually impaired users.
  • Speech compression is important in telecommunications because it increases the amount of information that can be transferred, stored, or heard within given time and space constraints.
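
As a rough illustration of the compression goal in the list above, the following Python sketch applies μ-law companding, the idea behind the G.711 telephony codec, to reduce 16-bit samples to 8-bit codes. The choice of μ-law, the function names, and the test tone are assumptions made for this example and are not part of the original article. The sketch uses the continuous μ-law formula followed by uniform quantization, not the bit-exact segmented encoding defined by G.711.

```python
import math
import struct

MU = 255  # companding constant used by the mu-law variant of G.711


def mu_law_encode(x: float) -> int:
    """Compress one sample in [-1.0, 1.0] to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    # Logarithmic companding: small amplitudes get finer resolution.
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    # Map [-1, 1] onto the 256 available codes.
    return int(round((y + 1.0) / 2.0 * 255))


def mu_law_decode(code: int) -> float:
    """Expand an 8-bit code back to an approximate sample in [-1.0, 1.0]."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def make_tone(n: int = 160, freq: float = 440.0, rate: int = 8000) -> list:
    """Hypothetical test signal: n samples of a sine tone at half amplitude."""
    return [0.5 * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


if __name__ == "__main__":
    samples = make_tone()
    # Uncompressed reference: 16-bit little-endian PCM, 2 bytes per sample.
    pcm16 = struct.pack(f"<{len(samples)}h", *(int(s * 32767) for s in samples))
    # Companded version: 1 byte per sample.
    encoded = bytes(mu_law_encode(s) for s in samples)
    decoded = [mu_law_decode(c) for c in encoded]
    worst = max(abs(a - b) for a, b in zip(samples, decoded))
    print(len(pcm16), "bytes as 16-bit PCM")
    print(len(encoded), "bytes after mu-law companding")
    print("largest reconstruction error:", worst)
```

The companded stream needs half the bytes of the 16-bit original, at the cost of a small reconstruction error that grows with amplitude. Modern speech codecs reach much higher compression ratios with more elaborate models; this sketch only shows the basic trade-off between size and fidelity.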

Books

  • Tanja Schultz and Katrin Kirchhoff (eds.), Multilingual Speech Processing, April 2006. An overview of research problems and solutions in multilingual speech processing, covered from theoretical and practical perspectives, for researchers and developers in industry and academia. Chapters: 1. Introduction; 2. Language Characteristics; 3. Linguistic Data Resources; 4. Multilingual Acoustic Modeling; 5. Multilingual Dictionaries; 6. Multilingual Language Modeling; 7. Multilingual Speech Synthesis; 8. Automatic Language Identification; 9. Other Challenges; 10. Speech-to-Speech Translation; 11. Multilingual Spoken Dialog Systems; Bibliography.
