Digital dictation
From Wikipedia, the free encyclopedia
Digital dictation is a method of recording and editing the spoken word in real-time within a digital audio format.
Digital dictation offers several advantages over traditional cassette-tape-based dictation:
- The user can instantly rewind or fast forward to any point within the dictation file to review or edit.
- Random access to digital audio allows new audio to be inserted at any point without overwriting the material that follows.
- Dictation files can be transmitted electronically, e.g. via WAN, LAN, e-mail or FTP.
- Large dictation files can be shared with multiple typists.
- Digital dictation provides the ability to prioritize work.
- Sound quality can approach that of a CD, which can improve transcription accuracy and speed.
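The insert-without-overwriting behaviour listed above can be illustrated with a minimal sketch. It assumes the dictation is held in memory as a plain list of audio samples; the function name `insert_audio` and the sample values are hypothetical and are not taken from any particular dictation product.

```python
def insert_audio(samples, new_samples, position):
    """Return a new sample list with new_samples spliced in at position.

    Nothing after `position` is overwritten; it is shifted later in time,
    which is what random access makes possible and linear tape does not.
    """
    if not 0 <= position <= len(samples):
        raise ValueError("position is outside the recording")
    return samples[:position] + new_samples + samples[position:]


# Hypothetical recording of six samples and a two-sample insertion.
dictation = [1, 2, 3, 4, 5, 6]
addition = [90, 91]
edited = insert_audio(dictation, addition, 3)
print(edited)
```

A real system would more likely work on frames in a file than on a Python list, but the principle is the same: the audio after the insertion point is moved, not erased.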
A common interchange format for digital audio is WAV. However, most digital dictation systems use a lossy form of audio compression to minimize the disk space required.
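To give a rough sense of why compression matters, the following sketch estimates the storage needed for uncompressed PCM audio in a WAV file and checks the estimate against a short file written with Python's standard `wave` module. The sample rates chosen here (44.1 kHz stereo for CD-style audio, 8 kHz mono for speech) are illustrative assumptions, not figures from any specific dictation system.

```python
import io
import wave


def pcm_bytes_per_minute(sample_rate, sample_width_bytes, channels):
    """Uncompressed PCM data size for one minute of audio, in bytes."""
    return sample_rate * sample_width_bytes * channels * 60


# Assumed example settings: 16-bit samples in both cases.
cd_style = pcm_bytes_per_minute(44100, 2, 2)   # stereo, 44.1 kHz
speech = pcm_bytes_per_minute(8000, 2, 1)      # mono, 8 kHz
print(cd_style, speech)

# Write one second of 8 kHz mono silence to an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 8000)

# Read the header back and derive the duration from it.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    frames = reader.getnframes()
    duration_s = frames / reader.getframerate()
print(frames, duration_s)
```

Under these assumptions, a minute of CD-style audio takes roughly ten megabytes before any compression, which is why dictation systems tend to favour lower sample rates and lossy codecs.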
In most cases, the aim of someone using digital dictation is to create a document. Digital audio can be converted to text in two ways:
1) Manual transcription - A typist plays the audio in a digital transcription software application, normally controlling playback with a foot switch that allows the typist to PLAY, STOP, REWIND and BACKSPACE.
2) Voice recognition (VR) - A computer analyzes the audio using speech recognition algorithms in an attempt to transcribe the document. Unlike a human, the computer needs to be trained to create a voice profile unique to the author. The hope for voice recognition is that the author talks and the computer types, but in practice the technology does not work this reliably. Paradoxically, computers have difficulty accurately recognizing simple monosyllabic words such as CAR, BAR and JAR, which any four-year-old child can understand, yet accuracy increases dramatically with multi-syllable words. Because VR technology is far from perfect, a correction process is required to produce accurate documents. Either the author corrects the text while dictating (real-time correction), which increases the time the author takes to complete a document, or post-correction is used: VR is applied to a digital audio file, and a typist then corrects and formats the resulting text, errors included.
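The foot-switch controls described under manual transcription can be modelled as a small playback state. This is only a sketch: the class name `TranscriptionPlayer`, the three-second BACKSPACE step and the `advance` method that simulates elapsed playback are all assumptions made for illustration, not the behaviour of any particular transcription application.

```python
class TranscriptionPlayer:
    """Tracks playback position for a dictation file (hypothetical model)."""

    def __init__(self, duration_s, backspace_s=3.0):
        self.duration_s = duration_s
        self.backspace_s = backspace_s  # assumed size of one BACKSPACE step
        self.position_s = 0.0
        self.playing = False

    def play(self):
        self.playing = True

    def stop(self):
        self.playing = False

    def rewind(self, seconds):
        """Move back by `seconds`, never before the start of the file."""
        self.position_s = max(0.0, self.position_s - seconds)

    def backspace(self):
        """Step back a short fixed interval to re-hear the last words."""
        self.rewind(self.backspace_s)

    def advance(self, elapsed_s):
        """Simulate `elapsed_s` seconds passing; moves only while playing."""
        if self.playing:
            self.position_s = min(self.duration_s, self.position_s + elapsed_s)


player = TranscriptionPlayer(duration_s=60.0)
player.play()
player.advance(10.0)
player.backspace()
player.stop()
player.advance(5.0)  # no movement while stopped
print(player.position_s)
```

In an actual application each pedal or keyboard shortcut would call one of these methods, and the audio engine would supply the elapsed time rather than a manual `advance` call.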
External links
- TransScriber is a small free utility for transcription that controls audio playback without a foot pedal. Playback can be paused, rewound and slowed down with keyboard shortcuts, without leaving the word processor.