Automatic speech

Automatic speech (also known as embolalia) refers to the verbalization of words or phrases without the conscious effort of the individual.^[1] This type of speech often serves as verbal filler in the middle of a presentation or conversation.^[2] It consists of words that are not directly under the control of a person's conscious mind and are spoken without thought. Such speech includes the false starts, hesitations and repetitions that accompany the words speakers plan and utter in coherent sentences,^[3] as well as filler words (such as "Like", "Er" and "Uhm").

The word embolalia comes from the Greek word embolos, which means 'something thrown in', from the word emballo- meaning 'to throw in',^[4] and -lalia meaning 'speech, chattering and babbling; abnormal or disordered form of speech'.^[5]

Background

Modern linguists, led by Leonard Bloomfield in 1933, call these "hesitation forms": the sounds of stammering (uh), stuttering (um, um), throat-clearing (ahem!) and stalling (well, um, that is), interjected when the speaker is groping for words or at a loss for the next thought.^[6]

French psychiatrist Jules Séglas, on the other hand, defined embolalia as "the regular addition of prefixes or suffixes to words", and mentioned that the behavior is sometimes used by normal individuals to demonstrate to their interlocutor that they are paying attention to the conversation.^[7]

Harry Levin and Irene Silverman called automatic speech "vocal segregates" in their 1965 paper on hesitation phenomena. They found in their experiments with children that these segregates seem to be less voluntary hesitation phenomena and may be signs of uncontrolled emotionality under stress.^[8]

The Irish poet William Butler Yeats argued for automatic speech experiments with his wife,^[9] which provided him with symbols for his poetry as well as literary theories.^[10]

Characteristics of automatic speech

Linguistic features

Morphology and phonology

Filled pauses

Filled pauses consist of repetitions of syllables and words, reformulations or false starts in which speakers rephrase their speech to fit the representation they best perceive, grammatical repairs, and partial repeats in which speakers are often searching their lexicon for the right words to carry across their intended meaning.^[11] There are three distinct forms of filled pauses: (i) an elongated central vowel only; (ii) a nasal murmur only; and (iii) a central vowel followed by a nasal murmur.^[12] Although a schwa-like quality [ə:] appears to be the most commonly used, some speakers consistently use the neutral vowel [ɨ:] instead, and others use both vowels in the same sentence, depending on the quality of the last vowel of the previous word.^[12] Filled-pause vocalizations may be built around central vowels, and speakers may differ in their preferences, but they do not appear to behave as other words in the language do.^[12] The lengthening of words ending in a coronal fricative, for instance, can be achieved by prolonging the entire rhyme, the fricative only, or both.^[12] Most of the time, however, the neutral vowel [ɨ:] is appended to achieve the desired effect.^[12]

Prolonged pauses

As with filled pauses, single occurrences of prolonged pauses between stretches of fluent speech may be preceded and followed by silent pauses, as they most often occur on function words with a CV or V structure.^[12] Even though they are not always central, the vowels of such syllables may be as long as the ones observed for filled pauses.^[12]

Retraced and unretraced restarts

Riggenbach’s 1991 study of fluency development in Chinese learners of English included an analysis of repair phenomena, among them retraced restarts and unretraced restarts.^[13] Retraced restarts are reformulations in which a portion of the original utterance is duplicated.^[13] They can involve either repetition, that is, the precise adjacent duplication of a sound, syllable, word or phrase, or insertion, which refers to a retraced restart with the addition of new unretraced lexical items.^[13] Conversely, unretraced restarts are reformulations that reject the original utterance; they are also known as false starts.^[13]
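
The definition of repetition as "precise adjacent duplication" can be illustrated with a minimal sketch. The function below is a hypothetical illustration only, not a method taken from Riggenbach's study: it assumes the utterance has already been transcribed and split into lowercase word tokens, and it looks only for word or phrase sequences that are immediately repeated. The name `find_adjacent_repetitions` and the `max_len` parameter are assumptions introduced for this example.

```python
def find_adjacent_repetitions(tokens, max_len=3):
    """Return (start, length) pairs for token runs that are immediately repeated.

    Hypothetical helper: `tokens` is a list of already-normalized words, and
    `max_len` caps the length (in words) of a repeated phrase to look for.
    """
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest repeated phrase starting at position i.
        for n in range(max_len, 0, -1):
            first = tokens[i:i + n]
            second = tokens[i + n:i + 2 * n]
            if len(second) == n and first == second:
                hits.append((i, n))
                i += n  # skip the first copy and continue from the second
                matched = True
                break
        if not matched:
            i += 1
    return hits


utterance = "i i went to the to the store".split()
print(find_adjacent_repetitions(utterance))
```

A sketch like this only captures exact repetition at the word level. Insertions and unretraced restarts, as described above, would need additional information about which material the speaker abandoned, which a plain token list does not provide.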

Semantics and pragmatics

The semantics of automatic speech have often been debated, and to date there is no consensus on whether filler words are intentional in speech, and on whether they should be considered words or are simply side effects of difficulties in speakers' planning of speech. Bailey & Ferreira’s (2007) paper^[14] found that there is little evidence to suggest that the use of filler words is intentional in speech, and that they should not be considered words in the conventional sense.

Filler words consist of "Non-lexical fillers" and "Lexical fillers".^[13] "Non-lexical fillers" are fillers that are not words, and "Lexical fillers" are fillers that are words; both types are thought to contain little or no semantic information.^[13] However, some filler words are used to express certain speech acts. "Yeah", a "Lexical filler", is used to give affirmation, to introduce a new topic and to show the speaker's perception and understanding, and it occurs after a speech management problem when the speaker does not know how to continue.^[15] Fillers like "Mmmm", a "Non-lexical filler", and "Well", a "Lexical filler", are also said to signal the listener's understanding of the information provided.^[15]

Research has shown that people were less likely to use automatic speech in general topics and in domains they were better versed in, because they were more adept at selecting the appropriate terms.^[16] To date, there is insufficient research to say whether fillers are a part of integral meaning or an aspect of performance,^[17] but they do appear to be useful in facilitating information for the listener.^[3]

Syntax

Automatic speech is more likely to occur at the beginning of an utterance or phrase, presumably because there is a greater demand on planning processes at these junctures.^[18] Features of automatic speech, like filled pauses or repetitions, are most likely to occur immediately prior to the onset of a complex syntactic constituent.^[19] Filled pauses are also likely after the initial word in a complex constituent, especially after function words.^[19] Therefore, listeners might be able to use the presence of a recent filled pause to predict that an ambiguous structure should be resolved in favor of a more complex analysis.^[14]

There are several different types of automatic speech. One type is relatively universal, often transcending differences in language and to some degree culture. Simple fillers like “Uhm,” “Uh,” or “Er” are used by many different people in many different settings.^[20] For the most part, these types of fillers are considered innocuous, and are often overlooked by listeners, as long as they are not utilized so often that they overshadow the remainder of the conversation.^[21]

Other forms of automatic speech are ingrained within specific cultures, and in fact are sometimes considered an identifying characteristic of people who share a particular religion, or live in a specific geographical region.^[21] Along with accents, automatic speech of this type is sometimes considered colorful and somewhat entertaining. Writers often make use of this type of speech to give the characters in their writings additional personality, helping to make them unique.^[22]

Fluency

A study by Dechert (1980) of the speech performance of a German student of English revealed a tendency for speech pauses to fall at breaks consistent with “episodic units”.^[23] Dechert (1980) found that the more fluent utterances exhibited more pauses at those junctures and fewer within the "episodic units", leading him to posit that the subject was able to use the narrative structure to pace his speech with natural breaks, during which he could search for the words and phrases to follow.^[23]

By comparing story retellings by second language learners, Lennon (1984) discovered notable disparities in the distribution of pauses between retellings in the subjects’ first and second languages.^[24] In first-language retellings, all of the pauses were located either at clause breaks or following nonintegral components of the clause, with no pauses within clauses.^[24] Narrators speaking in their second language, on the other hand, exhibited a different pattern, with a higher frequency of pauses within clauses. Lennon concluded that these speakers seem to be “planning within clauses as well as in suprasegmental units”, and hence that the occurrence of pauses within clauses, rather than at the boundaries between clauses, could well be an indicator distinguishing fluent from nonfluent speech.^[24]

Discourse features

Cognitive load

Cognitive load is an important predictor of automatic speech.^[3] More disfluency is found in longer utterances^[25] and when the topic is unfamiliar.^[3] Wood suggested that when a high degree of cognitive load occurs, such as during expository speech or impromptu descriptions of complex interrelated topics, even native speakers can suffer from disfluency.^[26]

Speech rate

Speech rate is also closely related to speakers' cognitive load.^[27] Depending on the cognitive load, a speaker's utterances are produced faster or slower than the relatively fixed rate at which that speaker usually talks.^[27] For example, speech becomes slower when the speaker has to make unanticipated choices, and tends to accelerate when words are being repeated.^[27] In fast conditions, the cognitive processes that produce a phonetic plan fail to keep up with articulation, so the articulation of the existing plan is restarted,^[28] resulting in the repetition of words; such repetitions become more likely in fast speech, whereas fillers do not.^[3]

Frequency of words

In Beattie and Butterworth’s (1979) study, low-frequency content words and those rated as contextually improbable were preceded by hesitations such as fillers.^[29] Speakers who choose to use low-frequency words are aware of doing so and are more likely to be disfluent.^[29] This is further supported by Schnadt and Corley, who found that prolongations and fillers increased on words just before multiple-named or low-frequency items.^[18]

Domain (addressor vs. addressee)

Humans are found to be more disfluent overall when addressing other humans than when addressing machines.^[30] More instances of automatic speech are found in dialogues than in monologues.^[30] The different roles the addressor plays (such as a sister, a daughter or a mother) greatly influence the number of disfluencies, particularly fillers, produced, regardless of length or complexity.^[31]

Functions

Comprehension cues

There is general agreement that disfluencies are accompanied by important modifications at both the segmental and prosodic levels, and that speakers and listeners use such cues systematically and meaningfully. Thus they appear to be universal linguistic devices that are similar to other devices, controlled by the speaker and regulated by language-specific constraints.^[12] In addition, speech disfluencies such as fillers can help listeners to identify upcoming words.^[32]

While automatic speech can serve as a useful cue that more is to come, some people do develop an unconscious dependence on these filler words.^[33] When this is the case, it is necessary to correct the problem by making the speaker aware of their over-reliance on automatic speech production and by training the person to make more efficient use of other verbal strategies. As the individual gains confidence and has less need for filler words, the predilection toward automatic speech can gradually diminish.^[22]

A study by Fox Tree (2001)^[34] showed that both English and Dutch listeners were faster to identify a word in a carrier sentence when it was preceded by an “Uh” than when it was not, which suggested that different fillers have different effects, as they might be conveying different information.^[3]

Fischer and Brandt-Pook also found that discourse particles mark thematic breaks, signal the relatedness between the preceding and following utterance, indicate whether the speaker has understood the content communicated, and support the formulation process by signalling possible problems in speech management.^[15]

While fillers might give listeners cues about the information being conveyed, Bailey & Ferreira's study^[35] made a distinction between "Good Cues" and "Bad Cues" in facilitating listeners’ comprehension. A "Good Cue" leads the listener to correctly predict the onset of a new constituent (Noun Phrase, Verb Phrase), whereas a "Bad Cue" leads the listener to incorrectly predict the onset of a new constituent.^[35] "Good Cues" make it easier for listeners to process the information they have been presented with, while "Bad Cues" make it harder for listeners to process the relevant information.^[35]

There is strong empirical evidence that speakers use automatic speech in similar ways across languages and that it plays a fundamental role in the structuring of spontaneous speech. It is used to achieve better synchronization between interlocutors by announcing upcoming topic changes, delays related to planning load or preparedness problems, and the speaker’s intention to take or give the floor or to revise or abandon an expression already presented.^[12]

Communicative goals

A study conducted by Clark and Fox Tree (2002)^[36] mentioned that parts of automatic speech, such as fillers, serve a communicative function and are considered integral to the information the speaker tries to convey, although they do not add to the propositional content or the primary message.^[36] Instead, they are considered part of a collateral message in which the speaker is commenting on their own performance.^[36] Speakers produce filled pauses (e.g. "Uh" or "Um") for a variety of reasons, including the intention to discourage interruptions or to gain additional time to plan utterances.^[14]

Hesitation forms can also serve an attention-impelling function,^[6] and another of their purposes may be to dissociate oneself slightly from the harsh reality of what is to follow.^[6] With a beat of time filled with a meaningless interjection, uncommitted people who are "into distancing" use such automatic speech to create a little distance between themselves and their words, as if it might lessen the impact of their words.^[6]

However, not all forms of automatic speech are considered appropriate or harmless. Some examples of automatic speech production lean towards being offensive, for instance the use of anything considered to be profanity within a given culture.^[22]

In this form, the speech usually takes the shape of swear words inserted into the sentence structure used to convey various ideas. At times, this use of automatic speech comes about because the individual is greatly distressed or angry.^[22] However, there are situations where swear words are inserted unconsciously even if the individual is extremely happy.^[22] When the use of swear words is called to the attention of the individual, he or she may not even have been aware of using such automatic speech.^[22]

Neurological basis

Medical cases

Aphasia

Many patients who suffer from aphasia retain the ability to produce automatic speech, which often consists of conversational placeholders like "um" and "er". The automatic speech of aphasics can include swear words; in some cases, patients are unable to create words or sentences, but they are able to swear. Also, the ability to pronounce other words can change and evolve during the process of recovery, while the pronunciation and use of swear words remain unchanged.^[37]

Patients who are affected by transcortical sensory aphasia, a rare form of aphasia, have been found to exhibit automatic speech that is characterised by “lengthy chunks of memorized material”.^[38]

Apraxia

Apraxia can also occur in conjunction with dysarthria (muscle weakness affecting speech production) or aphasia (language difficulties related to neurological damage).^[39]

One of the articulatory characteristics of apraxia found in adults includes speech behavior that "exhibits fewer errors with automatic speech than volitional speech".^[40] Developmental verbal dyspraxia has also been found to have more effect on volitional speech than on automatic speech.^[41]

The characteristics of apraxia of speech include difficulty imitating speech sounds, difficulty imitating non-speech movements (such as sticking out the tongue), groping for sounds, inconsistent errors, a slow rate of speech and, in severe cases, the inability to produce any sounds. However, patients who suffer from apraxia of speech may retain the ability to produce automatic speech, such as “thank you” or “how are you?”.^[39]

Developmental coordination disorder

Developmental coordination disorder is a chronic neurological disorder that affects the voluntary movements of speech.^[42] A child affected by developmental coordination disorder may be able to say certain words or phrases spontaneously, which constitutes a form of automatic speech, but may be unable to repeat them on request, which reflects an inability to produce that form of voluntary speech.^[42]

References

[1] About.com - What is Embolalia
[2] Ward, N. (2000). "Issues in the Transcription of English Conversational Grunts". In 1st SIGdial Workshop on Discourse and Dialogue.
[3] Corley, M.; Stewart, O. W. (2008). "Hesitation Disfluencies in Spontaneous Speech: The Meaning of um". Language and Linguistics Compass: 589–602.
[4] Mondofacto - Etymology of embolia
[5] Wordinfo - Definition of embolalia
[6] Safire, William (16 June 1991). "On Language; Impregnating the Pause". The New York Times. p. 8.
[7] Obler, Loraine K.; Albert, Martin L. (1985), "Historical Note: Jules Seglas on Language in Dementia", Brain and Language 24 (2): 314–325, doi:10.1016/0093-934X(85)90138-5
[8] Levin, Harry; Silverman, Irene (1965), "Hesitation Phenomena in Children's Speech", Language and Speech 8 (2): 67–85, doi:10.1177/002383096500800201
[9] Dekel, Gil (2008), "Wordless Silence of Poetic Mind: Outlining and Visualising Poetic Experiences through Artmaking", Forum: Qualitative Social Research 9 (2)
[10] An Overview of Yeats A Vision
[11] Freed, B. (1995). Second Language Acquisition in a Study Abroad Context. Amsterdam/Philadelphia: John Benjamins Publishing Company.
[12] Moniz, H.; Mata, A. I.; Viana, M. C. (2007). "On Filled Pauses and Prolongations in European Portuguese". Interspeech: 2645–2648.
[13] Riggenbach, H. (1991). "Towards an understanding of fluency: A microanalysis of nonnative speaker conversations". Discourse Processes 14: 423–41.
[14] Bailey, Karl G. D.; Ferreira, Fernanda (2007), "The processing of filled pause disfluencies in the visual world", Eye Movements a Window on Mind and Brain: 487–502, ISBN 9780080449807
[15] Fischer, K.; Brandt-Pook, H. (1998), Automatic Disambiguation of Discourse Particles, pp. 107–113
[16] Schachter, S.; Rauscher, F.; Christenfeld, N.; Tyson Crone, K. (1994). "The vocabularies of academia". Psychological Science 5: 37–41.
[17] Brennan, S. E.; Williams, M. (1995), "The feeling of another’s knowing: Prosody and filled pauses as cues to listeners about the metacognitive states of speakers", Journal of Memory and Language 34: 383–398, doi:10.1006/jmla.1995.1017
[18] Schnadt, M. J.; Corley, M. (submitted). Buying time in spontaneous speech: How speakers accommodate lexical difficulty.
[19] Clark, H. H.; Wasow, T. (1998). "Repeating words in spontaneous speech". Cognitive Psychology: 201–242. doi:10.1006/cogp.1998.0693.
[20] Blackwell Reference Online - Formulaic Sequences and Language Disorders.
[21] Kuiper, K. (2000). "On the Linguistic Properties of Formulaic Speech". Oral Tradition: 279–305.
[22] Wisegeek.com - What is Automatic Speech
[23] Dechert, H. W. (1980). Pauses and intonation as indicators of verbal planning in second-language speech productions: Two examples from a case study. In Dechert, H. W. & Raupach, M. (Eds.), Temporal variables in speech (pp. 271–285).
[24] Lennon, P. (1984). Retelling a story in English. In H. W. Dechert, D. Möhle, & M. Raupach (Eds.), Second Language Productions (pp. 50–68). Tübingen: Gunter Narr Verlag.
[25] Shriberg, E. (1996), "Disfluencies in Switchboard", Proceedings, International Conference on Spoken Language Processing, Addendum: 11–14
[26] Wood, David (1 September 2010). Formulaic Language and Second Language Speech Fluency: Background, Evidence and Classroom Applications. Continuum International Publishing Group. ISBN 978-1-4411-5819-2. Retrieved 23 March 2012.
[27] O'Shaughnessy, D. (1995), "Timing patterns in fluent and disfluent spontaneous speech", Acoustics, Speech, and Signal Processing 1: 600–603, doi:10.1109/ICASSP.1995.479669
[28] Blackmer, Elizabeth R.; Mitton, Janet L. (1991), "Theories of monitoring and the timing of repairs in spontaneous speech", Cognition 39 (3): 173–194, doi:10.1016/0010-0277(91)90052-6
[29] Beattie, G. W.; Butterworth, B. L. (1979), "Contextual probability and word frequency as determinants of pauses and errors in spontaneous speech", Language and Speech 22: 201–211
[30] Oviatt, S. (1995), "Predicting spoken disfluencies during human-computer interaction", Computer Speech and Language 9: 19–35, doi:10.1006/csla.1995.0002
[31] Bortfeld, H.; Leon, S. D.; Bloom, J. E.; Schober, M. F.; Brennan, S. E. (2001), "Disfluency rates in conversation: Effects of age, relationship, topic, role, and gender", Language and Speech 44: 123–147, doi:10.1177/00238309010440020101
[32] Brennan, S. E.; Schober, M. F. (2001), "How listeners compensate for disfluencies in spontaneous speech", Journal of Memory and Language 44: 274–296, doi:10.1006/jmla.2000.2753
[33] Yang, Li-Chiung (2001), "Visualizing Spoken Discourse: Prosodic Form and Discourse Functions of Interruptions", In Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing, doi:10.3115/1118078.1118106
[34] Fox Tree, J. E. (2001). "Listeners’ uses of um and uh in speech comprehension". Memory & Cognition 29: 320–326.
[35] Bailey, K. G. D.; Ferreira, F. (2003), "Disfluencies influence syntactic parsing", Journal of Memory and Language 49: 183–200, doi:10.1016/s0749-596x(03)00027-5
[36] Clark, H. H.; Fox Tree, J. E. (2002), "Using uh and um in spontaneous speaking", Cognition 84: 73–111, doi:10.1016/S0010-0277(02)00017-3, PMID 12062148
[37] Wilson, Tracy V. (2005). How Swearing Works. HowStuffWorks.com. Retrieved on 9 March 2012.
[38] McCaffrey, Patrick. Transcortical Sensory Aphasia. The Neuroscience on the Web Series: Neuropathologies of Language and Cognition
[39] Britchkow, Ela. (2005). Apraxia. Speakeffectively.com
[40] Ogar, J.; Slama, H.; Dronkers, N.; Amici, S.; Gorno-Tempini, M. L. (2005), "Apraxia of Speech: An overview", Neurocase 11: 427–432, doi:10.1080/13554790500263529, PMID 16393756
[41] Velleman, Shelley L. Childhood apraxia of speech (developmental verbal dyspraxia). Retrieved on 9 March 2012.
[42] Portelli, J., "Developmental Verbal Dyspraxia", Association of Speech and Language Pathologists of Malta