Phonological Change

Cognitive Pressures on the Sound System


In this chapter, we continue looking at sound change, but now we consider sounds as part of a phonological system. As phonemes are defined contrastively and function to distinguish meaning, the phonological system may react when phonetic change threatens existing distinctions. This may stop change from happening in the first place, or it may set in motion a chain of related sound changes. The Great Vowel Shift is an example of such a chain shift, and we discuss a number of ongoing chain shifts in different varieties of English. However, articulatorily driven change may also override the system and cause categories to merge or split. A closer look at the foot/strut and the trap/bath splits in the history of English forces us to reconsider the universality of Neogrammarian sound change and to discover lexical diffusion as a different pathway of change.


Note here that we are talking about the phoneme system of Old English. The voiceless fricative phonemes were pronounced as voiced fricatives in certain contexts, but as these contexts were predictable and there were no contrasts between voiced and voiceless fricatives as there were between voiced and voiceless stops, we say that Old English did not have voiced fricatives as a separate category.

Because Grimm’s Law happened so long ago, and the end result of a push chain or drag chain is ultimately the same, we simply do not know for sure what type of chain shift Grimm’s Law really was.
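
To make that point concrete, here is a minimal Python sketch; the segmented proto-forms and the Grimm's-Law-like correspondences are simplified labels chosen purely for illustration, not a serious reconstruction. Because each etymological category is relabelled exactly once, the same old-to-new mapping results whether we imagine the shift as a drag chain or a push chain, which is why the outcome alone cannot tell us which mechanism was at work.

```python
# Toy chain shift with Grimm's-Law-like labels (illustration only).
# Drag-chain reading: p > f opens a gap, dragging b > p, dragging bh > b.
# Push-chain reading: bh > b crowds b, pushing b > p, pushing p > f.
# Either way, the relation between old and new categories is the same
# one-to-one mapping, so no merger occurs and the end states are identical.
CHAIN = {
    "p": "f",    # voiceless stop  -> voiceless fricative
    "b": "p",    # voiced stop     -> voiceless stop
    "bh": "b",   # voiced aspirate -> plain voiced stop
}

def apply_chain(word):
    """Relabel each etymological segment exactly once (no feeding between steps)."""
    return [CHAIN.get(seg, seg) for seg in word]

# Invented proto-forms, segmented as lists of symbols.
for w in [["p", "a", "t"], ["b", "a", "t"], ["bh", "a", "t"]]:
    print("".join(w), "->", "".join(apply_chain(w)))
# pat  -> fat
# bat  -> pat
# bhat -> bat   (three-way contrast preserved, no merger)
```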

In present-day broad Australian English, /iː/ is phonetically produced as [ɪi], providing some evidence for this first step. There are potentially articulatory reasons for this: it is difficult to sustain precisely the same tongue position over a ‘longer’ period of time (some 100–200 milliseconds), and the slightly more lax start of the vowel represents the stage before the tongue reaches the (tense) target position.

These examples come from a short (2:27) YouTube video from an early-2000s documentary where William Labov explains some of the experiments he has done around the NCS: https://www.youtube.com/watch?v=9UoJ1-ZGb1w.

Word-initial /ð/ may also surface as /d/, so that /dɪs/ and /vɪs/ are both alternative pronunciations for this.

See Martinet (1952, pp. 3–12) for an early discussion in English of the concept, albeit as ‘functional yield’. The concept had earlier been described in French and German by, among others, Jules Gilliéron, Roman Jakobson, and Nikolaj Trubetzkoy (Wedel et al. 2013).
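
One way to make functional load tangible is to count how many minimal pairs a given contrast keeps apart in a lexicon; Wedel et al. (2013) use more sophisticated corpus-based measures, but the idea is the same. Below is a minimal Python sketch with a tiny invented word list of broad transcriptions, used purely for illustration.

```python
from itertools import combinations

# Tiny toy lexicon of broadly transcribed words (invented for illustration).
lexicon = ["fæn", "væn", "fin", "vin", "θɪn", "tɪn", "sɪp", "zɪp", "sɪt"]

def minimal_pairs(words, a, b):
    """Word pairs that differ in exactly one segment, where that segment is a vs b."""
    pairs = []
    for w1, w2 in combinations(words, 2):
        if len(w1) != len(w2):
            continue
        diffs = [(x, y) for x, y in zip(w1, w2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {a, b}:
            pairs.append((w1, w2))
    return pairs

# A crude functional-load estimate: how much lexical work does each contrast do?
for a, b in [("f", "v"), ("s", "z"), ("θ", "t")]:
    print(f"{a} vs {b}:", minimal_pairs(lexicon, a, b))
# f vs v: [('fæn', 'væn'), ('fin', 'vin')]
# s vs z: [('sɪp', 'zɪp')]
# θ vs t: [('θɪn', 'tɪn')]
```

On this view, a contrast that distinguishes many pairs (high functional load) is more resistant to merger than one that distinguishes few.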

The most spectacular example of a merger is the Modern Greek phoneme /i/, which is the end result of historical mergers involving what were originally nine different vowels in Ancient Greek (Johnson 2010, p. 1).

Long /uː/ itself was the result of the Great Vowel Shift applied to Middle English /oː/. This is why many words with [ʌ] or [ʊ] are spelled with ⟨oo⟩.

Around the same time as the trap/bath split, words from the lot lexical set underwent lengthening in similar conditions, resulting in the cloth lexical set. This also appears to have progressed with lexical diffusion and was later partially reversed, so that the outcome is even more haphazard (Wells 1982, p. 234).

Barras, Will. 2006. The square–nurse merger in Greater Manchester: The impact of social and spatial identity on phonological variation. MA thesis, University of Edinburgh.


Bauer, Laurie. 1992. The second Great Vowel Shift revisited. English World-Wide 13 (2): 253–268. https://doi.org/10.1075/eww.13.2.04bau .


Becker, Kara, ed. 2019. The Low-Back-Merger Shift: Uniting the Canadian Vowel Shift, the California Vowel Shift, and short front vowel shifts across North America (Publications of the American Dialect Society 104). Durham, NC: American Dialect Society.

Boberg, Charles. 2008. Regional phonetic differentiation in Standard Canadian English. Journal of English Linguistics 36 (2): 129–154. https://doi.org/10.1177/0075424208316648 .

Boberg, Charles. 2019. A closer look at the short front vowel shift in Canada. Journal of English Linguistics 47 (2): 91–119. https://doi.org/10.1177/0075424219831353 .

Brand, James, Jen Hay, Lynn Clark, Kevin Watson, and Márton Sóskuthy. 2021. Systematic co-variation of monophthongs across speakers of New Zealand English. Journal of Phonetics 88: 101096. https://doi.org/10.1016/j.wocn.2021.101096 .

Campbell, Lyle. 2004. Historical linguistics: An introduction , 2nd ed. Edinburgh: Edinburgh University Press.

Chambers, J.K. 1992. Dialect acquisition. Language 68 (4): 673–705. https://doi.org/10.2307/416850 .

Cheng, Chin-Chuan, and William S.-Y. Wang. 1977. Tone change in Chao-zhou Chinese: A study in lexical diffusion. In The lexicon in phonological change , ed. William S.-Y. Wang, 86–100. Den Haag: Mouton. https://doi.org/10.1515/9783110802399.86 .

Clarke, Sandra, Ford Elms, and Amani Youssef. 1995. The third dialect of English: Some Canadian evidence. Language Variation and Change 7 (2): 209–228. https://doi.org/10.1017/S0954394500000995 .

D’Onofrio, Annette, and Jaime Benheim. 2020. Contextualizing reversal: Local dynamics of the Northern Cities Shift in a Chicago community. Journal of Sociolinguistics 24 (4): 469–491. https://doi.org/10.1111/josl.12398 .

Eckert, Penelope. 2001. Style and social meaning. In Style and sociolinguistic variation , ed. Penelope Eckert and John R. Rickford, 119–126. Cambridge: Cambridge University Press.

Harrington, Jonathan, Sallyanne Palethorpe, and Catherine Watson. 2000. Monophthongal vowel changes in Received Pronunciation: An acoustic analysis of the Queen’s Christmas broadcasts. Journal of the International Phonetic Association 30 (1–2): 63–78. https://doi.org/10.1017/S0025100300006666 .

Hay, Jennifer B., Janet B. Pierrehumbert, Abby J. Walker, and Patrick LaShell. 2015. Tracking word frequency effects through 130 years of sound change. Cognition 139: 83–91. https://doi.org/10.1016/j.cognition.2015.02.012 .

Horvath, Barbara M., and Ronald J. Horvath. 2001. A geolinguistics of short A in Australian English. In English in Australia , ed. David Blair and Peter Collins, 341–355. Amsterdam: Benjamins.


Howell, Robert B. 2006. Immigration and koineisation: The formation of Early Modern Dutch urban vernaculars. Transactions of the Philological Society 104 (2): 207–227. https://doi.org/10.1111/j.1467-968X.2006.00169.x .

Johnson, Daniel Ezra. 2010. Stability and change along a dialect boundary: The low vowels of Southeastern New England (Publications of the American Dialect Society 95). Durham, NC: Duke University Press.

Labov, William. 1994. Principles of linguistic change. Volume 1: Internal factors. Oxford: Blackwell.

Labov, William, Sharon Ash, and Charles Boberg. 2006. The Atlas of North American English: Phonetics, phonology and sound change . Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110167467 .


Lass, Roger. 1992. What, if anything, was the Great Vowel Shift? In History of Englishes: New methods and interpretations in historical linguistics , ed. Matti Rissanen, Ossi Ihalainen, Terttu Nevalainen, and Irma Taavitsainen, 145–155. Berlin: Mouton de Gruyter.

Lindqvist, Christer. 2003. Thesen zur Kausalität und Chronologie einiger färöischer Lautgesetze. Arkiv för nordisk filologi 118: 89–178.

Lindsey, Geoff. 2019. English after RP: Standard British pronunciation today . London: Palgrave Macmillan.

Maguire, Warren, Lynn Clark, and Kevin Watson. 2013. Introduction: What are mergers and can they be reversed? English Language and Linguistics 17 (2): 229–239. https://doi.org/10.1017/S1360674313000014 .

Martinet, André. 1952. Function, structure, and sound change. Word 8 (1): 1–32. https://doi.org/10.1080/00437956.1952.11659416 .

Minkova, Donka. 2014. A historical phonology of English . Edinburgh: Edinburgh University Press.

Mitchell, Alexander G. 1958. Spoken English . London: Macmillan.

Moulton, William G. 1962. Dialect geography and the concept of phonological space. Word 18 (1–3): 23–32. https://doi.org/10.1080/00437956.1962.11659763 .

Natvig, David, and Joseph Salmons. 2021. Connecting structure and variation in sound change. Cadernos de Linguística 2 (1): 1–20. https://doi.org/10.25189/2675-4916.2021.V2.N1.ID314 .

Nycz, Jennifer. 2013. New contrast acquisition: Methodological issues and theoretical implications. English Language and Linguistics 17 (2): 325–357. https://doi.org/10.1017/S1360674313000051 .

Podesva, Robert J. 2011. The California vowel shift and gay identity. American Speech 86 (1): 32–51. https://doi.org/10.1215/00031283-1277501 .

Podesva, Robert J., Annette D’Onofrio, Janneke Van Hofwegen, and Seung Kyung Kim. 2015. Country ideology and the California Vowel Shift. Language Variation and Change 27 (2): 157–186. https://doi.org/10.1017/S095439451500006X .

Roeder, Rebecca V., and Matt Hunt Gardner. 2013. The phonology of the Canadian Shift revisited: Thunder Bay & Cape Breton. University of Pennsylvania Working Papers in Linguistics 19 (2): 161–170. https://repository.upenn.edu/pwpl/vol19/iss2/18.

Roeder, Rebecca V., and Matt Hunt Gardner. 2022. A unified account of the low back merger shift. Paper presented at Methods in Dialectology XVII, Mainz, August 1.

Siegel, Jeff. 2010. Second dialect acquisition . Cambridge: Cambridge University Press.

Smith, Jeremy J. 2007. Sound change and the history of English . Oxford: Oxford University Press.

Sóskuthy, Márton. 2015. Understanding change through stability: A computational study of sound change actuation. Lingua 163: 40–60. https://doi.org/10.1016/j.lingua.2015.05.010 .

Stockwell, Robert P., and Donka Minkova. 1988. The English vowel shift: Problems of coherence and explanation. In Luick revisited: Papers read at the Luick-Symposium at Schloß Liechtenstein, 15.–18.9.1985 , ed. Dieter Kastovsky and Gero Bauer, 355–394. Tübingen: Narr.

Todd, Simon, Janet B. Pierrehumbert, and Jennifer Hay. 2019. Word frequency effects in sound change as a consequence of perceptual asymmetries: An exemplar-based model. Cognition 185: 1–20. https://doi.org/10.1016/j.cognition.2019.01.004 .

Torgersen, Eivind, and Paul Kerswill. 2004. Internal and external motivation in phonetic change: Dialect levelling outcomes for an English vowel shift. Journal of Sociolinguistics 8 (1): 23–53. https://doi.org/10.1111/j.1467-9841.2004.00250.x .

Trudgill, Peter, Elizabeth Gordon, Gillian Lewis, and Margaret MacLagan. 2000. The role of drift in the formation of native-speaker Southern Hemisphere Englishes: Some New Zealand evidence. Diachronica 17 (1): 111–138. https://doi.org/10.1075/dia.17.1.06tru .

Turton, Danielle, and Maciej Baranowski. 2021. Not quite the same: The social stratification and phonetic conditioning of the foot – strut vowels in Manchester. Journal of Linguistics 57 (1): 163–201. https://doi.org/10.1017/S0222226720000122 .

van Loey, A. 1970. Schönfelds historische grammatica van het Nederlands: Klankleer, vormleer, woordvorming , 8th ed. Zutphen: Thieme.

Wang, William S.-Y. 1969. Competing changes as a cause of residue. Language 45 (1): 9–25. https://doi.org/10.2307/411748 .

Watts, Richard J. 2011. Language myths and the history of English . Oxford: Oxford University Press.

Wedel, Andrew, Abby Kaplan, and Scott Jackson. 2013. High functional load inhibits phonological contrast loss: A corpus study. Cognition 128: 179–186. https://doi.org/10.1016/j.cognition.2013.03.002 .

Wells, J.C. 1982. Accents of English . Cambridge: Cambridge University Press.

Westerberg, Fabienne. 2019. Swedish “Viby-i”: Acoustics, articulation, and variation. In Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 , ed. Sasha Calhoun, Paola Escudero, Marija Tabain, and Paul Warren, 3696–3700. Canberra: Australasian Speech Science and Technology Association Inc.


Author information

Authors and Affiliations

University of Groningen, Groningen, The Netherlands

Remco Knooihuizen


Corresponding author

Correspondence to Remco Knooihuizen.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Knooihuizen, R. (2023). Phonological Change. In: The Linguistics of the History of English. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-41692-7_4


DOI: https://doi.org/10.1007/978-3-031-41692-7_4

Published: 28 October 2023

Publisher Name: Palgrave Macmillan, Cham

Print ISBN: 978-3-031-41691-0

Online ISBN: 978-3-031-41692-7

eBook Packages: Social Sciences, Social Sciences (R0)



4.1 Phonemes and Contrast


Video script.

In the last couple of chapters, we’ve seen lots of ways that sounds can differ from each other: they can vary in voicing, in place and manner of articulation, in pitch or length. Within the mental grammar of each language, some of these variations are meaningful and some are not. Each language organizes these meaningful variations in different ways. Let’s look at some examples.

In the English word please , I could pronounce it with an ordinary voiced [l]: [pʰliz]. It would be a little unnatural, but it’s possible. Or, because of perseveratory assimilation, I could devoice that [l] and pronounce it [pʰl̥iz]. We’ve got two slightly different sounds here: both are alveolar lateral approximants, but one is voiced and one is voiceless. But whether I pronounce the word [pʰliz] or [pʰl̥iz], it means the same thing. The voicing difference in this environment is not meaningful in English, and most people never notice if the [l] is voiced or not.

In the words van and fan , each word begins with a labio-dental fricative. In van , the fricative is voiced and in fan it’s voiceless. In this case, the difference in voicing is meaningful: it leads to an entirely different word, and all fluent speakers notice this difference! Within the mental grammar of English speakers, the difference between voiced and voiceless sounds is meaningful in some environments but not in others.

Here’s another example. I could pronounce the word free with the ordinary high front tense vowel [i]. Or I could make the vowel extra long, freeeee . (Notice that we indicate a long sound with this diacritic [iː] that looks a bit like a colon.) But this difference is not meaningful: In English, both [fri] and [friː] are the same word. In Italian, a length difference is meaningful. The word fato means “fate”. But if I take that alveolar stop and make it long, the word fatto means a “fact”. The difference in the length of the stop makes [fatɔ] and [fatːɔ] two different words. (N.B., In the video there’s an error in how these two words are transcribed; it should be with the [a] vowel, not the [æ] vowel.)

So here’s the pattern that we’re observing. Sounds can vary; they can be different from each other. Some variation is meaningful within the grammar of a given language, and some variation is not.

Until now, we’ve been concentrating on phonetics: how sounds are made and what they sound like.  We’re now starting to think about phonology , which looks at how sounds are organized within the mental grammar of each language: which phonetic differences are meaningful, which are predictable, which ones are possible and which ones are impossible within each language. The core principle in phonology is the idea of contrast . Say we have two sounds that are different from each other.  If the difference between those two sounds leads to a difference in meaning in a given language, then we say that those two sounds contrast in that language.

So for example, the difference between fan and van is a phonetic difference in voicing. That phonetic difference leads to a substantial difference in meaning in English, so we say that /f/ and /v/ are contrastive in English. And if two sounds are contrastive in a given language, then those two sounds are considered two different phonemes in that language.

So here’s a new term in linguistics. What is a phoneme ? A phoneme is something that exists in your mind. It’s a mental category, into which your mind groups sounds that are phonetically similar and gives them all the same label. That mental category contains memories of every time you’ve heard a given sound and labelled it as a member of that category. You could think of a phoneme like a shopping bag in your mind. Every time you hear the segment [f], your mental grammar categorizes it by putting it in a bag labelled /f/. /v/ contrasts with /f/ — it’s a different phoneme, so every time you hear that [v], your mind puts it in a different bag, one labelled /v/.

If we look inside that shopping bag, inside the mental category, we might find some phonetic variation . But if the variation is not meaningful, not contrastive, our mental grammar does not treat those different segments as different phonemes. In English, we have a phonemic category for /l/, so whenever we hear the segment [l] we store it in our memory as that phoneme. But voiceless [l̥] is not contrastive: it doesn’t change the meaning of a word, so when we hear voiceless [l̥] we also put it in the same category in our mind. And when we hear a syllabic [l̩], that’s not contrastive either, so we put that in the same category. All of those [l]s are a little different from each other, phonetically, but those phonetic differences are not contrastive because they don’t lead to a change in meaning, so all of those [l]s are members of a single phoneme category in English.

Now, as a linguist, I can tell you that voiceless [f] and voiced [v] are two different phonemes in English, while voiceless [l̥] and voiced [l] are both different members of the same phoneme category in English. But as part of your developing skills in linguistics, you want to be able to figure these things out for yourself. Our question now is, how can we tell if two phonetically different sounds are phonemically contrastive? What evidence would we need? Remember that mental grammar is in the mind — we can’t observe it directly. So what evidence would we want to observe in the language that will allow us to draw conclusions about the mental grammar?

If we observe that a difference between two sounds — a phonetic difference — also leads to a difference in meaning, then we can conclude that the phonetic difference is also a phonemic difference in that language. So our question really is, how do we find differences in meaning?

What we do is look for a minimal pair . We want to find two words that are identical in every way except for the two segments that we’re considering. So the two words are minimally different: the only phonetic difference between them is the difference that we’re interested in. If we can find such a pair, where the minimal phonetic difference leads to a difference in meaning (that is, it’s contrastive), then we can conclude that the phonetic difference between them is a phonemic difference.

We’ve already seen one example of a minimal pair:   fan and van are identical in every way except for the first segment. The phonetic difference between [f] and [v] is contrastive; it changes the meaning of the word, so we conclude that /f/ and /v/ are two different phonemes. Can you think of other minimal pairs that give evidence for the phonemic contrast between /f/ and /v/? Take a minute, pause the video, and try to think of some.

Here are some more minimal pairs that I thought of for /f/ and /v/: vine and fine , veal and feel . Minimal pairs don’t have to have the segments that we’re considering at the beginning of the word.  Here are some pairs that contrast at the end of the word: have and half , serve and surf . Or the contrast can occur in the middle of the word, like in reviews and refuse . What’s important is that the two words are minimally different: they are the same in all their segments except for the two that we’re considering. And it’s also important to notice that the minimal difference is in the IPA transcription of the word, not in its spelling.
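
As a quick illustration of that last point, here is a small Python sketch. The broad transcriptions are my own and only illustrative, but they show that the minimal-pair test compares segments in the transcription, not letters in the spelling.

```python
def is_minimal_pair(t1, t2):
    """True if two transcriptions have the same length and differ in exactly one segment."""
    return len(t1) == len(t2) and sum(a != b for a, b in zip(t1, t2)) == 1

# Illustrative broad transcriptions (one plausible rendering; not authoritative).
words = {"fan": "fæn", "van": "væn", "half": "hæf", "have": "hæv",
         "refuse": "ɹəfjuz", "reviews": "ɹəvjuz"}

print(is_minimal_pair(words["fan"], words["van"]))         # True
print(is_minimal_pair(words["half"], words["have"]))       # True
print(is_minimal_pair(words["refuse"], words["reviews"]))  # True, despite very different spellings
print(is_minimal_pair("fæn", "vænz"))                      # False: an extra segment, so not a minimal pair
```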

So we’ve got plenty of evidence from all these minimal pairs that the phonetic difference between [f] and [v] leads to a meaning difference in English, so we can conclude that, in English, /f/ and /v/ are two different phonemes.

Essentials of Linguistics Copyright © 2018 by Catherine Anderson is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.



2.2 The Articulatory System

We speak by moving parts of our vocal tract (see Figure 2.1). These include the lips, teeth, mouth, tongue and larynx. The larynx or voice box is the basis for all the sounds we produce. It modifies the airflow to produce different frequencies of sound. By changing the shape of the vocal tract and airflow, we are able to produce all the phonemes of spoken language. There are two basic categories of sound that can be classified in terms of the way in which the flow of air through the vocal tract is modified. Phonemes that are produced without any obstruction to the flow of air are called vowels . Phonemes that are produced with some kind of modification to the airflow are called consonants . Of course, nature is not as clear-cut as all that, and we do make some sounds that are somewhere in between these two categories. These are called semivowels and are usually classified alongside consonants as they behave similarly to them.


While vowels do not require any obstruction of the airflow, the production of consonants does. This obstruction is produced by bringing some parts of the vocal tract into contact. These places of contact are known as places of articulation . As seen in Figure 2.2, there are a number of places of articulation involving the lips, teeth, and tongue. Sometimes paired articulators touch each other, as when the two lips come together to produce [b]. At other times, two different articulators come into contact, as when the lower lip is brought back against the upper teeth to produce [f]. The tongue can touch different parts of the vocal tract to produce a variety of consonants by touching the teeth, the alveolar ridge, hard palate or soft palate (or velum).


While these places of articulation are sufficient for describing how English phonemes are produced, other languages also make use of the glottis and epiglottis among other parts of the vocal tract. We will explore these in more detail later.



Image description

Figure 2.1 Parts of the Human Vocal Tract

A labeled image of the anatomical components of the human vocal tract, including the nasal cavity, hard palate, soft palate or velum, alveolar ridge, lips, teeth, tongue, uvula, esophagus, trachea, and the parts of the larynx, which include the epiglottis, vocal cords, and glottis.


Figure 2.2 Places of Articulation

A labeled image illustrating the anatomical components of the human vocal tract that are involved in English phonemes. These include the glottal, velar, palatal, dental, and labial structures.


Media Attributions

  • Figure 2.1 Parts of the Human Vocal Tract is an edited version of Mouth Anatomy by Patrick J. Lynch, medical illustrator, is licensed under a  CC BY 2.5 licence .
  • Figure 2.2 Places of Articulation is an edited version of Mouth Anatomy by Patrick J. Lynch, medical illustrator, is licensed under a  CC BY 2.5 licence .

Glossary

  • Vowel: a speech sound that is produced without complete or partial closure of the vocal tract.
  • Consonant: a speech sound that is produced with complete or partial closure of the vocal tract.
  • Semivowel (glide): a consonant that is phonetically similar to a vowel but functions as a consonant.
  • Place of articulation: the point of contact between the articulators.

Psychology of Language Copyright © 2021 by Dinesh Ramoo is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.



Difficulty with /r/ and Techniques for Dealing with this Phoneme

Julie Hoffmann, M.A., CCC-SLP

April 21, 2003.


The /r/ phoneme is one of the most difficult phonemes to remediate for clients with persistent, long-term /r/ problems. Identifying the exact nature of the problem with the /r/ production will allow you to choose appropriate remediation strategies for your client. Typical problems with incorrect /r/ productions include lip rounding, incorrect tongue placement, lack of tongue tension, a tongue position that is too low in the oral cavity, a tense jaw, poor tongue-jaw differentiation, jaw instability, and incorrect productions that have become patterned over time. Here are several therapy facilitation techniques that I have found successful for clients with persistent /r/ problems:

  • Teaching general awareness of the articulators (i.e., tongue, lips, oral cavity) and their functions with stimulation (flavored tongue depressors, pretzel sticks, small suckers, toothettes) and visual cues (mirror, tongue drawings). Decreasing hyposensitivity by brushing the sides of the tongue and inside of the upper molars with various textures (e.g., small toothbrush, toothette, tongue depressor, Popsicles) before and during practice of /r/ targets.
  • Eliminate lip rounding by having the client smile during /r/ productions. You could also place a small bite block (e.g., a coffee stirrer) between the lips while smiling, as the bite block will fall out if the lips are rounded (use a mirror so the client can visually monitor lip rounding/retracting).
  • Create tongue tension by placing a wet toothette on the back of the tongue and directing the client to close his mouth and push the toothette up with the tongue (squeezing out all the water). Direct the client to complete this task for several trials before sound practice. You can also increase tongue tension by having the client produce the /r/ while pushing against a table or wall, or by saying the /r/ while lifting the chair he is sitting in.
  • Be very picky with your target word choices for /r/-initial words. Choose words with velars in the final position to increase the use of /r/ (e.g., rake, rug, rock). Produce a short, quick /r/ (do not prolong), pause for a second, overemphasize the vowel after the /r/, then finish out the word.
  • Use tongue/jaw differentiation tasks to improve jaw stability, which in turn allows the client to achieve correct tongue placement. Use a mirror and have the client open and close the mouth slowly with no head movement or lateralizing of the jaw. The client could also increase jaw stability by opening and closing the mouth in increments for better control. Direct the client to open his mouth, leave it open with a stable jaw, and slowly move the tongue tip to the alveolar ridge and then behind the lower front teeth. Complete several trials for these tasks. You could also use a bite block (coffee stirrer) placed between the molars on one side to ensure jaw stability for the client until he can do this on his own.
  • Teach the bunched /r/ (high back), which involves humping up the back of the tongue as for a silent /k/, having the sides of the back of the tongue touch the insides of the upper back molars, and relaxing the jaw.
  • Clients with persistent /r/ problems often benefit from the introduction of the retroflex /r/. Teach the retroflex /r/ (curled), which involves placing the tongue tip behind the upper front teeth; curling the tongue tip backward without touching the roof of the mouth; having the lateral sides of the tongue touch the insides of the upper back molars; and slightly lowering the jaw. The retroflex /r/ can also be facilitated by producing an /l/ with a slightly lowered jaw and sliding the tongue tip back farther and farther until you hear an /r/ production. If the client has a short frenulum, then the retroflex /r/ will be difficult. Frequently, in time, the retroflex /r/ naturally changes to a high back /r/.
  • Shaping/sound modifications: using the phonemes /l/, /n/, /d/, /w/, /g/, "sh", "y", /i/, and /a/ to shape the /r/ sound.
  • Use coarticulation: if a client is successful with /r/ in the initial or final position of words, use this as a facilitation technique. For example, successful initial /r/ productions could help establish the final-position /r/ (e.g., bear-red, car-read) due to anticipatory behaviors for the upcoming initial /r/. You would gradually work the /r/-initial word during practice as the final-position /r/ emerges. You could also try the /kr-/ and /gr-/ blends for initial success.
  • Drill, drill, drill. Expect accuracy. Once the client is successful with the /r/ production, increase complexity by establishing the /r/ in other contexts and positions. Encourage the client to "feel" the difference with the /r/ productions.




phoneme, in linguistics, smallest unit of speech distinguishing one word (or word element) from another, as the element p in “tap,” which separates that word from “tab,” “tag,” and “tan.” A phoneme may have more than one variant, called an allophone (q.v.), which functions as a single sound; for example, the p’s of “pat,” “spat,” and “tap” differ slightly phonetically, but that difference, determined by context, has no significance in English. In some languages, where the variant sounds of p can change meaning, they are classified as separate phonemes—e.g., in Thai the aspirated p (pronounced with an accompanying puff of air) and unaspirated p are distinguished one from the other.

Phonemes are based on spoken language and may be recorded with special symbols, such as those of the International Phonetic Alphabet. In transcription, linguists conventionally place symbols for phonemes between slash marks: /p/. The term phoneme is usually restricted to vowels and consonants, but some linguists extend its application to cover phonologically relevant differences of pitch, stress, and rhythm. Nowadays the phoneme often has a less central place in phonological theory than it used to have, especially in American linguistics. Many linguists regard the phoneme as a set of simultaneous distinctive features rather than as an unanalyzable unit.
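
The phoneme-as-category idea can be pictured as a relabelling step. The following Python sketch is deliberately tiny and simplified (the phone-to-phoneme mappings are toy fragments, not full analyses of English or Thai): the same two phones collapse into one category in English but stay in separate categories in Thai, where the difference can change meaning.

```python
# Toy phone-to-phoneme mappings (simplified fragments, not full analyses).
english = {"p": "/p/", "pʰ": "/p/", "b": "/b/"}   # aspiration is not contrastive
thai    = {"p": "/p/", "pʰ": "/pʰ/", "b": "/b/"}  # aspiration is contrastive

def phonemicise(phones, mapping):
    """Relabel a sequence of phones with the phoneme categories of a given language."""
    return [mapping.get(ph, ph) for ph in phones]

# In English, [pʰ] (as in "pat") and [p] (as in "spat") fall into the same category...
print(phonemicise(["pʰ", "æ", "t"], english))      # ['/p/', 'æ', 't']
print(phonemicise(["s", "p", "æ", "t"], english))  # ['s', '/p/', 'æ', 't']
# ...while in Thai the two phones belong to different phonemes.
print(phonemicise(["pʰ", "a"], thai))              # ['/pʰ/', 'a']
print(phonemicise(["p", "a"], thai))               # ['/p/', 'a']
```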


Book contents

  • Frontmatter
  • 1 Speech sounds and their production
  • 2 Towards a sound system for English: consonant phonemes
  • 3 Some vowel systems of English
  • 4 Phonological features, part 1: the classification of English vowel phonemes
  • 5 Phonological features, part 2: the consonant system
  • 6 Syllables
  • 7 Word stress
  • 8 Phonetic representations: the realisations of phonemes
  • 9 Phrases, sentences and the phonology of connected speech
  • 10 Representations and derivations

1 - Speech sounds and their production

Published online by Cambridge University Press:  05 June 2012

Organs and processes

Most speech is produced by an air stream that originates in the lungs and is pushed upwards through the trachea (the windpipe) and the oral and nasal cavities. During its passage, the air stream is modified by the various organs of speech. Each such modification has different acoustic effects, which are used for the differentiation of sounds. The production of a speech sound may be divided into four separate but interrelated processes: the initiation of the air stream, normally in the lungs; its phonation in the larynx through the operation of the vocal folds; its direction by the velum into either the oral cavity or the nasal cavity (the oro-nasal process); and finally its articulation, mainly by the tongue, in the oral cavity. We shall deal with each of the four processes in turn. (See figure 1.1.)

The initiation process

The operation of the lungs is familiar through their primary function in the breathing process: contraction of the intercostal muscles and lowering of the diaphragm causes the chest volume to increase and air is sucked into the lungs through the trachea. When the process is reversed, air will escape – again through the trachea. Apart from recurring at regular intervals as breath, this air stream provides the source of energy for speech. In speech, the rate of the air flow is not constant; rather, the air stream pulsates as the result of variation in the activity of the chest muscles.


  • Speech sounds and their production
  • Heinz J. Giegerich , University of Edinburgh
  • Book: English Phonology
  • Online publication: 05 June 2012
  • Chapter DOI: https://doi.org/10.1017/CBO9781139166126.002


American Speech-Language-Hearing Association


Speech Sound Disorders: Articulation and Phonology


See the Speech Sound Disorders Evidence Map for summaries of the available research on this topic.

The scope of this page is speech sound disorders with no known cause—historically called articulation and phonological disorders —in preschool and school-age children (ages 3–21).

Information about speech sound problems related to motor/neurological disorders, structural abnormalities, and sensory/perceptual disorders (e.g., hearing loss) is not addressed on this page.

See ASHA's Practice Portal pages on Childhood Apraxia of Speech and Cleft Lip and Palate for information about speech sound problems associated with these two disorders. A Practice Portal page on dysarthria in children will be developed in the future.

Speech Sound Disorders

Speech sound disorders is an umbrella term referring to any difficulty or combination of difficulties with perception, motor production, or phonological representation of speech sounds and speech segments—including phonotactic rules governing permissible speech sound sequences in a language.

Speech sound disorders can be organic or functional in nature. Organic speech sound disorders result from an underlying motor/neurological, structural, or sensory/perceptual cause. Functional speech sound disorders are idiopathic—they have no known cause. See figure below.

[Figure: Speech Sound Disorders Umbrella]

Organic Speech Sound Disorders

Organic speech sound disorders include those resulting from motor/neurological disorders (e.g., childhood apraxia of speech and dysarthria), structural abnormalities (e.g., cleft lip/palate and other structural deficits or anomalies), and sensory/perceptual disorders (e.g., hearing loss).

Functional Speech Sound Disorders

Functional speech sound disorders include those related to the motor production of speech sounds and those related to the linguistic aspects of speech production. Historically, these disorders are referred to as articulation disorders and phonological disorders , respectively. Articulation disorders focus on errors (e.g., distortions and substitutions) in production of individual speech sounds. Phonological disorders focus on predictable, rule-based errors (e.g., fronting, stopping, and final consonant deletion) that affect more than one sound. It is often difficult to cleanly differentiate between articulation and phonological disorders; therefore, many researchers and clinicians prefer to use the broader term, "speech sound disorder," when referring to speech errors of unknown cause. See Bernthal, Bankson, and Flipsen (2017) and Peña-Brooks and Hegde (2015) for relevant discussions.

This Practice Portal page focuses on functional speech sound disorders. The broad term, "speech sound disorder(s)," is used throughout; articulation error types and phonological error patterns within this diagnostic category are described as needed for clarity.

Procedures and approaches detailed in this page may also be appropriate for assessing and treating organic speech sound disorders. See Speech Characteristics: Selected Populations [PDF] for a brief summary of selected populations and characteristic speech problems.

Incidence and Prevalence

The incidence of speech sound disorders refers to the number of new cases identified in a specified period. The prevalence of speech sound disorders refers to the number of children who are living with speech problems in a given time period.

Estimated prevalence rates of speech sound disorders vary greatly due to the inconsistent classifications of the disorders and the variance of ages studied. The following data reflect the variability:

  • Overall, 2.3% to 24.6% of school-aged children were estimated to have speech delay or speech sound disorders (Black, Vahratian, & Hoffman, 2015; Law, Boyle, Harris, Harkness, & Nye, 2000; Shriberg, Tomblin, & McSweeny, 1999; Wren, Miller, Peters, Emond, & Roulstone, 2016).
  • A 2012 survey from the National Center for Health Statistics estimated that, among children with a communication disorder, 48.1% of 3- to 10-year-old children and 24.4% of 11- to 17-year-old children had speech sound problems only. Parents reported that 67.6% of children with speech problems received speech intervention services (Black et al., 2015).
  • Residual or persistent speech errors were estimated to occur in 1% to 2% of older children and adults (Flipsen, 2015).
  • Reports estimated that speech sound disorders are more prevalent in boys than in girls, with a ratio ranging from 1.5:1.0 to 1.8:1.0 (Shriberg et al., 1999; Wren et al., 2016).
  • Prevalence rates were estimated to be 5.3% in African American children and 3.8% in White children (Shriberg et al., 1999).
  • Reports estimated that 11% to 40% of children with speech sound disorders had concomitant language impairment (Eadie et al., 2015; Shriberg et al., 1999).
  • Poor speech sound production skills in kindergarten children have been associated with lower literacy outcomes (Overby, Trainin, Smit, Bernthal, & Nelson, 2012). Estimates reported a greater likelihood of reading disorders (relative risk: 2.5) in children with a preschool history of speech sound disorders (Peterson, Pennington, Shriberg, & Boada, 2009).

Signs and Symptoms

Signs and symptoms of functional speech sound disorders include the following:

  • omissions/deletions—certain sounds are omitted or deleted (e.g., "cu" for "cup" and "poon" for "spoon")
  • substitutions—one or more sounds are substituted, which may result in loss of phonemic contrast (e.g., "thing" for "sing" and "wabbit" for "rabbit")
  • additions—one or more extra sounds are added or inserted into a word (e.g., "buhlack" for "black")
  • distortions—sounds are altered or changed (e.g., a lateral "s")
  • syllable-level errors—weak syllables are deleted (e.g., "tephone" for "telephone")
Signs and symptoms may occur as independent articulation errors or as phonological rule-based error patterns (see ASHA's resource on selected phonological processes [patterns] for examples). In addition to these common rule-based error patterns, idiosyncratic error patterns can also occur. For example, a child might substitute many sounds with a favorite or default sound, resulting in a considerable number of homonyms (e.g., shore, sore, chore, and tore might all be pronounced as door ; Grunwell, 1987; Williams, 2003a).
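
To make the error taxonomy above concrete, here is a deliberately crude Python sketch. The broad transcriptions are my own, and comparing string lengths is only a toy stand-in for the segment-by-segment alignment a clinician (or analysis software) would actually perform; distortions, in particular, cannot be detected this way because they involve phonetic detail rather than a different segment.

```python
def classify_error(target, production):
    """Toy heuristic: label an error type from two phone strings (not a clinical tool)."""
    if production == target:
        return "accurate"
    if len(production) < len(target):
        return "omission/deletion"
    if len(production) > len(target):
        return "addition"
    return "substitution"

# Illustrative broad transcriptions of the examples above (my own, for this sketch only).
examples = [
    ("kʌp", "kʌ"),       # "cup"    -> "cu"
    ("spun", "pun"),     # "spoon"  -> "poon"
    ("sɪŋ", "θɪŋ"),      # "sing"   -> "thing"
    ("ɹæbɪt", "wæbɪt"),  # "rabbit" -> "wabbit"
    ("blæk", "bəlæk"),   # "black"  -> "buhlack"
]

for target, production in examples:
    print(f"{target} -> {production}: {classify_error(target, production)}")
# kʌp -> kʌ: omission/deletion
# spun -> pun: omission/deletion
# sɪŋ -> θɪŋ: substitution
# ɹæbɪt -> wæbɪt: substitution
# blæk -> bəlæk: addition
```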

Influence of Accent

An accent is the unique way that speech is pronounced by a group of people speaking the same language and is a natural part of spoken language. Accents may be regional; for example, someone from New York may sound different from someone from South Carolina. Foreign accents occur when a set of phonetic traits of one language is carried over when a person learns a new language. The first language acquired by a bilingual or multilingual individual can influence the pronunciation of speech sounds and the acquisition of phonotactic rules in subsequently acquired languages. No accent is "better" than another. Accents, like dialects, are not speech or language disorders but, rather, only reflect differences. See ASHA's Practice Portal pages on Multilingual Service Delivery in Audiology and Speech-Language Pathology and Cultural Responsiveness.

Influence of Dialect

Not all sound substitutions and omissions are speech errors. Instead, they may be related to a feature of a speaker's dialect (a rule-governed language system that reflects the regional and social background of its speakers). Dialectal variations of a language may cross all linguistic parameters, including phonology, morphology, syntax, semantics, and pragmatics. An example of a dialectal variation in phonology occurs with speakers of African American English (AAE) when a "d" sound is used for a "th" sound (e.g., "dis" for "this"). This variation is not evidence of a speech sound disorder but, rather, one of the phonological features of AAE.

Speech-language pathologists (SLPs) must distinguish between dialectal differences and communicative disorders and must

  • recognize all dialects as being rule-governed linguistic systems;
  • understand the rules and linguistic features of dialects represented by their clientele; and
  • be familiar with nondiscriminatory testing and dynamic assessment procedures, such as identifying potential sources of test bias, administering and scoring standardized tests using alternative methods, and analyzing test results in light of existing information regarding dialect use (see, e.g., McLeod, Verdon, & The International Expert Panel on Multilingual Children's Speech, 2017).

See ASHA's Practice Portal pages on Multilingual Service Delivery in Audiology and Speech-Language Pathology and Cultural Responsiveness .

The cause of functional speech sound disorders is not known; however, some risk factors have been investigated.

Frequently reported risk factors include the following:

  • Gender —the incidence of speech sound disorders is higher in males than in females (e.g., Everhart, 1960; Morley, 1952; Shriberg et al., 1999).
  • Pre- and perinatal problems —factors such as maternal stress or infections during pregnancy, complications during delivery, preterm delivery, and low birthweight were found to be associated with delay in speech sound acquisition and with speech sound disorders (e.g., Byers Brown, Bendersky, & Chapman, 1986; Fox, Dodd, & Howard, 2002).
  • Family history —children who have family members (parents or siblings) with speech and/or language difficulties were more likely to have a speech disorder (e.g., Campbell et al., 2003; Felsenfeld, McGue, & Broen, 1995; Fox et al., 2002; Shriberg & Kwiatkowski, 1994).
  • Persistent otitis media with effusion —persistent otitis media with effusion (often associated with hearing loss) has been associated with impaired speech development (Fox et al., 2002; Silva, Chalmers, & Stewart, 1986; Teele, Klein, Chase, Menyuk, & Rosner, 1990).

Roles and Responsibilities

Speech-language pathologists (SLPs) play a central role in the screening, assessment, diagnosis, and treatment of persons with speech sound disorders. The professional roles and activities in speech-language pathology include clinical/educational services (diagnosis, assessment, planning, and treatment); prevention and advocacy; and education, administration, and research. See ASHA's Scope of Practice in Speech-Language Pathology (ASHA, 2016).

Appropriate roles for SLPs include the following:

  • Providing prevention information to individuals and groups known to be at risk for speech sound disorders, as well as to individuals working with those at risk
  • Educating other professionals on the needs of persons with speech sound disorders and the role of SLPs in diagnosing and managing speech sound disorders
  • Screening individuals who present with speech sound difficulties and determining the need for further assessment and/or referral for other services
  • Recognizing that students with speech sound disorders have heightened risks for later language and literacy problems
  • Conducting a culturally and linguistically relevant comprehensive assessment of speech, language, and communication
  • Taking into consideration the rules of a spoken accent or dialect, typical dual-language acquisition from birth, and sequential second-language acquisition to distinguish difference from disorder
  • Diagnosing the presence or absence of a speech sound disorder
  • Referring to and collaborating with other professionals to rule out other conditions, determine etiology, and facilitate access to comprehensive services
  • Making decisions about the management of speech sound disorders
  • Making decisions about eligibility for services, based on the presence of a speech sound disorder
  • Developing treatment plans, providing intervention and support services, documenting progress, and determining appropriate service delivery approaches and dismissal criteria
  • Counseling persons with speech sound disorders and their families/caregivers regarding communication-related issues and providing education aimed at preventing further complications related to speech sound disorders
  • Serving as an integral member of an interdisciplinary team working with individuals with speech sound disorders and their families/caregivers (see ASHA's resource on interprofessional education/interprofessional practice [IPE/IPP] )
  • Consulting and collaborating with professionals, family members, caregivers, and others to facilitate program development and to provide supervision, evaluation, and/or expert testimony (see ASHA's resource on person- and family-centered care )
  • Remaining informed of research in the area of speech sound disorders, helping advance the knowledge base related to the nature and treatment of these disorders, and using evidence-based research to guide intervention
  • Advocating for individuals with speech sound disorders and their families at the local, state, and national levels

As indicated in the Code of Ethics (ASHA, 2023), SLPs who serve this population should be specifically educated and appropriately trained to do so.

See the Assessment section of the Speech Sound Disorders Evidence Map for pertinent scientific evidence, expert opinion, and client/caregiver perspective.

Screening is conducted whenever a speech sound disorder is suspected or as part of a comprehensive speech and language evaluation for a child with communication concerns. The purpose of the screening is to identify individuals who require further speech-language assessment and/or referral for other professional services.

Screening typically includes

  • screening of individual speech sounds in single words and in connected speech (using formal and/or informal screening measures);
  • screening of oral motor functioning (e.g., strength and range of motion of oral musculature);
  • orofacial examination to assess facial symmetry and identify possible structural bases for speech sound disorders (e.g., submucous cleft palate, malocclusion, ankyloglossia); and
  • informal assessment of language comprehension and production.

See ASHA's resource on assessment tools, techniques, and data sources .

Screening may result in

  • recommendation to monitor speech and rescreen;
  • referral for multi-tiered systems of support such as response to intervention (RTI) ;
  • referral for a comprehensive speech sound assessment;
  • recommendation for a comprehensive language assessment, if language delay or disorder is suspected;
  • referral to an audiologist for a hearing evaluation, if hearing loss is suspected; and
  • referral for medical or other professional services, as appropriate.

Comprehensive Assessment

The acquisition of speech sounds is a developmental process, and children often demonstrate "typical" errors and phonological patterns during this acquisition period. Developmentally appropriate errors and patterns are taken into consideration during assessment for speech sound disorders in order to differentiate typical errors from those that are unusual or not age appropriate.

The comprehensive assessment protocol for speech sound disorders may include an evaluation of spoken and written language skills, if indicated. See ASHA's Practice Portal pages on Spoken Language Disorders and Written Language Disorders .

Assessment is accomplished using a variety of measures and activities, including both standardized and nonstandardized measures, as well as formal and informal assessment tools. See ASHA's resource on assessment tools, techniques, and data sources .

SLPs select assessments that are culturally and linguistically sensitive, taking into consideration current research and best practice in assessing speech sound disorders in the languages and/or dialect used by the individual (see, e.g., McLeod et al., 2017). Standard scores cannot be reported for assessments that are not normed on a group that is representative of the individual being assessed.

SLPs take into account cultural and linguistic speech differences across communities, including

  • phonemic and allophonic variations of the language(s) and/or dialect(s) used in the community and how those variations affect determination of a disorder or a difference and
  • differences among speech sound disorders, accents, dialects, and patterns of transfer from one language to another. See phonemic inventories and cultural and linguistic information across languages .

Consistent with the World Health Organization's (WHO) International Classification of Functioning, Disability and Health (ICF) framework (ASHA, 2016a; WHO, 2001), a comprehensive assessment is conducted to identify and describe

  • impairments in body structure and function, including underlying strengths and weaknesses in speech sound production and verbal/nonverbal communication;
  • co-morbid deficits or conditions, such as developmental disabilities, medical conditions, or syndromes;
  • limitations in activity and participation, including functional communication, interpersonal interactions with family and peers, and learning;
  • contextual (environmental and personal) factors that serve as barriers to or facilitators of successful communication and life participation; and
  • the impact of communication impairments on quality of life of the child and family.

See ASHA's Person-Centered Focus on Function: Speech Sound Disorder [PDF] for an example of assessment data consistent with ICF.

Assessment may result in

  • diagnosis of a speech sound disorder;
  • description of the characteristics and severity of the disorder;
  • recommendations for intervention targets;
  • identification of factors that might contribute to the speech sound disorder;
  • diagnosis of a spoken language (listening and speaking) disorder;
  • identification of written language (reading and writing) problems;
  • recommendation to monitor reading and writing progress in students with identified speech sound disorders by SLPs and other professionals in the school setting;
  • referral for multi-tiered systems of support such as response to intervention (RTI) to support speech and language development; and
  • referral to other professionals as needed.

Case History

The case history typically includes gathering information about

  • the family's concerns about the child's speech;
  • history of middle ear infections;
  • family history of speech and language difficulties (including reading and writing);
  • languages used in the home;
  • primary language spoken by the child;
  • the family's and other communication partners' perceptions of intelligibility; and
  • the teacher's perception of the child's intelligibility and participation in the school setting and how the child's speech compares with that of peers in the classroom.

See ASHA's Practice Portal page on Cultural Responsiveness for guidance on taking a case history with all clients.

Oral Mechanism Examination

The oral mechanism examination evaluates the structure and function of the speech mechanism to assess whether the system is adequate for speech production. This examination typically includes assessment of

  • dental occlusion and specific tooth deviations;
  • structure of hard and soft palate (clefts, fistulas, bifid uvula); and
  • function (strength and range of motion) of the lips, jaw, tongue, and velum.

Hearing Screening

A hearing screening is conducted during the comprehensive speech sound assessment, if one was not completed during the screening.

Hearing screening typically includes

  • otoscopic inspection of the ear canal and tympanic membrane;
  • pure-tone audiometry; and
  • immittance testing to assess middle ear function.

Speech Sound Assessment

The speech sound assessment uses both standardized assessment instruments and other sampling procedures to evaluate production in single words and connected speech.

Single-word testing provides identifiable units of production and allows most consonants in the language to be elicited in a number of phonetic contexts; however, it may or may not accurately reflect production of the same sounds in connected speech.

Connected speech sampling provides information about production of sounds in connected speech using a variety of talking tasks (e.g., storytelling or retelling, describing pictures, normal conversation about a topic of interest) and with a variety of communication partners (e.g., peers, siblings, parents, and clinician).

Assessment of speech includes evaluation of the following:

  • Accurate productions
    • sounds in various word positions (e.g., word-initial, within-word, and word-final positions) and in different phonetic contexts;
    • sound combinations such as vowel combinations, consonant clusters, and blends; and
    • syllable shapes—from simple CV to complex CCVCC.
  • Speech sound errors
    • consistent sound errors;
    • error types (e.g., deletions, omissions, substitutions, distortions, additions); and
    • error distribution (e.g., position of the sound within the word).
  • Error patterns (i.e., phonological patterns)—systematic sound changes or simplifications that affect a class of sounds (e.g., fricatives), sound combinations (e.g., consonant clusters), or syllable structures (e.g., complex syllables or multisyllabic words).

See Age of Acquisition of English Consonants (Crowe & McLeod, 2020) [PDF] and ASHA's resource on selected phonological processes (patterns).
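Although error-pattern analysis is ultimately a clinical judgment, the mechanical comparison of target and produced forms can be illustrated with a short sketch. The Python example below flags two of the patterns mentioned above, final consonant deletion and cluster reduction, in hypothetical simplified transcriptions; the symbol set, word forms, and function names are illustrative assumptions rather than a clinical analysis tool.

```python
# A minimal sketch (not a clinical tool): flag two common phonological patterns,
# final consonant deletion and cluster reduction, by comparing a target form with
# the child's production. Transcriptions are simplified one-symbol-per-phoneme
# strings; the vowel set, example words, and function names are assumptions.
VOWELS = set("aeiouæɑɔəɛɪʊʌ")


def is_consonant(symbol: str) -> bool:
    return symbol not in VOWELS


def final_consonant_deleted(target: str, production: str) -> bool:
    """True if the target ends in a consonant but the production does not."""
    if not target or not is_consonant(target[-1]):
        return False
    return not production or not is_consonant(production[-1])


def initial_cluster_length(word: str) -> int:
    """Number of consonants before the first vowel."""
    count = 0
    for symbol in word:
        if not is_consonant(symbol):
            break
        count += 1
    return count


def cluster_reduced(target: str, production: str) -> bool:
    """True if a word-initial consonant cluster is produced with fewer consonants."""
    return (initial_cluster_length(target) >= 2
            and initial_cluster_length(production) < initial_cluster_length(target))


# Hypothetical target/production pairs: "dog" produced without the final /g/,
# and "stop" produced with the initial cluster reduced to /t/.
for target, production in [("dɔg", "dɔ"), ("stɑp", "tɑp")]:
    patterns = []
    if final_consonant_deleted(target, production):
        patterns.append("final consonant deletion")
    if cluster_reduced(target, production):
        patterns.append("cluster reduction")
    print(target, "->", production, ":", ", ".join(patterns) or "none flagged")
```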

Severity Assessment

Severity is a qualitative judgment made by the clinician indicating the impact of the child's speech sound disorder on functional communication. It is typically defined along a continuum from mild to severe or profound. There is no clear consensus regarding the best way to determine severity of a speech sound disorder—rating scales and quantitative measures have been used.

A numerical scale or continuum of disability is often used because it is time-efficient. Prezas and Hodson (2010) use a continuum of severity from mild (omissions are rare; few substitutions) to profound (extensive omissions and many substitutions; extremely limited phonemic and phonotactic repertoires). Distortions and assimilations occur in varying degrees at all levels of the continuum.

A quantitative approach (Shriberg & Kwiatkowski, 1982a, 1982b) uses the percentage of consonants correct (PCC) to determine severity on a continuum from mild to severe.

To determine PCC, collect and phonetically transcribe a speech sample. Then count the total number of consonants in the sample and the total number of correct consonants. Use the following formula:

PCC = (correct consonants/total consonants) × 100

A PCC of 85–100 is considered mild, whereas a PCC of less than 50 is considered severe. This approach has been modified to include a total of 10 such indices, including percent vowels correct (PVC; Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997).
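As a worked example, the PCC calculation and the severity bands can be expressed in a few lines of code. In the Python sketch below, the 85–100 (mild) and below-50 (severe) cutoffs come from the text above; the intermediate band labels and the sample counts are illustrative assumptions.

```python
def percentage_consonants_correct(correct: int, total: int) -> float:
    """PCC = (correct consonants / total consonants) x 100."""
    if total <= 0:
        raise ValueError("The sample must contain at least one consonant.")
    return correct / total * 100


def severity_band(pcc: float) -> str:
    """Map a PCC value onto the mild-to-severe continuum.

    Only the mild (85-100) and severe (<50) cutoffs are given in the text;
    the intermediate labels below are assumed for illustration.
    """
    if pcc >= 85:
        return "mild"
    if pcc >= 65:
        return "mild-moderate (assumed band)"
    if pcc >= 50:
        return "moderate-severe (assumed band)"
    return "severe"


# Hypothetical sample: 73 of 112 consonants transcribed as correct.
pcc = percentage_consonants_correct(correct=73, total=112)
print(f"PCC = {pcc:.1f}, severity: {severity_band(pcc)}")  # PCC = 65.2
```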

Intelligibility Assessment

Intelligibility is a perceptual judgment that is based on how much of the child's spontaneous speech the listener understands. Intelligibility can vary along a continuum ranging from intelligible (message is completely understood) to unintelligible (message is not understood; Bernthal et al., 2017). Intelligibility is frequently used when judging the severity of the child's speech problem (Kent, Miolo, & Bloedel, 1994; Shriberg & Kwiatkowski, 1982b) and can be used to determine the need for intervention.

Intelligibility can vary depending on a number of factors, including

  • the number, type, and frequency of speech sound errors (when present);
  • the speaker's rate, inflection, stress patterns, pauses, voice quality, loudness, and fluency;
  • linguistic factors (e.g., word choice and grammar);
  • complexity of utterance (e.g., single words vs. conversational or connected speech);
  • the listener's familiarity with the speaker's speech pattern;
  • communication environment (e.g., familiar vs. unfamiliar communication partners, one-on-one vs. group conversation);
  • communication cues for listener (e.g., nonverbal cues from the speaker, including gestures and facial expressions); and
  • signal-to-noise ratio (i.e., amount of background noise).

Rating scales and other estimates that are based on perceptual judgments are commonly used to assess intelligibility. For example, rating scales sometimes use numerical ratings (e.g., 1 for totally intelligible and 10 for unintelligible), or they use descriptors such as not at all, seldom, sometimes, most of the time, or always to indicate how well speech is understood (Ertmer, 2010).

A number of quantitative measures also have been proposed, including calculating the percentage of words understood in conversational speech (e.g., Flipsen, 2006; Shriberg & Kwiatkowski, 1980). See also Kent et al. (1994) for a comprehensive review of procedures for assessing intelligibility.

Coplan and Gleason (1988) developed a standardized intelligibility screener using parent estimates of how intelligible their child sounded to others. On the basis of the data, expected intelligibility cutoff values for typically developing children were as follows:

  • 22 months—50%
  • 37 months—75%
  • 47 months—100%
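Applying these cutoffs to a parent's estimate is a simple comparison. The Python sketch below restates the age bands listed above and checks a hypothetical screening result; the function name and example values are illustrative assumptions.

```python
# Coplan and Gleason (1988) expected-intelligibility cutoffs listed above:
# by 22 months, 50%; by 37 months, 75%; by 47 months, 100%.
CUTOFFS = [(22, 50), (37, 75), (47, 100)]


def meets_expected_intelligibility(age_months: int, estimated_percent: float) -> bool:
    """Return True if the parent-estimated intelligibility meets every cutoff
    the child's age has reached; below 22 months there is no cutoff to fail."""
    reached = [percent for months, percent in CUTOFFS if age_months >= months]
    return not reached or estimated_percent >= max(reached)


# Hypothetical case: a 40-month-old reported to be understood about 60% of the time.
print(meets_expected_intelligibility(40, 60))  # False: below the 75% cutoff at 37+ months
```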

See the Resources section for resources related to assessing intelligibility and life participation in monolingual children who speak English and in monolingual children who speak languages other than English.

Stimulability Testing

Stimulability is the child's ability to accurately imitate a misarticulated sound when the clinician provides a model. There are few standardized procedures for testing stimulability (Glaspey & Stoel-Gammon, 2007; Powell & Miccio, 1996), although some test batteries include stimulability subtests.

Stimulability testing helps determine

  • how well the child imitates the sound in one or more contexts (e.g., isolation, syllable, word, phrase);
  • the level of cueing necessary to achieve the best production (e.g., auditory model; auditory and visual model; auditory, visual, and verbal model; tactile cues);
  • whether the sound is likely to be acquired without intervention; and
  • which targets are appropriate for therapy (Tyler & Tolbert, 2002).

Speech Perception Testing

Speech perception is the ability to perceive differences between speech sounds. In children with speech sound disorders, speech perception is the child's ability to perceive the difference between the standard production of a sound and his or her own error production—or to perceive the contrast between two phonetically similar sounds (e.g., r/w, s/ʃ, f/θ).

Speech perception abilities can be tested using the following paradigms:

  • Auditory Discrimination —syllable pairs containing a single phoneme contrast are presented, and the child is instructed to say "same" if the paired items sound the same and "different" if they sound different.
  • Picture Identification —the child is shown two to four pictures representing words with minimal phonetic differences. The clinician says one of these words, and the child is asked to point to the correct picture.
  • Speech production–perception task —using sounds that the child is suspected of having difficulty perceiving, picture targets containing these sounds are used as visual cues. The child is asked to judge whether the speaker says the item correctly (e.g., picture of a ship is shown; speaker says, "ship" or "sip"; Locke, 1980).
  • Mispronunciation detection task —using computer-presented picture stimuli and recorded stimulus names (either correct or with a single phoneme error), the child is asked to detect mispronunciations by pointing to a green tick for "correct" or a red cross for "incorrect" (McNeill & Hesketh, 2010).
  • Lexical decision/judgment task —using target pictures and single-word recordings, this task assesses the child's ability to identify words that are pronounced correctly or incorrectly. A picture of the target word (e.g., "lake") is shown, along with a recorded word—either "lake" or a word with a contrasting phoneme (e.g., "wake"). The child points to the picture of the target word if it was pronounced correctly or to an "X" if it was pronounced incorrectly (Rvachew, Nowak, & Cloutier, 2004).

Considerations for Assessing Young Children and/or Children Who Are Reluctant or Have Less Intelligible Speech

Young children might not be able to follow directions for standardized tests, might have limited expressive vocabulary, and might produce words that are unintelligible. Other children, regardless of age, may produce less intelligible speech or be reluctant to speak in an assessment setting.

Strategies for collecting an adequate speech sample with these populations include

  • obtaining a speech sample during the assessment session using play activities;
  • using pictures or toys to elicit a range of consonant sounds;
  • involving parents/caregivers in the session to encourage talking;
  • asking parents/caregivers to supplement data from the assessment session by recording the child's speech at home during spontaneous conversation; and
  • asking parents/caregivers to keep a log of the child's intended words and how these words are pronounced.

Sometimes, the speech sound disorder is so severe that the child's intended message cannot be understood. However, even when a child's speech is unintelligible, it is usually possible to obtain information about his or her speech sound production.

For example:

  • A single-word articulation test provides opportunities for production of identifiable units of sound, and these productions can usually be transcribed.
  • It may be possible to understand and transcribe a spontaneous speech sample by (a) using a structured situation to provide context when obtaining the sample and (b) annotating the recorded sample by repeating the child's utterances, when possible, to facilitate later transcription.

Considerations for Assessing Bilingual/Multilingual Populations

Assessment of a bilingual individual requires an understanding of both linguistic systems because the sound system of one language can influence the sound system of another language. The assessment process must identify whether differences are truly related to a speech sound disorder or are normal variations of speech caused by the first language.

When assessing a bilingual or multilingual individual, clinicians typically

  • gather information, including
    • language history and language use to determine which language(s) should be assessed,
    • the phonemic inventory, phonological structure, and syllable structure of the non-English language, and
    • the dialect of the individual;
  • assess phonological skills in both languages in single words as well as in connected speech;
  • account for dialectal differences, when present; and
  • identify and assess the child's
    • common substitution patterns (those seen in typically developing children),
    • uncommon substitution patterns (those often seen in individuals with a speech sound disorder), and
    • cross-linguistic effects, in which the phonological system of the native language (L1) influences the production of sounds in the second language (L2), resulting in an accent, that is, phonetic traits carried over from the first language to the second (Fabiano-Smith & Goldstein, 2010).

See phonemic inventories and cultural and linguistic information across languages and ASHA's Practice Portal page on Multilingual Service Delivery in Audiology and Speech-Language Pathology . See the Resources section for information related to assessing intelligibility and life participation in monolingual children who speak English and in monolingual children who speak languages other than English.

Phonological Processing Assessment

Phonological processing is the use of the sounds of one's language (i.e., phonemes) to process spoken and written language (Wagner & Torgesen, 1987). The broad category of phonological processing includes phonological awareness , phonological working memory , and phonological retrieval .

All three components of phonological processing (see definitions below) are important for speech production and for the development of spoken and written language skills. Therefore, it is important to assess phonological processing skills and to monitor the spoken and written language development of children with phonological processing difficulties.

  • Phonological Awareness is the awareness of the sound structure of a language and the ability to consciously analyze and manipulate this structure via a range of tasks, such as speech sound segmentation and blending at the word, onset-rime, syllable, and phonemic levels.
  • Phonological Working Memory involves storing phoneme information in a temporary, short-term memory store (Wagner & Torgesen, 1987). This phonemic information is then readily available for manipulation during phonological awareness tasks. Nonword repetition (e.g., repeat "/pæɡ/") is one example of a phonological working memory task.
  • Phonological Retrieval is the ability to retrieve phonological information from long-term memory. It is typically assessed using rapid naming tasks (e.g., rapid naming of objects, colors, letters, or numbers). This ability to retrieve the phonological information of one's language is integral to phonological awareness.

Language Assessments

Language testing is included in a comprehensive speech sound assessment because of the high incidence of co-occurring language problems in children with speech sound disorders (Shriberg & Austin, 1998).

Spoken Language Assessment (Listening and Speaking)

Typically, the assessment of spoken language begins with a screening of expressive and receptive skills; a full battery is performed if indicated by screening results. See ASHA's Practice Portal page on Spoken Language Disorders for more details.

Written Language Assessment (Reading and Writing)

Difficulties with the speech processing system (e.g., listening, discriminating speech sounds, remembering speech sounds, producing speech sounds) can lead to speech production and phonological awareness difficulties. These difficulties can have a negative impact on the development of reading and writing skills (Anthony et al., 2011; Catts, McIlraith, Bridges, & Nielsen, 2017; Leitão & Fletcher, 2004; Lewis et al., 2011).

For typically developing children, speech production and phonological awareness develop in a mutually supportive way (Carroll, Snowling, Stevenson, & Hulme, 2003; National Institute for Literacy, 2009). As children playfully engage in sound play, they eventually learn to segment words into separate sounds and to "map" sounds onto printed letters.

The understanding that sounds are represented by symbolic code (e.g., letters and letter combinations) is essential for reading and spelling. When reading, children have to be able to segment a written word into individual sounds, based on their knowledge of the code and then blend those sounds together to form a word. When spelling, children have to be able to segment a spoken word into individual sounds and then choose the correct code to represent these sounds (National Institute of Child Health and Human Development, 2000; Pascoe, Stackhouse, & Wells, 2006).

Components of the written language assessment include the following, depending on the child's age and expected stage of written language development:

  • Print Awareness —recognizing that books have a front and back, recognizing that the direction of words is from left to right, and recognizing where words on the page start and stop.
  • Alphabet Knowledge —including naming/printing alphabet letters from A to Z.
  • Sound–Symbol Correspondence —knowing that letters have sounds and knowing the sounds for corresponding letters and letter combinations.
  • Reading Decoding —using sound–symbol knowledge to segment and blend sounds in grade-level words.
  • Spelling —using sound–symbol knowledge to spell grade-level words.
  • Reading Fluency —reading smoothly without frequent or significant pausing.
  • Reading Comprehension —understanding grade-level text, including the ability to make inferences.

See ASHA's Practice Portal page on Written Language Disorders for more details.

Treatment

See the Treatment section of the Speech Sound Disorders Evidence Map for pertinent scientific evidence, expert opinion, and client/caregiver perspective.

The broad term "speech sound disorder(s)" is used in this Portal page to refer to functional speech sound disorders, including those related to the motor production of speech sounds (articulation) and those related to the linguistic aspects of speech production (phonological).

It is often difficult to cleanly differentiate between articulation and phonological errors or to differentially diagnose these two separate disorders. Nevertheless, we often talk about articulation error types and phonological error types within the broad diagnostic category of speech sound disorder(s). A single child might show both error types, and those specific errors might need different treatment approaches.

Historically, treatments that focus on motor production of speech sounds are called articulation approaches; treatments that focus on the linguistic aspects of speech production are called phonological/language-based approaches.

Articulation approaches target each sound deviation and are often selected by the clinician when the child's errors are assumed to be motor based; the aim is correct production of the target sound(s).

Phonological/language-based approaches target a group of sounds with similar error patterns, although the actual treatment of exemplars of the error pattern may target individual sounds. Phonological approaches are often selected in an effort to help the child internalize phonological rules and generalize these rules to other sounds within the pattern (e.g., final consonant deletion, cluster reduction).

Articulation and phonological/language-based approaches might both be used in therapy with the same individual at different times or for different reasons.

Both approaches for the treatment of speech sound disorders typically involve the following sequence of steps:

  • Establishment —eliciting target sounds and stabilizing production on a voluntary level.
  • Generalization —facilitating carry-over of sound productions at increasingly challenging levels (e.g., syllables, words, phrases/sentences, conversational speaking).
  • Maintenance —stabilizing target sound production and making it more automatic; encouraging self-monitoring of speech and self-correction of errors.

Target Selection

Approaches for selecting initial therapy targets for children with articulation and/or phonological disorders include the following:

  • Developmental —target sounds are selected on the basis of order of acquisition in typically developing children.
  • Complexity —focuses on more complex, linguistically marked phonological elements not in the child's phonological system to encourage cascading, generalized learning of sounds (Gierut, 2007; Storkel, 2018).
  • Dynamic systems —focuses on teaching and stabilizing simple target phonemes that do not introduce new feature contrasts in the child's phonological system to assist in the acquisition of target sounds and more complex targets and features (Rvachew & Bernhardt, 2010).
  • Systemic —focuses on the function of the sound in the child's phonological organization to achieve maximum phonological reorganization with the least amount of intervention. Target selection is based on a distance metric: targets are chosen to be maximally distinct from the child's error in terms of place, manner, and voicing (Williams, 2003b). See Place, Manner and Voicing Chart for English Consonants (Roth & Worthington, 2018).
  • Client-specific —selects targets based on factors such as relevance to the child and his or her family (e.g., sound is in child's name), stimulability, and/or visibility when produced (e.g., /f/ vs. /k/).
  • Degree of deviance and impact on intelligibility —selects targets on the basis of errors (e.g., errors of omission; error patterns such as initial consonant deletion) that most affect intelligibility.

See ASHA's Person-Centered Focus on Function: Speech Sound Disorder [PDF] for an example of goal setting consistent with ICF.

Treatment Strategies

In addition to selecting appropriate targets for therapy, SLPs select treatment strategies based on the number of intervention goals to be addressed in each session and the manner in which these goals are implemented. A particular strategy may not be appropriate for all children, and strategies may change throughout the course of intervention as the child's needs change.

"Target attack" strategies include the following:

  • Vertical —intense practice on one or two targets until the child reaches a specific criterion level (usually conversational level) before proceeding to the next target or targets (see, e.g., Fey, 1986).
  • Horizontal —less intense practice on a few targets; multiple targets are addressed individually or interactively in the same session, thus providing exposure to more aspects of the sound system (see, e.g., Fey, 1986).
  • Cyclical —incorporating elements of both horizontal and vertical structures; the child is provided with practice on a given target or targets for some predetermined period of time before moving on to another target or targets for a predetermined period of time. Practice then cycles through all targets again (see, e.g., Hodson, 2010).

Treatment Options

The following are brief descriptions of both general and specific treatments for children with speech sound disorders. These approaches can be used to treat speech sound problems in a variety of populations. See Speech Characteristics: Selected Populations [PDF] for a brief summary of selected populations and characteristic speech problems.

Treatment selection will depend on a number of factors, including the child's age, the type of speech sound errors, the severity of the disorder, and the degree to which the disorder affects overall intelligibility (Williams, McLeod, & McCauley, 2010). This list is not exhaustive, and inclusion does not imply an endorsement from ASHA.

Contextual Utilization Approaches

Contextual utilization approaches recognize that speech sounds are produced in syllable-based contexts in connected speech and that some (phonemic/phonetic) contexts can facilitate correct production of a particular sound.

Contextual utilization approaches may be helpful for children who use a sound inconsistently and need a method to facilitate consistent production of that sound in other contexts. Instruction for a particular sound is initiated in the syllable context(s) where the sound can be produced correctly (McDonald, 1974). The syllable is used as the building block for practice at more complex levels.

For example, production of a "t" may be facilitated in the context of a high front vowel, as in "tea" (Bernthal et al., 2017). Facilitative contexts or "likely best bets" for production can be identified for voiced, velar, alveolar, and nasal consonants. For example, a "best bet" for nasal consonants is before a low vowel, as in "mad" (Bleile, 2002).

Phonological Contrast Approaches

Phonological contrast approaches are frequently used to address phonological error patterns. They focus on improving phonemic contrasts in the child's speech by emphasizing sound contrasts necessary to differentiate one word from another. Contrast approaches use contrasting word pairs as targets instead of individual sounds.

There are four different contrastive approaches: minimal oppositions, maximal oppositions, treatment of the empty set, and multiple oppositions. A brief sketch of the minimal-pair criterion follows the list below.

  • Minimal Oppositions (also known as "minimal pairs" therapy)—uses pairs of words that differ by only one phoneme or single feature signaling a change in meaning. Minimal pairs are used to help establish contrasts not present in the child's phonological system (e.g., "door" vs. "sore," "pot" vs. "spot," "key" vs. "tea"; Blache, Parsons, & Humphreys, 1981; Weiner, 1981).
  • Maximal Oppositions —uses pairs of words containing a contrastive sound that is maximally distinct and varies on multiple dimensions (e.g., voice, place, and manner) to teach an unknown sound. For example, "mall" and "call" are maximal pairs because /m/ and /k/ vary on more than one dimension—/m/ is a bilabial voiced nasal, whereas /k/ is a velar voiceless stop (Gierut, 1989, 1990, 1992). See Place, Manner and Voicing Chart for English Consonants (Roth & Worthington, 2018) .
  • Treatment of the Empty Set —similar to the maximal oppositions approach but uses pairs of words containing two maximally opposing sounds (e.g., /r/ and /d/) that are unknown to the child (e.g., "row" vs. "doe" or "ray" vs. "day"; Gierut, 1992).
  • Multiple Oppositions —a variation of the minimal oppositions approach but uses pairs of words contrasting a child's error sound with three or four strategically selected sounds that reflect both maximal classification and maximal distinction (e.g., "door," "four," "chore," and "store," to reduce backing of /d/ to /g/; Williams, 2000a, 2000b).
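To make the minimal oppositions criterion concrete, the Python sketch below searches a small, hypothetical transcribed word list for pairs that differ in exactly one phoneme by substitution. The transcriptions and word list are illustrative assumptions; note that clinically useful minimal pairs can also differ by the addition or deletion of a phoneme (e.g., "pot" vs. "spot"), which this simple equal-length check does not cover.

```python
# Illustrative only: find candidate minimal pairs (one-phoneme substitutions)
# in a small hypothetical word list keyed by simplified phonemic transcriptions.
from itertools import combinations

WORDS = {
    "door": "dɔr",
    "sore": "sɔr",
    "key":  "ki",
    "tea":  "ti",
    "pot":  "pɑt",
}


def differs_by_one_phoneme(a: str, b: str) -> bool:
    """One-phoneme substitution; does not handle addition/deletion pairs."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


pairs = [(w1, w2)
         for (w1, t1), (w2, t2) in combinations(WORDS.items(), 2)
         if differs_by_one_phoneme(t1, t2)]
print(pairs)  # [('door', 'sore'), ('key', 'tea')]
```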

Complexity Approach

The complexity approach is a speech production approach based on data supporting the view that the use of more complex linguistic stimuli helps promote generalization to untreated but related targets.

The complexity approach grew primarily from the maximal oppositions approach. However, it differs from the maximal oppositions approach in a number of ways. Rather than selecting targets on the basis of features such as voice, place, and manner, the complexity of targets is determined in other ways. These include hierarchies of complexity (e.g., clusters, fricatives, and affricates are more complex than other sound classes) and stimulability (i.e., sounds with the lowest levels of stimulability are most complex). In addition, although the maximal oppositions approach trains targets in contrasting word pairs, the complexity approach does not. See Baker and Williams (2010) and Peña-Brooks and Hegde (2015) for detailed descriptions of the complexity approach.

Core Vocabulary Approach

A core vocabulary approach focuses on whole-word production and is used for children with inconsistent speech sound production who may be resistant to more traditional therapy approaches.

Words selected for practice are those used frequently in the child's functional communication. A list of frequently used words is developed (e.g., based on observation, parent report, and/or teacher report), and a number of words from this list are selected each week for treatment. The child is taught his or her "best" word production, and the words are practiced until consistently produced (Dodd, Holm, Crosbie, & McIntosh, 2006).

Cycles Approach

The cycles approach targets phonological pattern errors and is designed for children with highly unintelligible speech who have extensive omissions, some substitutions, and a restricted use of consonants.

Treatment is scheduled in cycles ranging from 5 to 16 weeks. During each cycle, one or more phonological patterns are targeted. After each cycle has been completed, another cycle begins, targeting one or more different phonological patterns. Recycling of phonological patterns continues until the targeted patterns are present in the child's spontaneous speech (Hodson, 2010; Prezas & Hodson, 2010).

The goal is to approximate the gradual process of typical phonological development. There is no predetermined level of mastery of phonemes or phoneme patterns within each cycle; cycles are used to stimulate the emergence of a specific sound or pattern—not to produce mastery of it.

Distinctive Feature Therapy

Distinctive feature therapy focuses on elements of phonemes that are lacking in a child's repertoire (e.g., frication, nasality, voicing, and place of articulation) and is typically used for children who primarily substitute one sound for another. See Place, Manner and Voicing Chart for English Consonants (Roth & Worthington, 2018) .

Distinctive feature therapy uses targets (e.g., minimal pairs) that compare the phonetic elements/features of the target sound with those of its substitution or some other sound contrast. Patterns of features can be identified and targeted; producing one target sound often generalizes to other sounds that share the targeted feature (Blache & Parsons, 1980; Blache et al., 1981; Elbert & McReynolds, 1978; McReynolds & Bennett, 1972; Ruder & Bunce, 1981).

Metaphon Therapy

Metaphon therapy is designed to teach metaphonological awareness —that is, the awareness of the phonological structure of language. This approach assumes that children with phonological disorders have failed to acquire the rules of the phonological system.

The focus is on sound properties that need to be contrasted. For example, for problems with voicing, the concept of "noisy" (voiced) versus "quiet" (voiceless) is taught. Targets typically include processes that affect intelligibility, can be imitated, or are not seen in typically developing children of the same age (Dean, Howell, Waters, & Reid, 1995; Howell & Dean, 1994).

Naturalistic Speech Intelligibility Intervention

Naturalistic speech intelligibility intervention addresses the targeted sound in naturalistic activities that provide the child with frequent opportunities for the sound to occur. For example, using a McDonald's menu, signs at the grocery store, or favorite books, the child can be asked questions about words that contain the targeted sound(s). The child's error productions are recast without the use of imitative prompts or direct motor training. This approach is used with children who are able to use the recasts effectively (Camarata, 2010).

Nonspeech Oral–Motor Therapy

Nonspeech oral–motor therapy involves the use of oral-motor training prior to teaching sounds or as a supplement to speech sound instruction. The rationale behind this approach is that (a) immature or deficient oral-motor control or strength may be causing poor articulation and (b) it is necessary to teach control of the articulators before working on correct production of sounds. Consult systematic reviews of this treatment to help guide clinical decision making (see, e.g., Lee & Gibbon, 2015 [PDF]; McCauley, Strand, Lof, Schooling, & Frymark, 2009 ). See also the Treatment section of the Speech Sound Disorders Evidence Map filtered for Oral–Motor Exercises .

Speech Sound Perception Training

Speech sound perception training is used to help a child acquire a stable perceptual representation for the target phoneme or phonological structure. The goal is to ensure that the child is attending to the appropriate acoustic cues and weighting them according to a language-specific strategy (i.e., one that ensures reliable perception of the target in a variety of listening contexts).

Recommended procedures include (a) auditory bombardment in which many and varied target exemplars are presented to the child, sometimes in a meaningful context such as a story and often with amplification, and (b) identification tasks in which the child identifies correct and incorrect versions of the target (e.g., "rat" is a correct exemplar of the word corresponding to a rodent, whereas "wat" is not).

Tasks typically progress from the child judging speech produced by others to the child judging the accuracy of his or her own speech. Speech sound perception training is often used before and/or in conjunction with speech production training approaches. See Rvachew, 1994; Rvachew et al., 2004; Rvachew, Rafaat, & Martin, 1999; Wolfe, Presley, & Mesaris, 2003.

Traditionally, the speech stimuli used in these tasks are presented via live voice by the SLP. More recently, computer technology has been used—an advantage of this approach is that it allows for the presentation of more varied stimuli representing, for example, multiple voices and a range of error types.

Treatment Techniques and Technologies

Techniques used in therapy to increase awareness of the target sound and/or provide feedback about placement and movement of the articulators include the following:

  • Using a mirror for visual feedback of place and movement of articulators
  • Using gestural cueing for place or manner of production (e.g., using a long, sweeping hand gesture for fricatives vs. a short, "chopping" gesture for stops)
  • Using ultrasound imaging (placement of an ultrasound transducer under the chin) as a biofeedback technique to visualize tongue position and configuration (Adler-Bock, Bernhardt, Gick, & Bacsfalvi, 2007; Lee, Wrench, & Sancibrian, 2015; Preston, Brick, & Landi, 2013; Preston et al., 2014)
  • Using palatography (various coloring agents or a palatal device with electrodes) to record and visualize contact of the tongue on the palate while the child makes different speech sounds (Dagenais, 1995; Gibbon, Stewart, Hardcastle, & Crampin, 1999; Hitchcock, McAllister Byun, Swartz, & Lazarus, 2017)
  • Amplifying target sounds to improve attention, reduce distractibility, and increase sound awareness and discrimination—for example, auditory bombardment with low-level amplification is used with the cycles approach at the beginning and end of each session to help children perceive differences between errors and target sounds (Hodson, 2010)
  • Providing spectral biofeedback through a visual representation of the acoustic signal of speech (McAllister Byun & Hitchcock, 2012)
  • Providing tactile biofeedback using tools, devices, or substances placed within the mouth (e.g., tongue depressors, peanut butter) to provide feedback on correct tongue placement and coordination (Altshuler, 1961; Leonti, Blakeley, & Louis, 1975; Shriberg, 1980)

Considerations for Treating Bilingual/Multilingual Populations

When treating a bilingual or multilingual individual with a speech sound disorder, the clinician is working with two or more different sound systems. Although there may be some overlap in the phonemic inventories of each language, there will be some sounds unique to each language and different phonemic rules for each language.

One linguistic sound system may influence production of the other sound system. It is the role of the SLP to determine whether any observed differences are due to a true communication disorder or whether these differences represent variations of speech associated with another language that a child speaks.

Strategies used when designing a treatment protocol include

  • determining whether to use a bilingual or cross-linguistic approach (see ASHA's Practice Portal page on Multilingual Service Delivery in Audiology and Speech-Language Pathology );
  • determining the language in which to provide services, on the basis of factors such as language history, language use, and communicative needs;
  • identifying alternative means of providing accurate models for target phonemes that are unique to the child's language, when the clinician is unable to do so; and
  • noting if success generalizes across languages throughout the treatment process (Goldstein & Fabiano, 2007).

Considerations for Treatment in Schools

Criteria for determining eligibility for services in a school setting are detailed in the Individuals with Disabilities Education Improvement Act of 2004 (IDEA). In accordance with these criteria, the SLP needs to determine

  • if the child has a speech sound disorder;
  • if there is an adverse effect on educational performance resulting from the disability; and
  • if specially designed instruction and/or related services and supports are needed to help the student make progress in the general education curriculum.

Examples of the adverse effect on educational performance include the following:

  • The speech sound disorder affects the child's ability or willingness to communicate in the classroom (e.g., when responding to teachers' questions; during classroom discussions or oral presentations) and in social settings with peers (e.g., interactions during lunch, recess, physical education, and extracurricular activities).
  • The speech sound disorder signals problems with phonological skills that affect spelling, reading, and writing. For example, the way a child spells a word reflects the errors made when the word is spoken. See ASHA's resource Language In Brief and ASHA's Practice Portal pages on Spoken Language Disorders and Written Language Disorders for more information about the relationship between spoken and written language.

Eligibility for speech-language pathology services is documented in the child's individualized education program, and the child's goals and the dismissal process are explained to parents and teachers. For more information about eligibility for services in the schools, see ASHA's resources on eligibility and dismissal in schools , IDEA Part B Issue Brief: Individualized Education Programs and Eligibility for Services , and 2011 IDEA Part C Final Regulations .

If a child is not eligible for services under IDEA, they may still be eligible to receive services under Section 504 of the Rehabilitation Act of 1973, 29 U.S.C. § 701 (1973). See ASHA's Practice Portal page on Documentation in Schools for more information about Section 504 of the Rehabilitation Act of 1973.

Dismissal from speech-language pathology services occurs once eligibility criteria are no longer met—that is, when the child's communication problem no longer adversely affects academic achievement and functional performance.

Children With Persisting Speech Difficulties

Speech difficulties sometimes persist throughout the school years and into adulthood. Pascoe et al. (2006) define persisting speech difficulties as "difficulties in the normal development of speech that do not resolve as the child matures or even after they receive specific help for these problems" (p. 2). The population of children with persistent speech difficulties is heterogeneous, varying in etiology, severity, and nature of speech difficulties (Dodd, 2005; Shriberg et al., 2010; Stackhouse, 2006; Wren, Roulstone, & Miller, 2012).

A child with persisting speech difficulties (functional speech sound disorders) may be at risk for

  • difficulty communicating effectively when speaking;
  • difficulty acquiring reading and writing skills; and
  • psychosocial problems (e.g., low self-esteem, increased risk of bullying; see, e.g., McCormack, McAllister, McLeod, & Harrison, 2012).

Intervention approaches vary and may depend on the child's area(s) of difficulty (e.g., spoken language, written language, and/or psychosocial issues).

In designing an effective treatment protocol, the SLP considers

  • teaching and encouraging the use of self-monitoring strategies to facilitate consistent use of learned skills;
  • collaborating with teachers and other school personnel to support the child and to facilitate his or her access to the academic curriculum; and
  • managing psychosocial factors, including self-esteem issues and bullying (Pascoe et al., 2006).

Transition Planning

Children with persisting speech difficulties may continue to have problems with oral communication, reading and writing, and social aspects of life as they transition to post-secondary education and vocational settings (see, e.g., Carrigg, Baker, Parry, & Ballard, 2015). The potential impact of persisting speech difficulties highlights the need for continued support to facilitate a successful transition to young adulthood. These supports include the following:

  • Transition Planning —the development of a formal transition plan in middle or high school that includes discussion of the need for continued therapy, if appropriate, and supports that might be needed in postsecondary educational and/or vocational settings (IDEA, 2004).
  • Disability Support Services —individualized support for postsecondary students that may include extended time for tests, accommodations for oral speaking assignments, the use of assistive technology (e.g., to help with reading and writing tasks), and the use of methods and devices to augment oral communication, if necessary.

The Americans with Disabilities Act of 1990 (ADA) and Section 504 of the Rehabilitation Act of 1973 provide protections for students with disabilities who are transitioning to postsecondary education. The protections provided by these acts (a) ensure that programs are accessible to these students and (b) provide aids and services necessary for effective communication (U.S. Department of Education, Office for Civil Rights, 2011).

For more information about transition planning, see ASHA's resource on Postsecondary Transition Planning .

Service Delivery

See the Service Delivery section of the Speech Sound Disorders Evidence Map for pertinent scientific evidence, expert opinion, and client/caregiver perspective.

In addition to determining the type of speech and language treatment that is optimal for children with speech sound disorders, SLPs consider the following other service delivery variables that may have an impact on treatment outcomes:

  • Dosage —the frequency, intensity, and duration of service
  • Format —whether a person is seen for treatment one-on-one (i.e., individual) or as part of a group
  • Provider —the person administering the treatment (e.g., SLP, trained volunteer, caregiver)
  • Setting —the location of treatment (e.g., home, community-based, school [pull-out or within the classroom])
  • Timing —when intervention occurs relative to the diagnosis.

Technology can be incorporated into the delivery of services for speech sound disorders, including the use of telepractice as a format for delivering face-to-face services remotely. See ASHA's Practice Portal page on Telepractice .

The combination of service delivery factors is important to consider so that children receive optimal intervention intensity to ensure that efficient, effective change occurs (Baker, 2012; Williams, 2012).

ASHA Resources

  • Consumer Information: Speech Sound Disorders
  • Interprofessional Education/Interprofessional Practice (IPE/IPP)
  • Let's Talk: For People With Special Communication Needs
  • Person- and Family-Centered Care
  • Person-Centered Focus on Function: Speech Sound Disorder [PDF]
  • Phonemic Inventories and Cultural and Linguistic Information Across Languages
  • Postsecondary Transition Planning
  • Selected Phonological Processes (Patterns)

Other Resources

  • Age of Acquisition of English Consonants (Crowe & McLeod, 2020) [PDF]
  • American Cleft Palate–Craniofacial Association
  • English Consonant and Vowel Charts (University of Arizona)
  • Everyone Has an Accent
  • Free Resources for the Multiple Oppositions approach - Adventures in Speech Pathology
  • Multilingual Children's Speech: Overview
  • Multilingual Children's Speech: Intelligibility in Context Scale
  • Multilingual Children's Speech: Speech Participation and Activity Assessment of Children (SPAA-C)
  • Phonetics: The Sounds of American English (University of Iowa)
  • Phonological and Phonemic Awareness
  • Place, Manner and Voicing Chart for English Consonants (Roth & Worthington, 2018)
  • The Development of Phonological Skills (WETA Educational Website)
  • The Speech Accent Archive (George Mason University)

Adler-Bock, M., Bernhardt, B. M., Gick, B., & Bacsfalvi, P. (2007). The use of ultrasound in remediation of North American English /r/ in 2 adolescents. American Journal of Speech-Language Pathology, 16, 128–139.

Altshuler, M. W. (1961). A therapeutic oral device for lateral emission. Journal of Speech and Hearing Disorders, 26, 179–181.

American Speech-Language-Hearing Association. (2016a). Code of ethics [Ethics]. Available from www.asha.org/policy/

American Speech-Language-Hearing Association. (2016b). Scope of practice in speech-language pathology [Scope of Practice]. Available from www.asha.org/policy/

Americans with Disabilities Act of 1990, P.L. 101-336, 42 U.S.C. §§ 12101 et seq.

Anthony, J. L., Aghara, R. G., Dunkelberger, M. J., Anthony, T. I., Williams, J. M., & Zhang, Z. (2011). What factors place children with speech sound disorders at risk for reading problems? American Journal of Speech-Language Pathology, 20, 146–160.

Baker, E. (2012). Optimal intervention intensity. International Journal of Speech-Language Pathology, 14, 401–409.

Baker, E., & Williams, A. L. (2010). Complexity approaches to intervention. In S. F. Warren & M. E. Fey (Series Eds.) & A. L. Williams, S. McLeod, & R. J. McCauley (Volume Eds.), Interventions for speech sound disorders in children (pp. 95–115). Baltimore, MD: Brookes.

Bernthal, J., Bankson, N. W., & Flipsen, P., Jr. (2017). Articulation and phonological disorders: Speech sound disorders in children . New York, NY: Pearson.

Blache, S., & Parsons, C. (1980). A linguistic approach to distinctive feature training. Language, Speech, and Hearing Services in Schools, 11, 203–207.

Blache, S. E., Parsons, C. L., & Humphreys, J. M. (1981). A minimal-word-pair model for teaching the linguistic significant difference of distinctive feature properties. Journal of Speech and Hearing Disorders, 46, 291–296.

Black, L. I., Vahratian, A., & Hoffman, H. J. (2015). Communication disorders and use of intervention services among children aged 3–17 years: United States, 2012 (NCHS Data Brief No. 205). Hyattsville, MD: National Center for Health Statistics.

Bleile, K. (2002). Evaluating articulation and phonological disorders when the clock is running. American Journal of Speech-Language Pathology, 11, 243–249.

Byers Brown, B., Bendersky, M., & Chapman, T. (1986). The early utterances of preterm infants. British Journal of Communication Disorders, 21, 307–320.

Camarata, S. (2010). Naturalistic intervention for speech intelligibility and speech accuracy. In A. L. Williams, S. McLeod, & R. J. McCauley (Eds.), Interventions for speech sound disorders in children (pp. 381–406). Baltimore, MD: Brookes.

Campbell, T. F., Dollaghan, C. A., Rockette, H. E., Paradise, J. L., Feldman, H. M., Shriberg, L. D., . . . Kurs-Lasky, M. (2003). Risk factors for speech delay of unknown origin in 3-year-old children. Child Development, 74, 346–357.

Carrigg, B., Baker, E., Parry, L., & Ballard, K. J. (2015). Persistent speech sound disorder in a 22-year-old male: Communication, educational, socio-emotional, and vocational outcomes. Perspectives on School-Based Issues, 16, 37–49.

Carroll, J. M., Snowling, M. J., Stevenson, J., & Hulme, C. (2003). The development of phonological awareness in preschool children. Developmental Psychology, 39, 913–923.

Catts, H. W., McIlraith, A., Bridges, M. S., & Nielsen, D. C. (2017). Viewing a phonological deficit within a multifactorial model of dyslexia. Reading and Writing, 30, 613–629.

Coplan, J., & Gleason, J. R. (1988). Unclear speech: Recognition and significance of unintelligible speech in preschool children. Pediatrics, 82, 447–452.

Crowe, K., & McLeod, S. (2020). Children's English consonant acquisition in the United States: A review. American Journal of Speech-Language Pathology, 29, 2155–2169. https://doi.org/10.1044/2020_AJSLP-19-00168

Dagenais, P. A. (1995). Electropalatography in the treatment of articulation/phonological disorders. Journal of Communication Disorders, 28, 303–329.

Dean, E., Howell, J., Waters, D., & Reid, J. (1995). Metaphon: A metalinguistic approach to the treatment of phonological disorder in children. Clinical Linguistics & Phonetics, 9, 1–19.

Dodd, B. (2005). Differential diagnosis and treatment of children with speech disorder . London, England: Whurr.

Dodd, B., Holm, A., Crosbie, S., & McIntosh, B. (2006). A core vocabulary approach for management of inconsistent speech disorder. International Journal of Speech-Language Pathology, 8, 220–230.

Eadie, P., Morgan, A., Ukoumunne, O. C., Eecen, K. T., Wake, M., & Reilly, S. (2015). Speech sound disorder at 4 years: Prevalence, comorbidities, and predictors in a community cohort of children. Developmental Medicine & Child Neurology, 57, 578–584.

Elbert, M., & McReynolds, L. V. (1978). An experimental analysis of misarticulating children's generalization. Journal of Speech and Hearing Research, 21, 136–149.

Ertmer, D. J. (2010). Relationship between speech intelligibility and word articulation scores in children with hearing loss. Journal of Speech, Language, and Hearing Research, 53, 1075–1086.

Everhart, R. (1960). Literature survey of growth and developmental factors in articulation maturation. Journal of Speech and Hearing Disorders, 25, 59–69.

Fabiano-Smith, L., & Goldstein, B. A. (2010). Phonological acquisition in bilingual Spanish–English speaking children. Journal of Speech, Language, and Hearing Research, 53, 160–178.

Felsenfeld, S., McGue, M., & Broen, P. A. (1995). Familial aggregation of phonological disorders: Results from a 28-year follow-up. Journal of Speech, Language, and Hearing Research, 38, 1091–1107.

Fey, M. (1986). Language intervention with young children . Boston, MA: Allyn & Bacon.

Flipsen, P. (2006). Measuring the intelligibility of conversational speech in children. Clinical Linguistics & Phonetics, 20, 202–312.

Flipsen, P. (2015). Emergence and prevalence of persistent and residual speech errors. Seminars in Speech and Language, 36, 217–223.

Fox, A. V., Dodd, B., & Howard, D. (2002). Risk factors for speech disorders in children. International Journal of Language and Communication Disorders, 37, 117–132.

Gibbon, F., Stewart, F., Hardcastle, W. J., & Crampin, L. (1999). Widening access to electropalatography for children with persistent sound system disorders. American Journal of Speech-Language Pathology, 8, 319–333.

Gierut, J. A. (1989). Maximal opposition approach to phonological treatment. Journal of Speech and Hearing Disorders, 54, 9–19.

Gierut, J. A. (1990). Differential learning of phonological oppositions. Journal of Speech and Hearing Research, 33, 540–549.

Gierut, J. A. (1992). The conditions and course of clinically induced phonological change. Journal of Speech and Hearing Research, 35, 1049–1063.

Gierut, J. A. (2007). Phonological complexity and language learnability. American Journal of Speech-Language Pathology, 16, 6–17.

Glaspey, A. M., & Stoel-Gammon, C. (2007). A dynamic approach to phonological assessment. Advances in Speech-Language Pathology, 9, 286–296.

Goldstein, B. A., & Fabiano, L. (2007, February 13). Assessment and intervention for bilingual children with phonological disorders. The ASHA Leader, 12, 6–7, 26–27, 31.

Grunwell, P. (1987). Clinical phonology (2nd ed.). London, England: Chapman and Hall.

Hitchcock, E. R., McAllister Byun, T., Swartz, M., & Lazarus, R. (2017). Efficacy of electropalatography for treating misarticulations of /r/. American Journal of Speech-Language Pathology, 26, 1141–1158.

Hodson, B. (2010). Evaluating and enhancing children's phonological systems: Research and theory to practice. Wichita, KS: PhonoComp.

Howell, J., & Dean, E. (1994). Treating phonological disorders in children: Metaphon—Theory to practice (2nd ed.). London, England: Whurr.


About This Content

Acknowledgements

Content for ASHA's Practice Portal is developed through a comprehensive process that includes multiple rounds of subject matter expert input and review. ASHA extends its gratitude to the following subject matter experts who were involved in the development of the Speech Sound Disorders: Articulation and Phonology page:

  • Elise M. Baker, PhD
  • John E. Bernthal, PhD, CCC-A/SLP
  • Caroline Bowen, PhD
  • Cynthia W. Core, PhD, CCC-SLP
  • Sharon B. Hart, PhD, CCC-SLP
  • Barbara W. Hodson, PhD, CCC-SLP
  • Sharynne McLeod, PhD
  • Susan Rvachew, PhD, S-LP(C)
  • Cheryl C. Sancibrian, MS, CCC-SLP
  • Holly L. Storkel, PhD, CCC-SLP
  • Judith E. Trost-Cardamone, PhD, CCC-SLP
  • Lynn Williams, PhD, CCC-SLP

Citing Practice Portal Pages 

The recommended citation for this Practice Portal page is:

American Speech-Language-Hearing Association. (n.d.). Speech Sound Disorders: Articulation and Phonology (Practice Portal). Retrieved month, day, year, from www.asha.org/Practice-Portal/Clinical-Topics/Articulation-and-Phonology/.


Computer Science > Sound

Title: Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling

Abstract: Most of the prevalent approaches in speech prosody modeling rely on learning global style representations in a continuous latent space, which encode and transfer the attributes of reference speech. However, recent work on neural codecs based on Residual Vector Quantization (RVQ) already shows great potential, offering distinct advantages. We investigate the prosody modeling capabilities of the discrete space of such an RVQ-VAE model, modifying it to operate at the phoneme level. We condition both the encoder and decoder of the model on linguistic representations and apply a global speaker embedding in order to factor out both phonetic and speaker information. We conduct an extensive set of investigations based on subjective experiments and objective measures to show that the phoneme-level discrete latent representations obtained this way achieve a high degree of disentanglement, capturing fine-grained prosodic information that is robust and transferable. The latent space turns out to have an interpretable structure, with its principal components corresponding to pitch and energy.
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
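
To make the architecture described in the abstract more concrete, the following is a minimal, illustrative PyTorch sketch of a phoneme-level RVQ-VAE prosody codec: an encoder and a decoder that are both conditioned on phoneme (linguistic) embeddings and on a global speaker embedding, with a residual vector quantizer over the per-phoneme latents. Module sizes, the choice of per-phoneme prosody features, and the training loss are assumptions for illustration only and do not reproduce the authors' implementation.

```python
# Illustrative sketch; hyperparameters and layer choices are assumed, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualVQ(nn.Module):
    """Residual vector quantization: each codebook quantizes the previous stage's residual."""

    def __init__(self, num_quantizers, codebook_size, dim):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_quantizers)
        )

    def forward(self, z):
        residual, quantized, commit_loss = z, torch.zeros_like(z), 0.0
        for cb in self.codebooks:
            # squared distances from the current residual to every codebook entry
            d = (residual.pow(2).sum(-1, keepdim=True)
                 - 2 * residual @ cb.weight.t()
                 + cb.weight.pow(2).sum(-1))            # [B, N, K]
            q = cb(d.argmin(dim=-1))                    # nearest entries, [B, N, D]
            commit_loss = commit_loss + F.mse_loss(residual, q.detach())
            quantized = quantized + q
            residual = residual - q.detach()
        # straight-through estimator so reconstruction gradients reach the encoder
        return z + (quantized - z).detach(), commit_loss


class PhonemeLevelProsodyCodec(nn.Module):
    def __init__(self, n_phonemes=100, d_phon=64, d_spk=64, d_pros=3,
                 d_latent=16, n_quantizers=4, codebook_size=256):
        super().__init__()
        self.phon_emb = nn.Embedding(n_phonemes, d_phon)
        self.encoder = nn.Sequential(
            nn.Linear(d_pros + d_phon + d_spk, 128), nn.ReLU(),
            nn.Linear(128, d_latent))
        self.rvq = ResidualVQ(n_quantizers, codebook_size, d_latent)
        self.decoder = nn.Sequential(
            nn.Linear(d_latent + d_phon + d_spk, 128), nn.ReLU(),
            nn.Linear(128, d_pros))

    def forward(self, prosody, phoneme_ids, spk_emb):
        # prosody: [B, N, d_pros] per-phoneme features (e.g. F0, energy, duration)
        # phoneme_ids: [B, N] linguistic conditioning; spk_emb: [B, d_spk] speaker conditioning
        phon = self.phon_emb(phoneme_ids)
        spk = spk_emb.unsqueeze(1).expand(-1, prosody.size(1), -1)
        z = self.encoder(torch.cat([prosody, phon, spk], dim=-1))
        z_q, commit_loss = self.rvq(z)
        recon = self.decoder(torch.cat([z_q, phon, spk], dim=-1))
        return recon, commit_loss


model = PhonemeLevelProsodyCodec()
prosody = torch.randn(2, 17, 3)                      # two utterances, 17 phonemes each
recon, commit = model(prosody, torch.randint(0, 100, (2, 17)), torch.randn(2, 64))
loss = F.mse_loss(recon, prosody) + 0.25 * commit    # reconstruction + commitment terms
```

Under a setup along these lines, inspecting the learned per-phoneme codes (for instance with PCA) is the kind of analysis the abstract reports as revealing principal components corresponding to pitch and energy.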


COMMENTS

  1. Accent Modification

    Accent modification is an elective service sought by individuals who want to change or modify their speech. Accents are systematic variations in the execution of speech characterized by differences in phonological and/or prosodic features that are perceived as different from any native, standard, regional, or dialectal form of speech (Valles ...

  2. Phonological Awareness and Speech Sound Disorders: What Every SLP

    Gillon G. T. (2005). Facilitating phoneme awareness development in 3- and 4-year-old children with speech impairment. Language, speech, and hearing services in schools, 36(4), 308-324. Hesketh, A., Dima, E., & Nelson, V. (2007). Teaching phoneme awareness to pre-literate children with speech disorder: a randomized controlled trial.

  3. PDF Treatment Strategies for Accent Modification

    Goal of Accent Modification. To communicate effectively. Intelligible. Natural (following general patterns of native speech and pragmatics). NOT to make them indistinguishable from native speakers - "losing" an accent entirely is neither realistic nor necessary. Accent does not equal Disorder.

  4. 14.3 Phonological change

    If a regular sound change decreases the number of phonemes in a language, it is called a phonemic merger or simply merger. The [y] > [i] change was a merger, since /y ...

  6. Phonological change

    In historical linguistics, phonological change is any sound change that alters the distribution of phonemes in a language. In other words, a language develops a new system of oppositions among its phonemes. Old contrasts may disappear, new ones may emerge, or they may simply be rearranged. [1] Sound change may be an impetus for changes in the phonological structures of a language (and likewise ...

  7. Speech modifications in interactive speech: effects of age, sex and

    It has been shown that these speech modifications are modulated by complex interactions between various talker-related (e.g. age, regional accent), listener-related (e.g. hearing acuity) and environment-related factors (e.g. room acoustics, background noise; see for a review). In this study, we focus on a few of these factors, namely, how ...

  8. 4.1 Phonemes and Contrast

    4.1 Phonemes and Contrast. Within a given language, some sounds might have slight phonetic differences from each other but still be treated as the same sound by the mental grammar of that language. A phoneme is a mental category of sounds that includes some variation within the category. The mental grammar ignores that variation and treats all ...

  9. 2.2 The Articulatory System

    There are two basic categories of sound that can be classified in terms of the way in which the flow of air through the vocal tract is modified. Phonemes that are produced without any obstruction to the flow of air are called vowels. Phonemes that are produced with some kind of modification to the airflow are called consonants. Of course ...

  10. PDF Phonetics and Sound Change

    Phonetic considerations have long been hypothesized to play a central role in accounting for the nature of sound change. The Neogrammarian hypothesis: sound change is exceptionless and purely phonetically conditioned; 'sounds change not words'. Suggests that the mechanisms of sound change involve phonetics ...

  11. Changing Words and Sounds: The Roles of ...

    1.2 Phonemes, contexts, and words in sound change 1.2.1 /u/-fronting. Fig. 1 provides a visual summary of the intersecting levels of representation that are particularly relevant to /u/-fronting. The outermost solid box represents the phoneme /u/. The figure shows two contextual realizations of /u/ (dashed boxes): /u/ preceded by /j/ ([ju]) and /u/ in other contexts ([u]).

  12. Difficulty with /r/ and Techniques for Dealing with this Phoneme

    Answer. The /r/ phoneme is one of the most difficult phonemes to remediate for clients with persistent, long term /r/ problems. Identifying the exact nature of the problem with the /r/ production will allow you to choose appropriate remediation strategies for your client. Typical problems with incorrect /r/ productions include: rounding the ...

  13. Fluency Disorders

    A fluency disorder is an interruption in the flow of speaking characterized by atypical rate, rhythm, and disfluencies (e.g., repetitions of sounds, syllables, words, and phrases; sound prolongations; and blocks), which may also be accompanied by excessive tension, speaking avoidance, struggle behaviors, and secondary mannerisms. People with fluency disorders also frequently experience ...

  14. Phoneme

    phoneme, in linguistics, smallest unit of speech distinguishing one word (or word element) from another, as the element p in "tap," which separates that word from "tab," "tag," and "tan." A phoneme may have more than one variant, called an allophone (q.v.), which functions as a single sound; for example, the p's of "pat ...

  15. Phonemic Inventories and Cultural and Linguistic Information Across

    Languages across the world have unique phonemic systems. For individuals learning English as a second language, it is common for the phonemic system of their first language to influence the production of sounds in English. Resources listed below are intended to contribute to foundational awareness of potential cultural and linguistic influences.

  16. Vowels, Vowel Formants and Vowel Modification

    In both singing and speech, optimal vowel phonemes are voiced, and are tense and therefore particularly distinct. Optimal consonants, on the other hand, are voiceless and lax. ... For this reason, most symbols are either Latin or Greek letters, or modifications thereof. The sound values of modified Latin letters can often be derived from those ...

  17. PDF ACCENT REDUCTION

    phonemes. 20. Present pictures, diagrams, or computer simulations of the positions of the articulators to the client as models of the placement needed for specific phonemes. 21. Use phonemes with similar distinctive features to elicit target phonemes (e.g., ask the client to repeat the aspirated /t/ phoneme quickly to produce the /s/ pho-

  18. 1

    Most speech is produced by an air stream that originates in the lungs and is pushed upwards through the trachea (the windpipe) and the oral and nasal cavities. During its passage, the air stream is modified by the various organs of speech. Each such modification has different acoustic effects, which are used for the differentiation of sounds.

  19. A comparison of phonological and articulation-based approaches to

    ABSTRACT. Objectives: The purpose of the current study was to compare the effectiveness of a phonologically based accent modification treatment to an articulation/motor based treatment with Spanish-speaking adult learners of English using a small group model. In addition, the study examined which approach was more effective at treating specific phonological transfers and specific segments and ...

  20. Combining spectral and temporal modification techniques for speech

    Many algorithms proposed for speech modification operate by redistributing speech energy across the spectrum, either locally or from earlier or later portions of the signal. ... In order to determine whether individual consonants or vowels benefitted preferentially from spectral or temporal modification, a phoneme-level analysis of listener ... A minimal code sketch of this kind of spectral energy redistribution follows at the end of this list.

  21. Speech Sound Disorders: Articulation and Phonology

    Articulation disorders focus on errors (e.g., distortions and substitutions) in production of individual speech sounds. Phonological disorders focus on predictable, rule-based errors (e.g., fronting, stopping, and final consonant deletion) that affect more than one sound. It is often difficult to cleanly differentiate between articulation and ...

  22. Modification of phonemes in speech

    Every phoneme displays a vast range of variation in connected speech. Among the different types of variation we distinguish idiolectal /ˈidiəlektl/ ('individual'), diaphonic and allophonic variation. Idiolectal variation embraces the individual peculiarities of articulating sounds, which are caused by the shape and form of the speaker's speech organs and by his articulatory ...

  23. Modification of phonemes in connected speech

    Modification of phonemes in connected speech. Each sound pronounced in isolation has 3 stages in its articulation: a) the organs of speech move to the position which is necessary to pronounce the sound. It's called the initial stage or on-glide. b) the organs of speech are kept for some time in this position - the medial stage, stop stage, the hold.

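As a companion to the spectral-modification idea in item 20 above, here is a minimal, illustrative sketch (not any particular published algorithm): it boosts a mid-frequency band of a speech signal by reweighting its STFT and then rescales the output so that total energy is unchanged, i.e. energy is redistributed across the spectrum rather than added. The band limits, gain, and frame size are arbitrary assumptions.

```python
# Illustrative only: shift speech energy toward the 1-4 kHz band while keeping
# the overall signal energy constant. Band, gain, and frame size are arbitrary.
import numpy as np
from scipy.signal import stft, istft


def redistribute_spectral_energy(x, fs, band=(1000.0, 4000.0), gain_db=6.0):
    f, _, Z = stft(x, fs=fs, nperseg=512)
    gains = np.ones_like(f)
    gains[(f >= band[0]) & (f <= band[1])] = 10 ** (gain_db / 20.0)
    _, y = istft(Z * gains[:, None], fs=fs, nperseg=512)
    y = y[:len(x)]
    # rescale so the modified signal carries the same total energy as the input
    y *= np.sqrt(np.sum(x ** 2) / (np.sum(y ** 2) + 1e-12))
    return y


fs = 16000
x = np.random.randn(fs)                  # stand-in for one second of speech at 16 kHz
y = redistribute_spectral_energy(x, fs)
```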