Why music training may play an essential role in the future of hearing healthcare | Hearing Review August 2014

Intriguing research continues to focus on music, the brain, and music’s potential in honing auditory acuity, including speech-in-noise performance and the enhancement of listening abilities. This article reviews many of these exciting findings and looks at clinical implications for auditory training and aural rehabilitation.

By Nina Kraus, PhD, and Samira Anderson, AuD, PhD

Music has always played an important role in the human experience. Archeologists have discovered bone flutes in Germany dating back to the Stone Age, demonstrating that a music tradition had already been established when humans settled in Europe more than 35,000 years ago.1 Music has well-known effects on mood and can even reduce pain or distress in individuals with serious illnesses.2,3

Beyond its mood-altering effects, other benefits have been ascribed to music, especially in individuals who actively make music. Engagement in music activities is purported to improve skills in many areas, including but not limited to memory, attention, spatio-temporal skills, language, social skills, and mathematical ability.

The OPERA Hypothesis

In the Auditory Neuroscience Laboratory at Northwestern University, we are interested in music as a form of long-term auditory training that induces learning-associated neuroplasticity with benefits that may extend to clinical populations. Our studies are based on the assumption that the effects of music training translate to enhanced neural encoding of speech stimuli.

This assumption is rooted in the tenets of the “OPERA Hypothesis” proposed by Patel4 to explain why music activity benefits language skills. This hypothesis assumes that music training will bring about adaptive plasticity in speech processing under the following conditions:

Overlap: The biological circuitry that processes sound is common to speech and music;
Precision: The neural processing in these shared anatomic networks is more precise for music than for speech;
Emotion: Music activities that engage these networks invoke strong emotions;
Repetition: Musicians engage in frequent, repetitive practice; and
Attention: Focused attention is necessary to achieve progress.

Music Training Extends to Speech Encoding

To directly look for a neural correlate to the known music/speech connection, we used the auditory brainstem response to complex sounds (cABR) to examine the crossover effects of music training on the neural encoding of speech. The cABR has several features that make it particularly suited to this application.

First, the cABR waveform reflects the stimulus waveform with remarkable precision. In fact, if the response waveform is converted to a sound file, it sounds like the original stimulus.5 This similarity between stimulus and response waveforms allows us to compare the encoding of stimulus features, including pitch, timing, and timbre, in musicians and nonmusicians. Second, similar to the click-evoked ABR, the cABR has high test-retest reliability6,7 and is interpretable in individuals.

Finally, the cABR is modulated by past experience. The inferior colliculus, the putative generator of the cABR,8 is the site of multiple ascending and descending neural connections9 and is a hub for auditory learning.10 Therefore, cABR characteristics may be enhanced given enriched opportunities for making sound-to-meaning connections,11 such as one would experience when growing up in a bilingual environment or when learning to play a musical instrument. Conversely, cABR characteristics may be degraded in an individual who has grown up in an impoverished environment with limited auditory enriching experiences.12 In addition to changes occurring from years of experience to languages13,14 or musical training,15-17 more modest changes in the cABR can occur rapidly in as little as weeks or months.18-22

In summary, the cABR’s high stimulus fidelity, reliability, and plasticity make it an ideal vehicle for examining the effects of musical training on auditory processing.

Musacchia et al23 recorded brainstem responses to the syllable [da] and to a cello note in both auditory only and audiovisual conditions in young adults. The audiovisual condition combined the auditory waveforms and visual components (a speaker uttering the syllable [da] and a musician bowing a G note on the cello). Compared to nonmusicians, musicians had enhanced representation of sound in response to both the [da] syllable and the cello note in auditory and audiovisual conditions.

Wong et al24 recorded brainstem responses in musicians and nonmusicians to pitch contours that would be linguistically meaningful in Mandarin Chinese, a tonal language. Previous work by Krishnan et al25 had shown that Mandarin Chinese speakers have more accurate pitch encoding for lexical tones than do speakers of nontonal languages like English. Wong and colleagues found that monolingual, English-speaking musicians had more accurate brainstem pitch tracking than nonmusicians and that the strength of pitch tracking positively correlated with years of music training.

Taken together, these studies demonstrate that the benefits of musical training extend to the neural encoding of speech.

Musician Effects in Young Adults

Does music training affect the ability to hear and understand speech in real-world environments? To answer this question, Parbery-Clark and colleagues26 assessed speech-in-noise performance in young adult musicians and nonmusicians. The musicians started musical instrument training before age 7 and had at least 10 years of consistent musical practice. Nonmusicians had not received any musical training within the 7-year period before the study. Speech-in-noise performance was measured with commonly used clinical measures: the Quick Speech-in-Noise (QuickSIN) test27 and the Hearing in Noise Test (HINT).28 Two related auditory skills, auditory working memory and frequency discrimination, were also tested. Musicians had better scores on the perceptual and cognitive measures, with lower thresholds on the QuickSIN, HINT, and frequency discrimination, and higher scores on auditory working memory.

Parbery-Clark et al15 also evaluated neural encoding of speech in noise using the cABR, recording responses to the speech syllable [da] presented in quiet and in a babble of six talkers. They found that musicians’ responses were more resistant to the degradative effects of noise. Specifically, in noise the musicians had earlier peak latencies and higher spectral amplitudes than nonmusicians. The researchers also cross-correlated the response with the stimulus to obtain a measure of response fidelity and morphology. While the resulting stimulus-to-response correlations were similar in quiet between musicians and nonmusicians, in noise musicians had higher correlations (Figure 1).

Figure 1. Musicians have enhanced auditory skills—better speech-in-noise and higher auditory working memory scores. In quiet, musicians’ brainstem peak latencies and stimulus-to-response correlations are equivalent to those of nonmusicians, but in background noise there is less degradation in the musicians’ responses, resulting in less neural delay and a maintenance of stimulus-to-response correlations. **p < 0.01. Adapted from Parbery-Clark et al.
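
The stimulus-to-response correlation described above can be sketched in a few lines of Python. This is a minimal illustration on synthetic signals, not the analysis pipeline used in the cited study: the function name, the 0 to 12 ms lag search range, the sampling rate, and the noise levels are all assumptions chosen for the example. The idea is to shift the response over a range of plausible neural delays and keep the largest Pearson correlation with the stimulus.

```python
import numpy as np

def stimulus_response_correlation(stimulus, response, fs, max_lag_ms=12.0):
    """Return (best_r, best_lag_ms): the largest Pearson r between the stimulus
    and the response when the response is read starting `lag` samples later.
    The lag range is an assumed search window, not a published parameter."""
    stimulus = np.asarray(stimulus, dtype=float)
    response = np.asarray(response, dtype=float)
    max_lag = int(round(max_lag_ms * fs / 1000.0))
    best_r, best_lag_ms = -np.inf, None
    for lag in range(max_lag + 1):
        # Pair stimulus sample i with response sample i + lag.
        n = min(len(stimulus), len(response) - lag)
        if n < 2:
            break
        r = np.corrcoef(stimulus[:n], response[lag:lag + n])[0, 1]
        if r > best_r:
            best_r, best_lag_ms = r, 1000.0 * lag / fs
    return best_r, best_lag_ms

# Synthetic demo: a 100-Hz harmonic complex stands in for the speech stimulus.
fs = 10_000                                   # samples per second (assumed)
t = np.arange(1000) / fs                      # 100 ms of signal
stimulus = sum(np.sin(2 * np.pi * 100 * k * t) / k for k in range(1, 6))
stimulus = stimulus * np.hanning(len(t))
delay = 80                                    # 8 ms simulated neural delay
rng = np.random.default_rng(0)

def simulate_response(noise_sd):
    """Delayed copy of the stimulus plus Gaussian noise (purely synthetic)."""
    clean = np.concatenate([np.zeros(delay), stimulus])
    return clean + rng.normal(0.0, noise_sd, size=clean.size)

r_quiet, lag_quiet = stimulus_response_correlation(stimulus, simulate_response(0.05), fs)
r_noise, lag_noise = stimulus_response_correlation(stimulus, simulate_response(1.0), fs)
print(f"quiet: r = {r_quiet:.2f} at {lag_quiet:.1f} ms")
print(f"noise: r = {r_noise:.2f} at {lag_noise:.1f} ms")
```

In this toy setup the correlation drops as the simulated noise grows. A response that is more resistant to noise, as reported for musicians, would show a smaller drop.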

Bidelman and Krishnan17 found that musicians’ brainstem responses were also more resistant to the deleterious effects produced by reverberation compared to nonmusicians. The fact that brainstem encoding of speech is more resistant to the effects of noise in musicians suggests a mechanism for their enhanced ability on speech-in-noise tests.

Some have questioned whether this enhancement is due to innate ability rather than to actual training. The correlation between number of years of training and performance on the QuickSIN suggests that the enhancement is due, at least in part, to experience.

Musician Effects in Children and Toddlers

Are these musician advantages also seen in children who have had fewer years of musical training? Strait et al29 recently spearheaded a series of cross-sectional studies that examined the effects of music training in preschool children, school-age children, and young adults. They found that school-age children (ages 7-13) who started music lessons before age 5 and had at least 4 years of musical training had better hearing of speech in noise and higher scores on auditory working memory than nonmusicians.16 They also evaluated speech-in-noise encoding using the speech syllable [da] presented in a 6-talker babble background. They found “musician × condition interactions” for both response fidelity and response timing—meaning that, in musicians, a robust brainstem response is more resistant to the degradative effects of noise (Figure 2). Spectral encoding was also enhanced in musicians for the second through eighth harmonics in both quiet and noise conditions. Finally, relationships were found among neural, perceptual, and cognitive measures.

Figure 2a-c. 2A: Child musicians have better speech-in-noise perception and higher auditory working memory scores; 2B: Child musicians have less response degradation in noise than nonmusicians, reflected in the interaction plot in 2C. *p < 0.05, **p < 0.01. Adapted from Strait et al.
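
The arithmetic behind a group × condition interaction can be shown with a difference of differences. The sketch below uses made-up stimulus-to-response correlation values purely for illustration; the function names and numbers are hypothetical, and a real analysis would test the interaction statistically (for example, with a repeated-measures ANOVA) rather than compare means alone.

```python
from statistics import mean

def degradation(quiet_scores, noise_scores):
    """Mean drop in a response measure from quiet to noise for one group."""
    return mean(quiet_scores) - mean(noise_scores)

def interaction_contrast(musicians, nonmusicians):
    """Difference of differences: how much more the nonmusician group
    degrades in noise than the musician group. Each argument is a
    (quiet_scores, noise_scores) pair."""
    return degradation(*nonmusicians) - degradation(*musicians)

# Hypothetical per-child values (NOT data from the cited studies).
musicians = ([0.30, 0.32, 0.28], [0.26, 0.27, 0.25])
nonmusicians = ([0.30, 0.31, 0.29], [0.18, 0.20, 0.16])

contrast = interaction_contrast(musicians, nonmusicians)
print(f"musician drop:    {degradation(*musicians):.2f}")
print(f"nonmusician drop: {degradation(*nonmusicians):.2f}")
print(f"interaction contrast: {contrast:.2f}")
```

A contrast near zero would mean both groups degrade equally in noise; a positive contrast means the nonmusician group loses more, which is the pattern an interaction plot such as Figure 2C depicts.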

Higher spectral amplitudes were found in children with better hearing in noise and in children with higher auditory working memory and attention scores, suggesting a possible mechanism for improved auditory function in musician children. Music training requires active engagement with sound and lays a foundation for strong sound-to-meaning connections.11 The cognitive functions of attention and memory are activated as the musician focuses attention on the fine details of sound and memorizes musical passages. Cognitive-sensory interactions promote neural plasticity and enhanced processing for meaningful stimuli.
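
For readers unfamiliar with how spectral amplitudes at individual harmonics are measured, the following is a minimal sketch using a Fourier transform on a synthetic signal. It is not the procedure from the cited studies: the function name, the 100-Hz fundamental, the ±5 Hz search band, the Hann window, and the sampling rate are assumptions made for the example.

```python
import numpy as np

def harmonic_amplitudes(response, fs, f0, harmonics=range(2, 9), half_width_hz=5.0):
    """Peak spectral amplitude near each harmonic of f0.
    Returns {harmonic_number: amplitude}. The default covers the second
    through eighth harmonics; the band width is an assumed value."""
    response = np.asarray(response, dtype=float)
    window = np.hanning(len(response))
    # Scale so that a sinusoid of amplitude A shows a spectral peak of about A.
    spectrum = np.abs(np.fft.rfft(response * window)) * 2.0 / np.sum(window)
    freqs = np.fft.rfftfreq(len(response), d=1.0 / fs)
    amps = {}
    for h in harmonics:
        band = (freqs >= h * f0 - half_width_hz) & (freqs <= h * f0 + half_width_hz)
        amps[h] = float(spectrum[band].max())
    return amps

# Synthetic "response": harmonics 1 through 8 of 100 Hz, amplitude 1/h each.
fs = 8000
t = np.arange(4000) / fs                      # 0.5 s of signal
f0 = 100.0
response = sum(np.sin(2 * np.pi * f0 * h * t) / h for h in range(1, 9))

amps = harmonic_amplitudes(response, fs, f0)
for h, a in amps.items():
    print(f"H{h} ({int(h * f0)} Hz): {a:.3f}")
```

Comparing such per-harmonic amplitudes between groups, in quiet and in noise, is one way to quantify the kind of spectral encoding difference described above.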

These childhood musician advantages can occur with a relatively short period of training. Preschool children (ages 3 to 5) who had 1 or more years of music lessons have earlier brainstem peak latencies in quiet and in noise and reduced timing delays in noise compared to children who have had no music experience.29 However, no differences were noted for stimulus-to-response correlations or spectral amplitudes, suggesting that further training and development is required for these differences to emerge.

A subset of these children returned 1 year later, allowing for a longitudinal analysis of the data. Peak onset amplitude is particularly compromised in noise, and musician children had more noise-resistant onset responses after 1 year of training while no changes were noted in children with no music lessons. Although the possibility of an innate predisposition cannot be ruled out, these data support the malleability of biological speech processing with training. And some of these neural benefits can be seen in infants after just 6 months of music participation.30

Musician Effects in Older Adults

Given knowledge that musical training confers advantages for speech processing in toddlers, children, and young adults, do these benefits extend to older adults? Older adults often report hearing-in-noise difficulties, even when audiometric thresholds are normal.31,32 These difficulties may arise from age-related temporal processing deficits that have been documented using behavioral33 and electrophysiological34-36 studies.

Figure 3a-c. Musicianship offsets delayed neural timing in older adults. 3A: Stimulus waveform marked with three time regions: onset, transition, and steady state. 3B: Older musicians have delayed onsets, but there are no delays for the transition or steady state relative to younger musicians. 3C: Older nonmusicians have delayed peak latencies for both the onset and the transition regions, but not the steady state. ~p < 0.10, *p < 0.05, **p < 0.01. Adapted from Parbery-Clark et al.

In a comparison of younger and older musicians and nonmusicians with normal hearing, we found that musicianship offsets age-related temporal processing deficits, at least in part.37 We identified peak latencies in the brainstem responses to speech syllables and found that, although the onset peak latency was significantly delayed in both older musicians and older nonmusicians, the peak latencies for the consonant-vowel transition were not significantly delayed in the older musicians (Figure 3). This region of the syllable is vulnerable in noise,38,39 and peak latencies in this region are selectively delayed in older adults.34 An advantage for speech-in-noise perception was also found: older musicians have better speech-in-noise performance (QuickSIN) and better auditory working memory than older nonmusicians.40,41
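
To make the notion of a peak latency delay concrete, here is a small sketch that finds the largest peak inside an expected time window and compares two responses. The waveforms are synthetic single peaks, and the function name, window limits, and sampling rate are assumptions for illustration only; peak identification in actual recordings typically involves expert review as well.

```python
import numpy as np

def peak_latency_ms(response, fs, window_ms):
    """Latency (ms) of the largest positive peak within window_ms = (start, stop),
    measured from the first sample of the response."""
    response = np.asarray(response, dtype=float)
    start = int(round(window_ms[0] * fs / 1000.0))
    stop = int(round(window_ms[1] * fs / 1000.0))
    idx = start + int(np.argmax(response[start:stop + 1]))
    return 1000.0 * idx / fs

def synthetic_peak(fs, peak_ms, n_samples=600, width_ms=0.4):
    """Single Gaussian-shaped peak centered at peak_ms (a toy waveform)."""
    t_ms = 1000.0 * np.arange(n_samples) / fs
    return np.exp(-((t_ms - peak_ms) / width_ms) ** 2)

fs = 20_000                                   # samples per second (assumed)
younger = synthetic_peak(fs, peak_ms=9.0)     # hypothetical onset peak
older = synthetic_peak(fs, peak_ms=9.5)       # same peak, 0.5 ms later

window = (8.0, 11.0)                          # assumed search window in ms
delay_ms = peak_latency_ms(older, fs, window) - peak_latency_ms(younger, fs, window)
print(f"latency delay: {delay_ms:.2f} ms")
```

The same comparison can be repeated for peaks in the onset, transition, and steady-state regions to see where delays occur.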

Childhood Music and the Adult Brain

These neural enhancements for speech-in-noise performance were found in older adults who had been regularly playing music all their lives. But what about those who played an instrument while in grade school or high school but then stopped practicing when life got busy? Is there any prophylactic effect from the typical music training that occurs in childhood?

We recorded brainstem responses to the speech syllable /da/ presented in quiet and in two-talker babble in three groups of older adults: 1) those who had no musical training at any point in their lives; 2) those who had 1 to 3 years of training; and 3) those who had 4 to 14 years of training. We compared peak latencies in these groups and found that the older adults with more years of training had earlier peak latencies than those who had little or no training. These initial music training experiences may set the stage or prime the auditory system to benefit from subsequent auditory experiences outside of the music context.42

These results provide powerful support for the provision of music education as part of the regular school curriculum.

Clinical Implications

Audiologists are not in the habit of considering music expertise in the context of audiological diagnosis and management, so what are the clinical implications of these findings?

Because musicians have enhanced auditory skills and experience with sound, they may be more sensitive to subtle declines in abilities that may not be readily apparent on a typical audiological examination. For example, a musician might report having trouble hearing in noise but have a QuickSIN score that indicates a normal or near-normal degree of SNR loss; nevertheless, this score might be quite a bit worse than his performance at a younger age. So, his decline in performance would be quite noticeable to him, even though it wouldn’t raise concerns in a hearing care professional not sensitive to his background. Therefore, as clinicians, we need to be attentive to these concerns and to consider counseling and management options for the decline in performance.
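
The scenario above can be illustrated with the QuickSIN scoring arithmetic. The sketch assumes the commonly described scoring rule (SNR loss = 25.5 minus the number of key words correct out of 30 on a list) and the commonly cited interpretation categories; both should be checked against the test manual before clinical use. The patient scores are hypothetical.

```python
def quicksin_snr_loss(words_correct):
    """SNR loss in dB for one QuickSIN list, assuming the rule
    SNR loss = 25.5 - total key words correct (0 to 30)."""
    if not 0 <= words_correct <= 30:
        raise ValueError("words_correct must be between 0 and 30")
    return 25.5 - words_correct

def snr_loss_category(snr_loss_db):
    """Interpretation bands as commonly cited (assumed here; boundary
    handling is a choice made for this example)."""
    if snr_loss_db <= 3:
        return "normal/near normal"
    if snr_loss_db <= 7:
        return "mild"
    if snr_loss_db <= 15:
        return "moderate"
    return "severe"

# Hypothetical musician: 27 words correct years ago, 23 words correct today.
earlier = quicksin_snr_loss(27)
current = quicksin_snr_loss(23)
change_db = current - earlier

print(f"earlier: {earlier} dB ({snr_loss_category(earlier)})")
print(f"current: {current} dB ({snr_loss_category(current)})")
print(f"change:  {change_db} dB")
```

Both scores fall in the same category, yet the hypothetical patient has lost 4 dB of performance. A category label alone would hide a decline that the patient is likely to notice.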

Another concern for hearing care professionals is the provision of amplification suitable for the appreciation of music. Hearing aids are designed to maximize speech audibility, but this approach is not optimal for music. Speech and music have several acoustical differences (see the articles by Beck and Chasin in this edition): The duration of musical notes is much longer than that of speech phonemes and there are slower changes in pitch than in speech. Therefore, the typical attack and release times used for amplification of speech may not be appropriate for music listening.43 Most hearing aids address this problem through the use of dedicated “music” programs, but these programs may not always produce satisfactory results—especially for musicians. A better understanding of the differences in neural processing between musicians and nonmusicians may help us to address listening challenges in our musician patients.

Finally, although the effects of music training initiated in older adulthood have not yet been documented, we do know that short-term computer-based training can improve subcortical neural timing and speech-in-noise performance in older adults.21 Given the crossover benefits stated in the OPERA hypothesis, a music-based auditory training program may provide significant benefits for neural processing and speech perception in older adults. Because music can be intrinsically reinforcing, it may provide sufficient motivation for perseverance with the training.

Acknowledgments

This work is supported by the National Science Foundation (NSF BCS-0921275; 0842376) and the Knowles Hearing Center.

References


1. Conard NJ, Malina M, Münzel SC. New flutes document the earliest musical tradition in southwestern Germany. Nature. 2009;460(7256):737-740.

2. Stanczyk MM. Music therapy in supportive cancer care. Rep Pract Oncol Radiother. 2011;16(5):170-172.

3. Hartling L, et al. Music to reduce pain and distress in the pediatric emergency department: a randomized clinical trial. JAMA Pediatrics. 2013;167(9):826-835.

4. Patel AD. Why would musical training benefit the neural encoding of speech? The OPERA hypothesis. Front Psychol. 2011;2:142.

5. Galbraith GC, et al. Intelligible speech encoded in the human brain stem frequency-following response. Neuroreport. 1995;6(17):2363-2367.

6. Song JH, Nicol T, Kraus N. Test–retest reliability of the speech-evoked auditory brainstem response. Clin Neurophysiol. 2011;122(2):346-355.

7. Hornickel J, Knowles E, Kraus N. Test-retest consistency of speech-evoked auditory brainstem responses in typically-developing children. Hear Res. 2012;284(1-2):52-58.

8. Chandrasekaran B, Kraus N. The scalp-recorded brainstem response to speech: neural origins and plasticity. Psychophysiology. 2010;47(2):236-246.

9. Gao E, Suga N. Experience-dependent plasticity in the auditory cortex and the inferior colliculus of bats: role of the corticofugal system. Proc Natl Acad Sci-USA. 2000;97(14):8081-8086.

10. Bajo VM, King AJ. Cortical modulation of auditory processing in the midbrain. Frontiers in Neural Circuits. 2013;6:114. Available at: http://journal.frontiersin.org/Journal/10.3389/fncir.2012.00114/full

11. Kraus N, Chandrasekaran B. Music training for the development of auditory skills. Nat Rev Neurosci. 2010;11(8):599-605.

12. Skoe E, Krizman J, Kraus N. The impoverished brain: disparities in maternal education affect the neural response to sound. J Neurosci. 2013;33(44):17221-17231.

13. Krishnan A, Gandour JT, Bidelman GM. The effects of tone language experience on pitch processing in the brainstem. J Neurolinguistics. 2010;23(1):81-95.

14. Krizman J, et al. Subcortical encoding of sound is enhanced in bilinguals and relates to executive function advantages. Proc Natl Acad Sci-USA. 2012;109(20):7877-7881.

15. Parbery-Clark A, Skoe E, Kraus N. Musical experience limits the degradative effects of background noise on the neural processing of sound. J Neurosci. 2009;29(45):14100-14107.

16. Strait DL, et al. Musical training during early childhood enhances the neural encoding of speech in noise. Brain Lang. 2012;123(3):191-201.

17. Bidelman GM, Krishnan A. Effects of reverberation on brainstem representation of speech in musicians and non-musicians. Brain Res. 2010;1355:112-125.

18. Song JH, et al. Plasticity in the adult human auditory brainstem following short-term linguistic training. J Cognitive Neurosci. 2008;20(10):1892-1902.

19. Song JH, et al. Training to improve hearing speech in noise: biological mechanisms. Cereb Cortex. 2012;22:1180-1190.

20. Carcagno S, Plack C. Subcortical plasticity following perceptual learning in a pitch discrimination task. J Assoc Res Otolaryngol. 2011;12(1):89-100.

21. Anderson S, et al. Reversal of age-related neural timing delays with training. Proc Natl Acad Sci-USA. 2013;110(11):4357-4362.

22. Anderson S, et al. Training changes processing of speech cues in older adults with hearing loss. Front Syst Neurosci. 2013;7:97.

23. Musacchia G, et al. Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proc Natl Acad Sci-USA. 2007;104(40):15894-15898.

24. Wong PCM, et al. Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neurosci. 2007;10(4):420-422.

25. Krishnan A, et al. Encoding of pitch in the human brainstem is sensitive to language experience. Cogn Brain Res. 2005;25(1):161-168.

26. Parbery-Clark A, et al. Musician enhancement for speech-in-noise. Ear Hear. 2009;30(6):653-661.

27. Wilson RH, McArdle RA, Smith SL. An evaluation of the BKB-SIN, HINT, QuickSIN, and WIN materials on listeners with normal hearing and listeners with hearing loss. J Speech Lang Hear Res. 2007;50(4):844-856.

28. Nilsson M, Soli S, Sullivan JA. Development of the Hearing In Noise Test for the measurement of speech reception thresholds in quiet and in noise. J Acoust Soc Am. 1994;95(2):1085-1099.

29. Strait DL, et al. Musicians’ enhanced neural differentiation of speech sounds arises early in life: developmental evidence from ages 3 to 30. Cerebral Cortex. April 18, 2013. Available at: http://cercor.oxfordjournals.org/content/early/2013/04/17/cercor.bht103.full

30. Gerry D, Unrau A, Trainor LJ. Active music classes in infancy enhance musical, communicative and social development. Developmental Science. 2012;15(3):398-407. Available at: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-7687.2012.01142.x/abstract

31. Hargus SE, Gordon-Salant S. Accuracy of speech intelligibility index predictions for noise-masked young listeners with normal hearing and for elderly listeners with hearing impairment. J Speech Hear Res. 1995;38(1):234-243.

32. Souza P, et al. Prediction of speech recognition from audibility in older listeners with hearing loss: effects of age, amplification, and background noise. J Am Acad Audiol. 2007;18:54-65.

33. Gordon-Salant S, Fitzgibbons PJ, Friedman SA. Recognition of time-compressed and natural speech with selective temporal enhancements by young and elderly listeners. J Speech Lang Hear Res. 2007;50(5):1181-1193.

34. Anderson S, et al. Aging affects neural precision of speech encoding. J Neurosci. 2012;32(41):14156-14164.

35. Tremblay K, Piskosz M, Souza P. Effects of age and age-related hearing loss on the neural representation of speech cues. Clin Neurophysiol. 2003;114:1332-1343.

36. Harris KC, et al. Age-related differences in gap detection: effects of task difficulty and cognitive ability. Hear Res. 2010;264(1-2):21-29.

37. Parbery-Clark A, et al. Musical experience offsets age-related delays in neural timing. Neurobiology of Aging. 2012;33(7):1483. doi:10.1016/j.neurobiolaging.2011.12.015.

38. Miller GA, Nicely PE. An analysis of perceptual confusions among some English consonants. J Acoust Soc Am. 1955;27(2):338-352.

39. Anderson S, et al. Neural timing is linked to speech perception in noise. J Neurosci. 2010;30(14):4922-4926.

40. Parbery-Clark A, et al. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise. PLoS ONE. 2011;6(5):e18082.

41. Zendel BR, Alain C. Musicians experience less age-related decline in central auditory processing. Psychol Aging. 2012;27:410-417.

42. Chasin M, Hockley NS. Some characteristics of amplified music through hearing aids. Hear Res. 2014;308:2-12.

Original citation for this article: Kraus N, Anderson S. Music benefits across lifespan: Enhanced processing of speech in noise. Hearing Review. 2014;21(8):18-21.