Language and Music as Cognitive Systems

Patrick Rebuschat, Martin Rohrmeier, John A. Hawkins, and Ian Cross

Print publication date: 2011

Print ISBN-13: 9780199553426

Published to Oxford Scholarship Online: January 2012

DOI: 10.1093/acprof:oso/9780199553426.001.0001

Chapter 28 Human subcortical auditory function provides a new conceptual framework for considering modularity

Erika Skoe

Nina Kraus

Oxford University Press

Abstract and Keywords

This chapter comments on the discussion in Chapter 27. It argues that the modularity debate has centred on the cortex, and seeks to widen the theoretical discussion of modularity by presenting research on the role of subcortical structures in auditory processing. It suggests that human subcortical auditory function can provide a new conceptual framework for studying modularity.

Keywords: modularity, cortex, subcortical structures, auditory processing, human auditory function

The central theme of Isabelle Peretz's large body of scientific work is that language and music each employ a unique set of neural substrates and networks. Peretz's extensive work with congenital amusia suggests that while pitch is common to speech and music, aspects of pitch processing are unique to the domain of music. While her focus in this book is on vocal production and not perception, her argument takes a wider scope and extends to the general question of modularity. Peretz, by her own account, is not a radical modularist; she does not argue that all aspects of language and music processing are modular. Instead, she acknowledges that language and music are indirectly associated or partially overlapping functions, while still defending the position that music processing invokes specialized operations. Our commentary will spotlight a few key aspects of her argument, in addition to addressing several of her secondary remarks. Our intention here is to expand the theoretical discussion of modularity by presenting work from our laboratory¹ and others that refocuses the debate on neural structures outside the cortex.

The debate over the modularity of language and music has been largely cortically centred. The reason is simple: until recently, little was known about how subcortical structures process behaviourally-relevant signals such as speech and music and how subcortical tuning could be affected by auditory experiences.² Subcortical structures, and their interaction with cortical processes, provide a new vehicle for answering old questions, including offering a new perspective on the modularity of music and speech. Zatorre and Gandour (2008) echo this judgement in their review article, which is directed at reconciling the seemingly contradictory view that speech can invoke both domain-specific mechanisms and general-purpose mechanisms. They write:

The key to reconciling these phenomena probably lies in understanding the interactions between afferent pathways that carry stimulus information, with top-down processing interactions that modulate these processes (p. 1087).

There are many parallels between the arguments laid out by Zatorre and Gandour and Patel's resource-sharing framework. For Patel, music and language involve domain-specific representations, which under certain circumstances invoke identical neural substrates and functional networks. Because Peretz and Patel are both featured in this volume and because Peretz dedicates a section of her article to the resource-sharing model, this manuscript will also serve as an indirect commentary on Patel's framework.

In this commentary, we will review data from clinical populations (language-based learning disorders (LD), autism spectrum disorders, and amusia), and expert populations (musicians and speakers of tonal languages),³ that speak to domain specificity, resource sharing or their union. As an organizing structure for this commentary, section headings represent summaries or direct excerpts from Peretz's chapter.

‘Music specificity should be examined for each subsystem or processing component’

The perception and production of speech and music invoke a vast overlapping network of neural structures that must operate independently and in tandem to achieve normal, non-disordered results. Modularity, therefore, could exist at multiple stages of auditory processing, and as Peretz suggests, a closer examination of each level is in order. To that end, we will be putting subcortical structures (specifically the brainstem) under the proverbial microscope to examine their roles in the modularity of speech and music.

The auditory brainstem, an ensemble of subcortical nuclei belonging to the efferent and afferent auditory systems, has an obligatory role in the processing of all sounds entering the cochlea, making it a common way station in both speech and music processing. Brainstem function can be appraised using scalp electrodes that detect electrophysiological potentials generated by the synchronous activity of populations of subcortical neurons (Chandrasekaran & Kraus, 2009; Skoe & Kraus, 2010). This response, known as the auditory brainstem response (ABR), provides a means for objectively and non-invasively studying the neural encoding of speech and music, because the ABR represents the acoustic properties of the sound stimulus (e.g. pitch, timing, and timbre) with extraordinary fidelity (Galbraith, Arbagey, Branski, Comerci, & Rector, 1995; Akhoun et al., 2008; Skoe & Kraus, 2010). In fact, ABRs look like (see Figure 28.1), and when converted to an audio signal and played back, sound like, the evoking stimulus (Galbraith et al., 1995; Skoe & Kraus, 2010). By comparison, functional magnetic resonance imaging (fMRI) and cortical-evoked electrophysiological responses provide a more abstract representation of the evoking stimulus. Moreover, brainstem function can be meaningfully evaluated in an individual subject. Another appealing aspect is that ABRs to speech and music are correlated with performance on tests of speech and music perception (Musacchia, Strait, & Kraus, 2008; Chandrasekaran, Hornickel, Skoe, Nicol, & Kraus, 2009; Bidelman & Krishnan, 2009; Hornickel, Skoe, Nicol, Zecker, & Kraus, 2009b), yet can be elicited passively, without the subject performing a task that unavoidably engages high-order domain-general cognitive functions (e.g. attention and auditory memory). For more theoretical and methodological insights, see Galbraith (2008), Chandrasekaran and Kraus (2009), Kraus and Chandrasekaran (2010), and Skoe and Kraus (2010).

Fig. 28.1 Domain-specificity of the ABR: Subcortical representation of speech is selectively impaired in children with language-based learning disorders (LD). Auditory brainstem response to a 40-ms syllable ‘da’ in LD (grey) and normal learning (NL) children (black). Top: time domain waveforms with characteristic response peaks labelled. The timing of the LD response is delayed relative to the NLs. The stimulus waveform is plotted above to illustrate the strong visual coherence between the stimulus and response waveforms. Bottom: response spectra (frequency domain waveforms). Spectral peaks corresponding to the first formant of the speech stimulus are significantly reduced in LDs. In contrast, the neural representation of the stimulus pitch (i.e. fundamental frequency) is not abnormal in LD children. Adapted from Banai, K., Hornickel, J., Skoe, E., Nicol, T., Zecker, S., & Kraus, N. Reading and subcortical auditory function. Cerebral Cortex, 19(11), pp. 2699–707. © 2009, Oxford University Press, with permission.
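As a rough illustration of how response spectra of the kind shown in the bottom panel of Fig. 28.1 can be quantified, the sketch below measures the mean spectral amplitude of a waveform in narrow bands around a fundamental frequency and a higher, formant-range frequency. It is a minimal sketch and not the analysis used in the cited studies: the synthetic waveform, the sampling rate, the chosen frequencies and the helper name `spectral_amplitude` are all assumptions made for illustration.

```python
import numpy as np

def spectral_amplitude(signal, fs, target_hz, half_width_hz=12.0):
    """Mean FFT amplitude within +/- half_width_hz of target_hz (hypothetical helper)."""
    n = len(signal)
    windowed = signal * np.hanning(n)           # taper to limit spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= target_hz - half_width_hz) & (freqs <= target_hz + half_width_hz)
    return spectrum[band].mean()

# Synthetic stand-in for an averaged response (assumed values, not real data):
# a strong 100 Hz fundamental, a weaker 700 Hz component, and background noise.
fs = 20000                                      # assumed sampling rate in Hz
t = np.arange(4000) / fs                        # 200 ms of samples
rng = np.random.default_rng(0)
response = (1.0 * np.sin(2 * np.pi * 100 * t)
            + 0.4 * np.sin(2 * np.pi * 700 * t)
            + 0.2 * rng.standard_normal(t.size))

f0_amp = spectral_amplitude(response, fs, 100)        # pitch-related energy
formant_amp = spectral_amplitude(response, fs, 700)   # formant-range energy
control_amp = spectral_amplitude(response, fs, 1500)  # band containing only noise

print(f0_amp, formant_amp, control_amp)
```

Comparing such band amplitudes between groups is one simple way to express a selective pattern, for example reduced formant-range energy alongside intact energy at the fundamental.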

Galbraith was the first to recognize the dynamic nature of sensory processing in the human brainstem (Galbraith et al., 1995), and he laid the foundation for much of the ensuing research (Banai & Kraus, 2008; Chandrasekaran et al., 2009). There is now a wealth of evidence indicating that the brainstem is sensitive to linguistic and musical information. In addition to showing that musicians have heightened subcortical representations of music (Musacchia, Sams, Skoe, & Kraus, 2007), including features that are relevant for melody recognition (Lee, Skoe, Kraus, & Ashley, 2009), ABRs from non-musicians provide evidence that hierarchical representations of musical pitch (i.e. consonance-dissonance continuum) are rooted in subcortical processing (Bidelman & Krishnan, 2009). However, language-specific processes are also evident subcortically. For example, compared to non-speech signals, speech evokes larger ABR amplitudes (Galbraith et al., 1995) and elicits more robust group differences (Musacchia et al., 2007; Swaminathan, Krishnan, & Gandour, 2008). Likewise, we see that language experience primes brainstem tuning in a highly specific manner, as evidenced by enhanced pitch tracking in tonal language speakers for behaviourally-relevant, prototypical linguistic pitch contours (Krishnan, Xu, Gandour, & Cariani, 2005; Xu, Krishnan, & Gandour, 2006). The notion that language-dependent operations might have subcortical origins is also supported by research from Hornickel, Skoe, and Kraus (2009a), who found that the temporal and formant-related elements of the speech signal are preferentially encoded in the right-ear stimulated ABR, but that pitch is not. This finding is consistent with an animal model (King, Nicol, McGee, & Kraus, 1999) and work in the human auditory periphery (Sininger & Cone-Wesson, 2004) and brainstem (Levine, Liederman, & Riley, 1988; Sininger & Cone-Wesson, 2006). Taken together, this line of research suggests that lateralization of speech occurs outside the cortex.

The association between brainstem activity and language function is also evident when language is impaired. A subset of children with language-based learning disorders presents with irregular subcortical representations of timing and speech formants despite normal pitch representations (Figure 28.1) (Wible, Nicol, & Kraus, 2005; Banai et al., 2009; Hornickel et al., 2009b). This pattern is consistent with the phonological processing problems inherent to reading disorders, and the influential theory that reading impairments are symptomatic of a more generalized timing disorder (Tallal et al., 1996; Corriveau & Goswami, 2009). This dissociation of pitch and temporal processing may also explain why children with language disabilities perform poorly on musical timing tasks, but not pitch tasks (Overy, Nicolson, Fawcett, & Clarke, 2003). Work is underway in our laboratory to more fully understand the relationships between literacy and music aptitude and how they relate to behavioural and neurophysiological patterns in this population (see also Anvari, Trainor, Woodside, & Levy, 2002; Tallal & Gaab, 2006).

Domain-transfer effects between speech and music do not necessarily implicate a direct interaction. They may be mediated by a third party such as ‘executive function, domain-general attentional or corticofugal influences’

Work from our laboratory fits within the larger scientific literature showing that musical experience has profound effects on the nervous system that extend beyond music. In addition to demonstrating domain-transfer effects between music and speech in vocal production (Stegemöller, Skoe, Nicol, Warrier, & Kraus, 2008), our research has revealed that musical training is associated with enhanced subcortical representation of linguistic stimuli (Figure 28.2) under both optimal (Musacchia et al., 2007; Wong et al., 2007; see also Bidelman, Gandour, & Krishnan, 2011) and suboptimal listening conditions (i.e. background noise) (Parbery-Clark, Skoe, & Kraus, 2009b; see also Bidelman & Krishnan, 2010). Importantly, these subcortical enhancements are not specific to linguistic stimuli and occur for non-linguistic, emotionally-rich vocal sounds as well (Strait, Kraus, Skoe, & Ashley, 2009). This transfer effect may be an indirect outcome of shared subcortical resources for speech and music. Musical experiences may enhance how music is represented in the brainstem (Musacchia et al., 2007; Bidelman et al., 2011; Lee et al., 2009) and, as a by-product, other behaviourally-relevant signals are also fine-tuned. This link between music and speech is reinforced by correlational analysis showing that subcortical enhancements of speech and other vocal signals vary as a function of the extent and onset of musical experience (Musacchia et al., 2007; Wong et al., 2007; Strait et al., 2009).

Fig. 28.2 Evidence of domain-transfer effects: musical training is associated with enhanced subcortical representation of speech. Grand average brainstem responses to a 350-ms speech syllable ‘da’ for both musician (grey) and non-musician (black) groups. Top: amplitude differences between the groups are evident over the entire response waveform. Inset: Musicians exhibit faster (i.e. earlier) onset responses. The large response negativity (circled region) occurs on average ∼0.50 ms earlier for musicians. Bottom: musicians have more robust amplitudes of the fundamental frequency (100 Hz) and harmonics (e.g. 300, 400, 500 Hz). Adapted from Musacchia, G., Sams, M., Skoe, E., & Kraus, N. (2007). Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proceedings of the National Academy of Sciences of the USA, 104(40), pp. 15894–98.

We have previously argued that this context-general subcortical tuning results from the functional interplay between subcortical structures and high-order cognitive processes, with the corticofugal system serving as the biological mediator of this reciprocal communication (Musacchia et al., 2007; Wong et al., 2007; Parbery-Clark et al., 2009b; Kraus & Chandrasekaran, 2010; for a discussion of these reciprocal processes in an animal model see Malmierca, Cristaudo, Perez-Gonzalez, & Covey, 2009). This argument is grounded in the fact that the auditory system is a two-way street. In addition to the vast tract of ascending fibres from the cochlea to the cortex, there is also an extensive chain of descending fibres linking higher neural structures to lower ones (Winer, 2006; Kral & Eggermont, 2007). In the animal model, the corticofugal system works to fine-tune subcortical auditory processing of behaviourally-relevant sounds (Luo, Wang, Kashani, & Yan, 2008) by binding together learned representations and the neural transcription of specific acoustic features. This can lead to short-term plasticity and eventually long-term reorganization of subcortical sound encoding (for a review, see Suga, Xiao, Ma, & Ji, 2002). The direct involvement of the corticofugal system in human auditory processing has been demonstrated by Perrot and his colleagues (Perrot et al., 2006), who showed that stimulating the auditory cortex results in suppressed contralateral cochlear emissions. Moreover, the putative role of the corticofugal system in shaping human subcortical activity can also be inferred from training studies (Russo, Nicol, Zecker, Hayes, & Kraus, 2005; de Boer & Thornton, 2008; Song, Skoe, Wong, & Kraus, 2008b; Carcagno & Plack, 2011; Song, Skoe, Banai, & Kraus, 2011) and experimental paradigms employing contralateral noise stimulation (Micheyl, Khalfa, Perrot, & Collet, 1997; Perrot, Micheyl, Khalfa, & Collet, 1999). Thus, corticofugal modulation can be seen as a powerful mechanism for guiding neural plasticity, driving language-specific subcortical enhancements, and bolstering transfer effects from music to speech, and possibly vice versa. Likewise, the subcortical deficits we find in children with language impairments (e.g. dyslexia and autism) may arise from faulty or suboptimal corticofugal engagement of auditory activity (Russo et al., 2008; Song, Banai, & Kraus, 2008a; Banai et al., 2009; Strait & Kraus, in press).

While we do see strong correlations between subcortical processing of speech and music, we cannot use the currently available data or methodology to pinpoint whether the putative corticofugal connections between speech and music are direct, or whether they are driven entirely (or even partially) by attention, memory, or other general-purpose cortical functions. The link between brainstem malleability and general cognitive processes is, in fact, a well-grounded idea. For instance, subcortical structures are likely governed by attentional state (Galbraith & Doan, 1995; Galbraith et al., 1998, 2003; Rinne et al., 2008) and shaped by multi-sensory integration (Musacchia et al., 2006, 2007). Thus, because memory, abstract reasoning and attention are common to both speech and music, and because selective auditory attention may mediate broad and long-term learning (Moore, Rosenberg, & Coleman, 2005; Stevens, Fanning, Coch, Sanders, & Neville, 2008; Parbery-Clark et al., 2009a) that results from specific auditory training, brainstem processing of speech and music is likely influenced by top-down cognitive functions via shared corticofugal mechanisms. This suggests that ‘a third party’ could be involved in the subcortical domain-transfer effects we observe. In this light, corticofugal mechanisms certainly do adhere to Peretz's domain-general characterization. However, while corticofugal mechanisms may affect subcortical processes globally, they can also be highly specific or ‘modular’. For example, musical experience and short-term auditory training do not result in a stimulus-independent, generalized gain effect and, likewise, impairment does not necessarily manifest as a pervasive disruption in brainstem processing. What we find instead is that certain sounds or certain aspects of the sounds are impaired (Cunningham, Nicol, Zecker, & Kraus, 2000; King, Warrier, Hayes, & Kraus, 2002; Wible, Nicol, & Kraus, 2004; Wible et al., 2005; Banai et al., 2009; reviewed in Banai & Kraus, 2008) or enhanced (Wong et al., 2007; Song et al., 2008b; Lee et al., 2009; Strait et al., 2009; Parbery-Clark et al., 2009b), with the behavioural relevance and relative complexity or difficulty of the stimulus likely influencing how the sensory system responds (Figure 28.3).


Fig. 28.3 Selective enhancement of pitch-related brainstem function after training. After eight days of lexical-pitch training, native English speaking adults had more accurate subcortical pitch tracking for the most complex pitch contour (i.e. Mandarin Tone 3). A representative example of pre- and post-training pitch tracking is plotted, with the solid black line representing the stimulus pitch contour. Using the same stimulus set, Wong et al. (2007) found that musician and non-musician groups were also best differentiated by this complex pitch contour. Adapted from Judy H. Song, Erika Skoe, Patrick C.M. Wong, and Nina Kraus, 'Plasticity in the Adult Human Auditory Brainstem following Short-term Linguistic Training', Journal of Cognitive Neuroscience, 20:10 (Oct, 2008), pp. 1892–1902 © 2008 by the Massachusetts Institute of Technology.
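To make the notion of pitch-tracking accuracy in Fig. 28.3 more concrete, the following sketch estimates a pitch contour frame by frame from the peak of the short-time autocorrelation, then scores a noisy 'response' by its mean absolute deviation from the contour tracked in the clean 'stimulus'. This is a simplified illustration under stated assumptions, not a reproduction of the procedure in Song et al. (2008b) or Wong et al. (2007): the dipping contour, the noise level, the window settings and the function name `track_pitch` are all hypothetical.

```python
import numpy as np

def track_pitch(signal, fs, win_s=0.04, hop_s=0.01, fmin=80.0, fmax=400.0):
    """Estimate F0 (Hz) per frame from the autocorrelation peak (hypothetical helper)."""
    win = int(win_s * fs)
    hop = int(hop_s * fs)
    lag_min = int(fs / fmax)                    # shortest period considered
    lag_max = int(fs / fmin)                    # longest period considered
    f0 = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win]
        frame = frame - frame.mean()
        # Keep non-negative lags only; index 0 corresponds to zero lag.
        ac = np.correlate(frame, frame, mode="full")[win - 1:]
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
        f0.append(fs / lag)
    return np.array(f0)

# Synthetic dipping contour (assumed): 110 Hz falling to 90 Hz and rising back.
fs = 8000                                       # assumed sampling rate in Hz
n = 2000                                        # 250 ms of samples
t = np.arange(n) / fs
contour = 110.0 - 20.0 * np.sin(np.pi * t / t[-1])
phase = 2 * np.pi * np.cumsum(contour) / fs     # integrate frequency to get phase
stimulus = np.sin(phase)

rng = np.random.default_rng(1)
response = stimulus + 0.3 * rng.standard_normal(n)   # noisy stand-in for a response

stim_track = track_pitch(stimulus, fs)
resp_track = track_pitch(response, fs)
tracking_error = np.mean(np.abs(resp_track - stim_track))  # in Hz; lower is better

print(tracking_error)
```

With a measure of this kind, a reduction in tracking error from a pre-training to a post-training recording would correspond to more accurate pitch tracking in the sense used in the caption.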

Domain-transfer predicts that ‘speakers of tonal languages would be more musical than non-tonal language speakers … [and] the prevalence of tone-deafness in these cultures should be close to inexistent’

Although formal statistics currently do not exist, we do have first-hand evidence that tone-deafness is not completely absent among tonal-language speakers. We are currently evaluating brainstem function in a native Mandarin Chinese speaker with congenital amusia (female, 30 years old, classified by the Montreal Battery for the Evaluation of Amusia (Peretz, Champod, & Hyde, 2003)). Foxton and colleagues (Foxton, Dean, Gee, Peretz, & Griffiths, 2004) point out that deficits in the ascending pathway may theoretically play a role in congenital amusia; however, this has never been formally investigated. In addition to assessing subcortical pitch encoding in this subject, we seek a better understanding of the complex interaction between simultaneous pitch impairment and pitch expertise, and the crosstalk between different stages of auditory processing. Although we reserve a complete analysis for a different venue, our preliminary results do not provide evidence for pervasive abnormalities in subcortical pitch encoding, which supports the prevailing view that amusia has cortical origins (Hyde, Zatorre, Griffiths, Lerch, & Peretz, 2006; Braun et al., 2008). That said, our data also do not entirely rule out subcortical anomalies. For example, just as there is a spectrum of language impairment, there may be multiple subtypes of amusia, with only certain subtypes showing poor subcortical pitch-tracking (Russo et al., 2008), and/or amusia may have more subtle effects on brainstem function than we have been able to observe with our methods.

Now turning to the question of heightened musicality: despite behavioural evidence for a higher incidence of absolute pitch in Mandarin speakers (Deutsch, Henthorn, Marvin, & Xu, 2006), there has never been a formal evaluation of how absolute pitch affects subcortical pitch tracking. However, recent work from Bidelman and colleagues (Bidelman et al., 2011), comparing pitch tracking of music- and speech-based pitch tokens between Mandarin speakers and native English-speaking musicians and non-musicians, provides the first subcortical evidence for linguistic pitch expertise transferring to the musical domain. Although musicians and Mandarin speakers did not differ in their accuracy of pitch tracking for either speech or music, musicians did show stronger responses than Mandarin speakers did, but only for components of the Mandarin contour corresponding to notes along the musical scale. This suggests a complex interplay of domain-transfer and domain-dependent interactions in the musicians. In addition to elucidating transfer effects from speech to music, a better understanding of these issues would also aid in interpreting amusic pitch tracking.

This brings us to Peretz's point: if domain-transfer effects exist, this should ‘eradicate’ musical pitch processing anomalies in cultures that use pitch lexically. If we tackle this issue from a corticofugal perspective, we first need a better handle on whether congenital amusia is associated with subcortical deficits. If it is, then the next question is whether the malformation of cortical pitch centres feeds backward to alter subcortical pitch tracking and/or whether music-mediated corticofugal processes are absent or not fully formed in amusics. However, if a large-scale study of amusia reveals no subcortical involvement, then we will need to address whether linguistic pitch expertise is sufficient to overcome or correct any overt subcortical pitch deficits (as might be inferred from Peretz's statement). At this time, there are, of course, still many unanswered questions; nevertheless, we do not view the existence of amusia in tonal language cultures as evidence for the nullification of transfer effects between speech and music.

One anatomical region can contain more than one distinct domain-specific network

While both Patel and Peretz maintain that it is possible for one anatomical substrate to contain functionally separate domain-specific networks, there is no evidence that the processing of speech and the processing of music are anatomically segregated within the brainstem (Patel, 2011). We have illustrated above how corticofugal influences are domain-general yet also work to shape aspects of sound processing in a modular fashion. This dual function of corticofugal modulation is the likely vehicle for brainstem structures having both domain-specific and domain-general functions. This suggests that brainstem nuclei could be inherently domain-general but become modularized with experience through the interaction of domain-specific and domain-general corticofugal networks. That is to say, some corticofugal pathways might reinforce modularity while others reinforce shared resources. Importantly, this duality could also account for the different degrees of modularity and/or resource sharing in different impaired and expert populations. These concepts are consistent with the idea that shared mechanisms modularize with experience (Saffran & Thiessen, 2006).

Summary and concluding remarks

Human subcortical auditory function provides a new conceptual framework for studying modularity. There is clear neural evidence for both shared and domain-specific subcortical processes, and our goal here was to try to reconcile this apparent dichotomy. We have argued that corticofugal pathways can account for both sides of the coin and have outlined how cortical-subcortical interactions can give rise to both selective neurophysiological dissociations and enhancements, yet also result in overlap in activation between speech and music, and transfer effects. The corticofugal system is quite extensive and the prevailing evidence indicates that the number of efferent fibres far exceeds the number of afferent ones. This intimates a complex architecture with, as we have pointed out here, a multitude of functions and outcomes.

The symbiotic relationship between cortical and subcortical structures leads to what is best described as an ‘interaction between shared acoustic features and learned representations’ (a phrase coined by Zatorre & Gandour, 2008) taking place at subcortical structures. From an acoustic standpoint, speech and music are highly complex signals with many shared features (e.g. pitch, timbre, and timing). Because the brainstem is the common ascending pathway for both faculties, and furthermore, because the ABR faithfully represents stimulus features, auditory experiences (i.e. learned representations) in one domain can shape the subcortical representation of both domains. The connection between acoustics and learning is also evident in examples of selective subcortical enhancement and impairment. For example, in language-based learning disorders, the selective (non-pervasive) pattern of subcortical impairment (i.e. timing and speech formants but not pitch) is very much in accord with the behavioural manifestations of the disorder (Banai et al., 2009). Additionally, for speakers of tonal languages, subcortical enhancements occur for learned (i.e. language-specific) pitch contours (cf. Bidelman et al., 2011). Likewise, in musicians we find that subcortical refinement is linked to specific aspects of music learning (Bidelman & Krishnan, 2009; Lee et al., 2009). Taken together, these findings suggest that music learning does not produce a simple gain effect.

The interaction between acoustics and learning likely encompasses global mechanisms (attention, memory, etc.) as well as de novo processes occurring within brainstem structures (e.g. statistical learning) (Dean, Robinson, Harper, & McAlpine, 2008). Although our discussion has primarily focused on top-down processes, the role of bottom-up and de novo processes cannot be dismissed (Krishnan & Gandour, 2009). For instance, the segregation of pitch and timing early in the processing stream (Kraus & Nicol, 2005), coupled with the fact that speech perception requires more temporal precision while music requires more spectral precision (Shannon, 2005), suggests that the brainstem could be a precursor for cortical modularization of speech and music. However, the very existence of the corticofugal pathway indicates that this is not an ‘either/or’ situation; a more realistic view is that both permanent and temporary top-down processes interact with bottom-up and local processes to shape subcortical function over a lifetime (Bidelman et al., 2011; Krishnan & Gandour, 2009; Kraus & Chandrasekaran, 2010). This feedback loop could then reinforce the putative domain-specificity at both cortical and subcortical levels.

Although it is beyond the scope of this commentary to provide a fully fleshed-out model, it could serve as a stepping-stone for a unified account of how brainstem function fits into the debate over modularity and resource sharing. With the help of well-controlled longitudinal (e.g. Moreno et al., 2009) and population-based studies, it may be possible to tease apart: (1) whether subcortical structures are intrinsically domain general and inherit aspects of domain specificity from higher-up structures, (2) which aspects of subcortical pitch processing might be unique to music, (3) whether speech-music interference effects take place subcortically, (4) how music- and speech-learning are linked at a subcortical level, (5) how musical training can possibly strengthen and/or ameliorate impaired subcortical representation of speech, and (6) whether amusia is strictly a cortically-based musical disorder or whether it has a subcortical locus and/or non-musical manifestations (Hattiangadi et al., 2005; Douglas & Bilkey, 2007; Nguyen, Tillmann, Gosselin, & Peretz, 2009).


This work is supported by the National Science Foundation (NSF 0544846), the National Institutes of Health (R01DC01510) and The Hugh Knowles Center for Clinical and Basic Science in Hearing and Its Disorders at Northwestern University, Evanston IL USA.

References

Akhoun, I., Gallego, S., Moulin, A., Menard, M., Veuillet, E., Berger-Vachon, C., et al. (2008). The temporal relationship between speech auditory brainstem responses and the acoustic pattern of the phoneme /ba/ in normal-hearing adults. Clinical Neurophysiology, 119, 922–33.

Anvari, S. H., Trainor, L. J., Woodside, J., & Levy, B. A. (2002). Relations among musical skills, phonological processing, and early reading ability in preschool children. Journal of Experimental Child Psychology, 83, 111–30.

Banai, K., Hornickel, J., Skoe, E., Nicol, T., Zecker, S., & Kraus, N. (2009). Reading and subcortical auditory function. Cerebral Cortex, 19(11), 2699–707.

Banai, K., & Kraus, N. (2008). The dynamic brainstem: implications for APD. San Diego, CA: Plural Publishing Inc.

Bidelman, G. M., Gandour, J. T., & Krishnan, A. (2011). Cross-domain effects of music and language experience on the representation of pitch in the human auditory brainstem. Journal of Cognitive Neuroscience, 23(2), 425–34.

Bidelman, G. M., & Krishnan, A. (2009). Neural correlates of consonance, dissonance, and the hierarchy of musical pitch in the human brainstem. Journal of Neuroscience, 29, 13165–71.

Bidelman, G. M., & Krishnan, A. (2010). Effects of reverberation on brainstem representation of speech in musicians and non-musicians. Brain Research, 1355, 112–25.

Braun, A., McArdle, J., Jones, J., Nechaev, V., Zalewski, C., Brewer, C., et al. (2008). Tune deafness: processing melodic errors outside of conscious awareness as reflected by components of the auditory ERP. PLoS ONE, 3, e2349.

Carcagno, S., & Plack, C. J. (2011). Pitch discrimination learning: specificity for pitch and harmonic resolvability, and electrophysiological correlates. Journal of the Association for Research in Otolaryngology, 12(4), 503–17.

Chandrasekaran, B., Hornickel, J., Skoe, E., Nicol, T., & Kraus, N. (2009). Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: implications for developmental dyslexia. Neuron, 64, 311–19.

Chandrasekaran, B., & Kraus, N. (2009). The scalp-recorded brainstem response to speech: neural origins and plasticity. Psychophysiology, 47(2), 236–46.

Chartrand, J. P., Peretz, I., & Belin, P. (2008). Auditory recognition expertise and domain specificity. Brain Research, 1220, 191–98.

Corriveau, K., & Goswami, U. (2009). Rhythmic motor entrainment in children with speech and language impairments: tapping to the beat. Cortex, 45, 119–30.

Cunningham, J., Nicol, T., Zecker, S., & Kraus, N. (2000). Speech-evoked neurophysiologic responses in children with learning problems: development and behavioral correlates of perception. Ear and Hearing, 21, 554–68.

de Boer, J., & Thornton, A. R. (2008). Neural correlates of perceptual learning in the auditory brainstem: efferent activity predicts and reflects improvement at a speech-in-noise discrimination task. Journal of Neuroscience, 28, 4929–37.

Dean, I., Robinson, B. L., Harper, N. S., & McAlpine, D. (2008). Rapid neural adaptation to sound level statistics. Journal of Neuroscience, 28, 6430–38.

Deutsch, D., Henthorn, T., Marvin, E., & Xu, H. (2006). Absolute pitch among American and Chinese conservatory students: prevalence differences, and evidence for a speech-related critical period. Journal of the Acoustical Society of America, 119, 719–22.

Douglas, K. M., & Bilkey, D. K. (2007). Amusia is associated with deficits in spatial processing. Nature Neuroscience, 10, 915–21.

(p.280) Foxton, J. M., Dean, J. L., Gee, R., Peretz, I., & Griffiths, T. D. (2004). Characterization of deficits in pitch perception underlying ‘tone deafness’. Brain, 127, 801–10.

Galbraith, G. C., Arbagey, P. W., Branski, R., Comerci, N., & Rector, P. M. (1995). Intelligible speech encoded in the human brain stem frequency-following response. Neuroreport, 6, 2363–67.

Galbraith, G. C., Bhuta, S. M., Choate, A. K., Kitahara, J. M., & Mullen, T. A., Jr. (1998). Brain stem frequency-following response to dichotic vowels during attention. Neuroreport, 9, 1889–93.

Galbraith, G. C., & Doan, B. Q. (1995). Brainstem frequency-following and behavioral responses during selective attention to pure tone and missing fundamental stimuli. International Journal of Psychophysiology, 19, 203–14.

Galbraith, G. C., Olfman, D. M., & Huffman, T. M. (2003). Selective attention affects human brain stem frequency-following response. Neuroreport, 14, 735–38.

Galbraith, G. C. (2008). Deficient brainstem encoding in autism. Clinical Neurophysiology, 119, 1697–1700.

Hattiangadi, N., Pillion, J. P., Slomine, B., Christensen, J., Trovato, M. K., & Speedie, L. J. (2005). Characteristics of auditory agnosia in a child with severe traumatic brain injury: a case report. Brain and Language, 92, 12–25.

Hornickel, J. M., Skoe, E., & Kraus, N. (2009). Subcortical lateralization of speech encoding. Audiology & Neuro-otology, 14, 198–207.

Hornickel, J., Skoe, E., Nicol, T., Zecker, S., & Kraus, N. (2009b). Subcortical differentiation of stop consonants relates to reading and speech-in-noise perception. Proceedings of the National Academy of Sciences of the USA, 106, 13022–27.

Hyde, K. L., Zatorre, R. J., Griffiths, T. D., Lerch, J. P., & Peretz, I. (2006). Morphometry of the amusic brain: a two-site study. Brain, 129, 2562–70.

King, C., Nicol, T., McGee, T., & Kraus, N. (1999). Thalamic asymmetry is related to acoustic signal complexity. Neuroscience Letters, 267, 89–92.

King, C., Warrier, C. M., Hayes, E., & Kraus, N. (2002). Deficits in auditory brainstem pathway encoding of speech sounds in children with learning problems. Neuroscience Letters, 319, 111–15.

Kral, A., & Eggermont, J. J. (2007). What's to lose and what's to learn: development under auditory deprivation, cochlear implants and limits of cortical plasticity. Brain Research Reviews, 56, 259–69.

Kraus, N., & Nicol, T. (2005). Brainstem origins for cortical ‘what’ and ‘where’ pathways in the auditory system. Trends in Neurosciences, 28, 176–81.

Kraus, N., & Chandrasekaran, B. (2010). Music training for the development of auditory skills. Nature Reviews Neuroscience, 11, 599–605.

Krishnan, A., & Gandour, J. T. (2009). The role of the auditory brainstem in processing linguistically-relevant pitch patterns. Brain and Language, 110, 135–48.

Krishnan, A., Xu, Y., Gandour, J., & Cariani, P. (2005). Encoding of pitch in the human brainstem is sensitive to language experience. Cognitive Brain Research, 25, 161–68.

Lee, K. M., Skoe, E., Kraus, N., & Ashley, R. (2009). Selective subcortical enhancement of musical intervals in musicians. Journal of Neuroscience, 29, 5832–40.

Levine, R. A., Liederman, J., & Riley, P. (1988). The brainstem auditory evoked potential asymmetry is replicable and reliable. Neuropsychologia, 26, 603–14.

Luo, F., Wang, Q., Kashani, A., & Yan, J. (2008). Corticofugal modulation of initial sound processing in the brain. Journal of Neuroscience, 28(45), 11615–21.

Malmierca, M. S., Cristaudo, S., Perez-Gonzalez, D., & Covey, E. (2009). Stimulus-specific adaptation in the inferior colliculus of the anesthetized rat. Journal of Neuroscience, 29, 5483–93.

(p.281) Micheyl, C., Khalfa, S., Perrot, X., & Collet, L. (1997). Difference in cochlear efferent activity between musicians and non-musicians. Neuroreport, 8, 1047–50.

Moore, D. R., Rosenberg, J. F., & Coleman, J. S. (2005). Discrimination training of phonemic contrasts enhances phonological processing in mainstream school children. Brain and Language, 94, 72–85.

Moreno, S., Marques, C., Santos, A., Santos, M., Castro, S. L., & Besson, M. (2009). Musical training influences linguistic abilities in 8-year-old children: More evidence for brain plasticity. Cerebral Cortex, 19(3), 712–23.

Musacchia, G., Sams, M., Nicol, T., & Kraus, N. (2006). Seeing speech affects acoustic information processing in the human brainstem. Experimental Brain Research, 168, 1–10.

Musacchia, G., Sams, M., Skoe, E., & Kraus, N. (2007). Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proceedings of the National Academy of Sciences of the USA, 104, 15894–98.

Nguyen, S., Tillmann, B., Gosselin, N., & Peretz, I. (2009). Tonal language processing in congenital amusia. Annals of the New York Academy of Sciences, 1169, 490–93.

Overy, K., Nicolson, R. I., Fawcett, A. J., & Clarke, E. F. (2003). Dyslexia and music: measuring musical timing skills. Dyslexia, 9, 18–36.

Parbery-Clark, A., Skoe, E., Lam, C., & Kraus, N. (2009). Musician enhancement for speech in noise. Ear and Hearing, 30(6), 653–61.

Parbery-Clark, A., Skoe, E., & Kraus, N. (2009b). Musical experience limits the degradative effects of background noise on the neural processing of sound. Journal of Neuroscience, 29, 14100–07.

Patel, A. (2011). Why would music training benefit the neural encoding of speech? The OPERA hypothesis. Frontiers in Psychology, 2, 1–14.

Peretz, I., Champod, A. S., & Hyde, K. (2003). Varieties of musical disorders. The Montreal Battery of Evaluation of Amusia. Annals of the New York Academy of Sciences, 999, 58–75.

Perrot, X., Micheyl, C., Khalfa, S., & Collet, L. (1999). Stronger bilateral efferent influences on cochlear biomechanical activity in musicians than in non-musicians. Neuroscience Letters, 262, 167–70.

Perrot, X., Ryvlin, P., Isnard, J., Guenot, M., Catenoix, H., Fischer, C., et al. (2006). Evidence for corticofugal modulation of peripheral auditory activity in humans. Cerebral Cortex, 16, 941–48.

Rinne, T., Balk, M. H., Koistinen, S., Autti, T., Alho, K., & Sams, M. (2008). Auditory selective attention modulates activation of human inferior colliculus. Journal of Neurophysiology, 100, 3323–27.

Russo, N. M., Nicol, T. G., Zecker, S. G., Hayes, E. A., & Kraus, N. (2005). Auditory training improves neural timing in the human brainstem. Behavioural Brain Research, 156, 95–103.

Russo, N. M., Skoe, E., Trommer, B., Nicol, T., Zecker, S., Bradlow, A., et al. (2008). Deficient brainstem encoding of pitch in children with autism spectrum disorders. Clinical Neurophysiology, 119, 1720–31.

Saffran, J. R., & Thiessen, E. D. (2006). Domain-general learning capacities. In E. Hoff & M. Shatz (Eds.), Handbook of language development (pp. 68–86). Cambridge: Blackwell.

Shannon, R. V. (2005). Speech and music have different requirements for spectral resolution. International Review of Neurobiology, 70, 121–34.

Sininger, Y. S., & Cone-Wesson, B. (2004). Asymmetric cochlear processing mimics hemispheric specialization. Science, 305, 1581.

Sininger, Y. S., & Cone-Wesson, B. (2006). Lateral asymmetry in the ABR of neonates: evidence and mechanisms. Hearing Research, 212, 203–11.

(p.282) Skoe, E., & Kraus, N. (2010). Brainstem responses to complex sounds: a tutorial. Ear and Hearing, 31(3), 302–24.

Song, J. H., Banai, K., & Kraus, N. (2008a). Brainstem timing deficits in children with learning impairment may result from corticofugal origins. Audiology & Neuro-otology, 13, 335–44.

Song, J. H., Skoe, E., Wong, P. C., & Kraus, N. (2008b). Plasticity in the adult human auditory brainstem following short-term linguistic training. Journal of Cognitive Neuroscience, 20, 1892–902.

Song, J. H., Skoe, E., Banai, K., & Kraus, N. (2011). Training to improve hearing speech in noise: biological mechanisms. Cerebral Cortex, [Epub].

Stegemöller, E. L., Skoe, E., Nicol, T., Warrier, C. M., & Kraus, N. (2008). Musical training and vocal production of speech and song. Music Perception, 25, 419–28.

Stevens, C., Fanning, J., Coch, D., Sanders, L., & Neville, H. (2008). Neural mechanisms of selective auditory attention are enhanced by computerized training: electrophysiological evidence from language-impaired and typically developing children. Brain Research, 1205, 55–69.

Strait, D. L., Kraus, N., Skoe, E., & Ashley, R. (2009). Musical experience and neural efficiency: effects of training on subcortical processing of vocal expressions of emotion. European Journal of Neuroscience, 29, 661–68.

Strait, D. L., & Kraus, N. (in press). Playing music for a smarter ear: cognitive, perceptual and neurobiological evidence. Music Perception.

Suga, N., Xiao, Z., Ma, X., & Ji, W. (2002). Plasticity and corticofugal modulation for hearing in adult animals. Neuron, 36, 9–18.

Swaminathan, J., Krishnan, A., & Gandour, J. T. (2008). Pitch encoding in speech and nonspeech contexts in the human auditory brainstem. Neuroreport, 19, 1163–67.

Tallal, P., & Gaab, N. (2006). Dynamic auditory processing, musical experience and language development. Trends in Neurosciences, 29, 382–90.

Tallal, P., Miller, S. L., Bedi, G., Byma, G., Wang, X., Nagarajan, S. S., et al. (1996). Language comprehension in language-learning impaired children improved with acoustically modified speech. Science, 271, 81–84.

Wible, B., Nicol, T., & Kraus, N. (2004). Atypical brainstem representation of onset and formant structure of speech sounds in children with language-based learning problems. Biological Psychology, 67, 299–317.

Wible, B., Nicol, T., & Kraus, N. (2005). Correlation between brainstem and cortical auditory processes in normal and language-impaired children. Brain, 128, 417–23.

Winer, J. A. (2006). Decoding the auditory corticofugal systems. Hearing Research, 212, 1–8.

Wong, P. C., Skoe, E., Russo, N. M., Dees, T., & Kraus, N. (2007). Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neuroscience, 10, 420–22.

Xu, Y., Krishnan, A., & Gandour, J. T. (2006). Specificity of experience-dependent pitch representation in the brainstem. Neuroreport, 17, 1601–05.

Zatorre, R. J., & Gandour, J. T. (2008). Neural specializations for speech and pitch: moving beyond the dichotomies. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 363, 1087–104.


(1) To achieve our objectives, many details and key methodological concepts relating to our work will need to be glossed over. We invite the reader to visit our laboratory's website for more information:

(2) For an elegant overview of the auditory brainstem and its changing role in the study of human electrophysiology, see Galbraith, 2008.

(3) For a discussion of auditory experts, see Chartrand, Peretz, & Belin, 2008.