Word recognition and sound localization as they relate to spatial hearing
Tobias Neher, PhD, and Thomas Behrens, MSc, are research engineers at Eriksholm Research Center, Snekkersten, Denmark; Douglas L. Beck, AuD, is the Director of Professional Relations for Oticon Inc, Somerset, NJ.
For scientists, researchers, and professionals helping patients and subjects with hearing loss, it seems reasonable to think about audition in terms of the things we measure in the clinic. These can include otoacoustic emissions, tympanograms, reflexes, auditory brainstem responses, air and bone conduction pure-tone thresholds, speech awareness and speech reception thresholds, word recognition scores, and more.
While all of these are useful and meaningful metrics, we unfortunately tend to neglect the acoustic aspects of sound that carry spatial information. Of course, we don’t neglect spatial aspects of sound maliciously; we simply have few clinically useful tools to measure spatial perception and the response to spatial cues. The few tools available at this time are essentially lab-based and not quite “ready for prime time.”1
Nonetheless, beyond its obvious importance with regard to sound localization, spatial hearing plays a crucial role in word recognition, too. This article will review important aspects of spatial hearing relating to sound localization and word recognition.
How Do We Hear in 3D?
This article discusses interaural time differences (ITDs), interaural level differences (ILDs), and spectral peaks and notches, and how these cues influence speech understanding in difficult listening environments. Advanced amplification is now being developed that better preserves acoustically initiated spatial hearing cues and delivers them through hearing aids.
Spatial information is conveyed by subtle acoustic cues that can indicate the origin of sound with regard to three-dimensional space. Therefore, spatial perception contributes to a listener’s ability to navigate her physical surroundings. In addition, in situations where there are multiple competing signals, such as at a cocktail party, spatial information effectively allows people to “zoom in” on a particular signal of interest. This has been demonstrated in a multitude of studies.
Kidd et al2 presented a target speech signal from one of three loudspeakers placed horizontally in front of their subjects. From the other two loudspeakers, very similar competing speech signals were presented. Under these challenging conditions, knowing the spatial location of the target signal strongly aided the subjects’ ability to recall it.
Schneider et al3 investigated the importance of spatially separating a speech target from a speech masker with respect to spatially based release from masking. In accordance with Kidd et al,2 they found that a priori knowledge of the target speaker’s spatial location contributed to the perceptual attenuation of the masker signal.
Cameron et al4 described the Listening in Spatial Noise (LISN) test as a promising tool for the assessment of auditory processing disorders (APD) in children. They concluded, “… of those children with APD, there may be a high proportion who have deficits in the binaural processing mechanisms that normally use the spatial distribution of sources to suppress unwanted signals.” In other words, they argued that normally perceived spatial cues help suppress secondary signals such as background noise.
Taken together, this research shows that spatial information and the resulting spatial hearing are of paramount importance not only for sound localization, but also for speech recognition in challenging acoustic environments with multiple competing signals.
Spatial Hearing Cues
There are three primary, acoustically initiated spatial hearing cues5:
- Interaural time differences (ITDs),
- Interaural level differences (ILDs), and
- Spectral peaks and notches.
These spatial hearing cues differ with regard to the frequency region in which they are effective (eg, above or below 1500 Hz) and which spatial dimensions (left-right, front-back, or up-down) they can provide information about.
Interaural Time Differences. Sounds originating directly in front of (0° azimuth) or directly behind (180° azimuth) a given listener have a theoretical ITD of 0 ms. By contrast, sounds originating to one side of the listener give rise to significant ITDs, because they arrive earlier at the ear nearest the sound source. For signals below about 1500 Hz, timing differences between ears provide effective left-right spatial information.
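To give a feel for the size of these timing differences, the relationship between source azimuth and ITD can be sketched with Woodworth's classic spherical-head approximation. This model is not taken from the studies cited in this article; the head radius (8.75 cm), the speed of sound (343 m/s), and the function name `itd_seconds` are assumptions chosen purely for illustration, and real heads deviate from a sphere.

```python
import math

HEAD_RADIUS_M = 0.0875        # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0    # assumed speed of sound in air at about 20 °C


def itd_seconds(azimuth_deg: float) -> float:
    """Approximate ITD magnitude for a distant source (Woodworth model).

    0 degrees is straight ahead, 90 degrees is directly to one side,
    and 180 degrees is directly behind the listener.
    """
    az = azimuth_deg % 360.0
    if az > 180.0:
        az = 360.0 - az       # left and right give the same magnitude
    if az > 90.0:
        az = 180.0 - az       # rear sources mirror frontal sources
    theta = math.radians(az)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


for azimuth in (0, 45, 90, 135, 180):
    print(f"{azimuth:3d} deg: {itd_seconds(azimuth) * 1e6:6.1f} microseconds")
```

Under these assumptions the ITD is zero at 0° and 180° and peaks at roughly 0.66 ms at 90°. The model also returns the same value for 45° and 135°, which illustrates why timing differences alone cannot tell the listener whether a sound is in front or behind.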
Interaural Level Differences. ILDs arise due to head shadow effects. Head shadowing attenuates the high-frequency content of a sound at the ear farther away from a lateral sound source, which gives rise to effective left-right spatial information above about 1500 Hz. Similar to ITDs, sounds that originate directly in front of (or directly behind) the listener have a theoretical ILD of 0 dB. However, sounds originating on the listener’s side can give rise to level differences between ears of 20 dB or more for frequencies higher than about 6000 Hz.
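As a minimal sketch of how an ILD is quantified, the level difference can be computed as 20 times the base-10 logarithm of the ratio of the root-mean-square (RMS) amplitudes at the two ears. The signals below are synthetic and hypothetical: a 6000 Hz tone whose amplitude at the far ear is scaled by 0.1 to mimic a strong head shadow. The function names and the sampling rate are assumptions for illustration only.

```python
import math

SAMPLE_RATE_HZ = 48_000  # assumed sampling rate


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ild_db(near_ear, far_ear):
    """Interaural level difference in dB (positive when the near ear is louder)."""
    return 20.0 * math.log10(rms(near_ear) / rms(far_ear))


# Hypothetical 6000 Hz tone, 10 ms long, at the ear nearer the source.
near = [math.sin(2 * math.pi * 6000 * n / SAMPLE_RATE_HZ) for n in range(480)]
# Far-ear copy scaled to one tenth of the amplitude (illustrative head shadow).
far = [0.1 * s for s in near]

print(f"ILD: {ild_db(near, far):.1f} dB")
```

An amplitude ratio of 10 between the ears corresponds to the 20 dB figure mentioned above; identical signals at both ears give an ILD of 0 dB.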
Spectral Peaks and Notches. Spectral peaks and notches can provide information about front-back and up-down orientations. Spectral peaks and notches arise due to anatomic and acoustic properties of the human outer ear, which amplifies and attenuates frequencies higher than about 4000 Hz in a direction-dependent manner.
Listening in Complex Environments
The contributions of each of the three types of acoustically initiated cues to spatial hearing have been well documented across a multitude of studies. In general, previous studies have required listeners to localize a single sound.5 Nonetheless, as indicated above, the real and often underappreciated benefit of spatial hearing occurs in situations involving competing sounds, such as cocktail parties, where understanding what is being said is a formidable task for the listener.
Arguably, the most difficult listening conditions arise when both target and competing sounds are speech. It is in this situation that spatial hearing makes significant contributions to speech recognition.6
One perceptual mechanism that may play a role in such situations is the “better-ear effect.” Better-ear effects are related to head shadow and can lead to effective signal-to-noise ratio (SNR) improvements of up to 8 dB.7 For example, with a target sound on one side of the listener and an interfering sound on the other side, head shadowing attenuates the high-frequency content of the interfering sound at the ear nearest the target sound. This then allows the listener to better detect and recognize the target sound.
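The arithmetic behind the better-ear effect can be sketched as follows. All levels are hypothetical broadband values chosen only to illustrate the calculation; they are not measurements from the cited study, and real head shadow varies strongly with frequency. The names `snr_db`, `levels`, and `better_ear` are likewise illustrative.

```python
def snr_db(target_db, masker_db):
    """SNR in dB is the target level minus the masker level."""
    return target_db - masker_db


# Hypothetical levels in dB SPL at each ear.
# The target talker is on the right, the interfering talker on the left.
levels = {
    "right": {"target": 65.0, "masker": 59.0},  # masker shadowed by the head
    "left": {"target": 61.0, "masker": 65.0},   # target shadowed by the head
}

snr_per_ear = {ear: snr_db(v["target"], v["masker"]) for ear, v in levels.items()}
better_ear = max(snr_per_ear, key=snr_per_ear.get)

for ear, snr in snr_per_ear.items():
    print(f"{ear} ear SNR: {snr:+.1f} dB")
print(f"better ear: {better_ear}")
```

With these assumed values, the ear on the target side enjoys an SNR of +6 dB while the opposite ear sits at -4 dB. If both talkers came from the same direction at 65 dB SPL, each ear would receive an SNR of 0 dB, so the listener gains 6 dB simply by relying on the better ear.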
Another perceptual mechanism that may play an even larger role is “spatial focus of attention.”2 Spatial focus of attention refers to a listener’s ability to perceptually separate a target from competing sounds based on spatial hearing cues. In this situation, the target can be enhanced by selectively (cognitively) attending to it. In other words, cognitive resources (ie, “top down” as discussed by Schum and Beck8) are being drawn upon to make the target sound more intelligible. This effect can be thought of as a “mental spotlight” emphasizing the signal within the spotlight, while simultaneously de-emphasizing everything beyond the spotlight beam.
Research at Eriksholm (Denmark) has shown that, by selectively attending to a speech target that is spatially separated along the left-right dimension from two speech maskers, normal-hearing subjects gain about 15 dB effective SNR compared to a condition without left-right separation of the three speech signals and without the possibility for spatially focusing attention.9
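To put a decibel figure of this size in perspective, a level difference in dB can be converted to a power ratio with 10 raised to the power of dB/10. The helper name `db_to_power_ratio` is an assumption for illustration; the conversion itself is the standard definition of the decibel for power quantities.

```python
def db_to_power_ratio(db):
    """Convert a level difference in dB to the equivalent power ratio."""
    return 10.0 ** (db / 10.0)


for gain_db in (8.0, 15.0):
    print(f"{gain_db:4.1f} dB -> power ratio of about {db_to_power_ratio(gain_db):.1f}")
```

A 15 dB change in effective SNR therefore corresponds to a power ratio of about 32, compared with a ratio of about 6 for an 8 dB better-ear advantage.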
Effects of Hearing Loss and Amplification
The spatial hearing abilities of hearing-impaired listeners are poorer than those of normal-hearing listeners. With respect to sound localization, high-frequency hearing loss seems to primarily degrade front-back and up-down localization ability, whereas low-frequency hearing loss seems to primarily degrade left-right localization ability.10 In general, as hearing loss increases, localization performance decreases.
Since amplification improves audibility, it can sometimes improve localization ability.11 Nevertheless, hearing aids cannot compensate for distortions and abnormalities such as decreased frequency or temporal resolution that typically accompany a sensorineural hearing loss (SNHL). These distortions and abnormalities may prevent access to spatial hearing cues,12 which then results in worse spatial hearing abilities.
As of 2008, few studies have examined the spatial hearing abilities of hearing aid users in situations with multiple competing talkers. Research conducted at Eriksholm has shown that, while some hearing aid users can come close to normal-hearing performance on the spatial focus of attention task described above, most cannot.13 Indeed, the ability to utilize spatial focus of attention seems to be limited by both hearing loss and aging.14 Our ongoing research is therefore concerned with investigating how different hearing aid features influence spatial hearing performance and how these features interact with hearing loss, age, and other factors.
Another goal of our research is to develop test procedures that can better mimic realistic listening situations. This is motivated by the finding that our current lab-based measurements—which generally suggest hearing aid users obtain only limited spatial hearing benefit—do not seem to tell the entire story. For example, Hansen15 recently reported results from 58 elderly, experienced bilateral hearing aid users with mild-to-moderate SNHL. Each subject compared their own advanced technology hearing aids to the Oticon Epoq, which was designed to better preserve spatial hearing cues.1 Measurements of speech intelligibility, sound quality, and spatial perception were made using the Dantale II test and the Speech, Spatial and Qualities (SSQ) questionnaire. Hansen reported that Epoq provided not only more natural sound quality and improved intelligibility in noise, but also better spatial hearing in difficult acoustic environments. In particular, she noted Epoq facilitated “spatial perception and segregation of sounds.”
Thus, it may be that test methods need to be more nuanced before the potential real-world advantages of advanced hearing aid technology can be demonstrated in the lab.
In this brief article, we’ve discussed a multitude of concepts and findings surrounding the important topic of spatial hearing. In addition to the obvious benefits of better identifying the origin of sounds in space, we’ve witnessed early data indicating the benefits of enhanced spatial perception through amplification15 and explored the possibility for further speech recognition enhancement based on spatial information.
Although some of the concepts presented are new and some are arguably difficult to grasp, we look forward to advanced amplification technologies that better preserve acoustically initiated spatial hearing cues and deliver them through hearing aids.
Ideal amplification will maximize access to, and facilitate accurate representation of, naturally occurring auditory cues, supplying the brain with the acoustic information needed to manage complex listening situations. This is consistent with maximal use of our own natural resources8 and with maximal preservation and delivery of auditory cues.
This paper is an expansion of a previous article in the 2008 Oticon Clinical Update. Clinical note: In addition to the Epoq XW binaural processing (spatial sound) system, Oticon introduced this month the Dual Connect XW, with binaural processing. Both products are streamer compatible.
References
1. Behrens T. Spatial hearing in complex sound environments: clinical data. Hearing Review. 2008;15(3):94-102.
2. Kidd G Jr, Arbogast TL, Mason CR, Gallun FJ. The advantage of knowing where to listen. J Acoust Soc Am. 2005;118:3804-3815.
3. Schneider BA, Li L, Daneman M. How competing speech interferes with speech comprehension in everyday listening situations. J Am Acad Audiol. 2007;18:559-572.
4. Cameron S, Dillon H, Newall P. The listening in spatialized noise test: an auditory processing disorder study. J Am Acad Audiol. 2006;17:306-320.
5. Blauert J. Spatial Hearing—The Psychophysics of Human Sound Localization. Cambridge, Mass: The MIT Press; 1983.
6. Bronkhorst AW. The cocktail party phenomenon: a review of research on speech intelligibility in multiple-talker conditions. Acta Acust Acust. 2000;86:117-128.
7. Bronkhorst AW, Plomp R. The effect of head-induced interaural time and level differences on speech intelligibility in noise. J Acoust Soc Am. 1988;83:1508-1516.
8. Schum DJ, Beck DL. Negative synergy—hearing loss and aging, 2008. www.audiologyonline.com/articles/article_detail.asp?article_id=2045.
9. Behrens T, Neher T, Johannesson RB. Evaluation of speech corpus for assessment of spatial release from masking. In: Dau T, et al, eds. Auditory Signal Processing in Hearing-Impaired Listeners. Copenhagen, Denmark: Centertryk A/S; 2008:449-457.
10. Noble W, Byrne D, Lepage B. Effects on sound localization of configuration and type of hearing impairment. J Acoust Soc Am. 1994;95:992-1005.
11. Byrne D, Noble W. Optimizing sound localization with hearing aids. Trends Amplif. 1998;2:51-73.
12. Moore BCJ. Cochlear Hearing Loss. London: Whurr Publishers Ltd; 1998.
13. Neher T, Behrens T, Kragelund L, Petersen AS. Spatial unmasking in aided hearing-impaired listeners and the need for training. In: Dau T, et al, eds. Auditory Signal Processing in Hearing-Impaired Listeners. Copenhagen, Denmark: Centertryk A/S; 2008:515-522.
14. Neher T, Behrens T. Relations between hearing loss and cognitive abilities and spatial release from speech-on-speech masking in aided hearing-impaired listeners. Presented at: International Hearing Aid Research Conference; August 13-17, 2008; Lake Tahoe, Calif.
15. Hansen LB. Epoq study measures user benefit. Hear Jour. 2008;61:47-49.
Correspondence can be addressed to HR or to Tobias Neher.
Citation for this article:
Neher T, Behrens T, Beck DL. Spatial hearing and understanding speech in complex environments. The Hearing Review. 2008;15(12):22-25.