Applying basic science regarding temporal processing to clinical services involving hearing instrument dispensing.
Editor’s Note: In the May and June editions of HR (accessible at www.hearingreview.com), Rawool describes the effects of hearing loss on the temporal processing of auditory stimuli. In this article, amplification strategies for addressing temporal processing deficits are reviewed.
Addressing Deficits in Temporal Resolution
Gap detection. Individuals with hearing loss can detect gaps when sounds that do not have many temporal fluctuations are presented at higher levels—as when presented through amplification. In addition, use of fast compression can improve the ability of listeners with hearing loss to detect gaps in sounds with slowly fluctuating envelopes.1
Temporal modulation detection. High frequency hearing loss can reduce the sensitivity to high rates of temporal modulation in an auditory signal, probably because of inaudible high frequencies. Audibility in the high frequencies is now a realistic goal due to the availability of hearing aids with wider frequency spectrum and active feedback cancellation techniques.
Many consonants have rapid intensity changes that must be perceived correctly for accurate recognition of those consonants. Such changes can be enhanced by automatically increasing gain during rapid changes in the input signal and decreasing the gain when the input signal is relatively constant (eg, vowels).2
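The idea of raising gain during rapid level changes and lowering it during steady portions can be illustrated with a minimal sketch. It is not the algorithm from the cited work: the function name, the frame-based level input, and the values of `boost_db`, `steady_cut_db`, and `rate_threshold_db` are all assumptions chosen for illustration.

```python
def transient_gain(levels_db, boost_db=6.0, steady_cut_db=2.0, rate_threshold_db=3.0):
    """Return one gain value (dB) per frame of short-term level (dB).

    Frames whose level differs from the previous frame by at least
    rate_threshold_db are treated as rapid changes and receive boost_db;
    all other frames are treated as steady and receive -steady_cut_db.
    All three parameter values are illustrative assumptions.
    """
    # The first frame has no predecessor, so it is treated as steady.
    gains = [-steady_cut_db]
    for previous, current in zip(levels_db, levels_db[1:]):
        change = abs(current - previous)
        if change >= rate_threshold_db:
            gains.append(boost_db)
        else:
            gains.append(-steady_cut_db)
    return gains


# A steady vowel-like stretch, a sharp rise, another steady stretch, a sharp fall.
frame_levels = [60, 60, 61, 70, 71, 71, 55, 55]
print(transient_gain(frame_levels))
# → [-2.0, -2.0, -2.0, 6.0, -2.0, -2.0, 6.0, -2.0]
```

In this sketch only the two frames with abrupt level changes are boosted. A real device would work on a smoothed envelope in each frequency band rather than on raw frame levels.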
Duration discrimination. Individuals with hearing loss benefit from clear speech; a consistent characteristic of clear speech is increased pauses between words and increased duration of sounds.3 This finding suggests that speech intelligibility may improve with increased duration of sounds.
Montgomery and Edge4 increased the duration of consonants by 30 ms while shortening the vowels to maintain the original overall duration. The increased consonant duration did not provide any benefit to listeners with moderate sensorineural hearing loss at 65 dB SPL; however, it did provide a small (5%) but significant improvement in intelligibility at 95 dB SPL.
The differentiation between unvoiced and voiced consonants partially depends on the duration of the vowel preceding the consonant; the duration is longer for voiced consonants than that for unvoiced consonants. This cue is particularly important for some individuals with hearing loss.5
Perception of consonant voicing tends to improve if the vowel-duration cues are made more obvious by lengthening the vowels before voiced fricatives and shortening them before voiceless fricatives.6 Application of such durational changes requires knowledge about the voicing characteristics of the consonant prior to the occurrence of the vowel. Automatic speech recognition algorithms may be helpful in this context to predict the occurrence of voiced or voiceless consonants.
Addressing Deficits in Temporal Masking
The deficits in forward masking apparent in individuals with hearing loss increase the potential for masking of weaker phonemes that occur after stronger phonemes. Fast-acting compressors that can quickly reduce the intensity of strong phonemes and increase the intensity of weaker phonemes are a possible solution. This type of compression is helpful when the listener has poor speech recognition for uncompressed speech and the background noise is modulated.7
Addressing Deficits in Temporal Integration
At higher sound pressure levels, listeners with hearing loss tend to show normal temporal integration.8 Thus, presentation of signals at higher sound pressure levels through increased gain can be expected to lead to normal temporal integration.
Addressing Deficits in Temporally Degraded Speech
Fast speech. Older listeners with hearing loss appear to benefit when the original speech signal is temporally extended by 1.4 times (ie, the speech rate is reduced).9 Such an approach may be beneficial in those cases where the communication disability is very severe.
Another way to slow speech rate is to lengthen vowels and transitions. Some of the gaps between speech sounds are reduced with this particular approach in order to allow the output from the hearing aid to keep up with the input. This approach can increase speech intelligibility for some individuals with hearing loss.10,11
Reverberant speech. The benefit of directional microphones is reduced in reverberant environments12 when the distance between the listener and speaker increases beyond a critical distance. Therefore, some hearing aids have optional circuits that detect and suppress the reflections that occur after the offset of the original sound source. The technology improves listening comfort and is preferred in the presence of reverberation. Although speech recognition scores do not improve significantly with this technology,13 acclimatization with the technology may improve speech recognition in reverberant environments.
Kollmeier et al14 assessed an algorithm for suppressing lateral noise sources and reverberation. The algorithm was based on averaging interaural time and intensity differences to detect lateral incident sound components and further evaluating the interaural coherence to detect reverberation. Frequency bands showing values of the interaural time and intensity differences close to the desired reference values and interaural coherence close to 1.00 were passed through unchanged, whereas frequency bands with undesired values were attenuated.
The suppression of reverberation based on interaural coherence values improves speech intelligibility slightly, but additional filtering of sounds with lateral incidence results in a substantial improvement in speech intelligibility at high and intermediate signal-to-noise ratios (SNR). For low SNRs, the artifacts increase and no benefit is apparent compared to unprocessed speech. It appears that application of this algorithm, along with automatic switching (on/off) capabilities depending on the noise background, may improve speech recognition in moderately adverse environments for some listeners.
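The band-by-band decision rule described for this type of algorithm can be sketched as follows. This is a simplified illustration, not the published implementation: the `BandCues` structure, the tolerance values, the coherence threshold, the fixed attenuation, and the choice of zero as the reference for a frontal source are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BandCues:
    """Averaged interaural cues for one frequency band (hypothetical structure)."""
    itd_us: float     # interaural time difference, microseconds
    ild_db: float     # interaural intensity difference, dB
    coherence: float  # interaural coherence, between 0 and 1


def band_gain_db(cues, itd_tol_us=100.0, ild_tol_db=3.0,
                 min_coherence=0.8, attenuation_db=-10.0):
    """Return 0 dB (pass unchanged) or attenuation_db for one band.

    A band passes only when its time and intensity differences are close to
    the reference for a frontal source (assumed here to be zero) and its
    coherence is high, suggesting direct rather than reverberant sound.
    All thresholds are illustrative assumptions.
    """
    frontal = abs(cues.itd_us) <= itd_tol_us and abs(cues.ild_db) <= ild_tol_db
    direct = cues.coherence >= min_coherence
    return 0.0 if (frontal and direct) else attenuation_db


bands = [
    BandCues(itd_us=20.0, ild_db=1.0, coherence=0.95),   # frontal, direct sound
    BandCues(itd_us=400.0, ild_db=6.0, coherence=0.90),  # lateral source
    BandCues(itd_us=10.0, ild_db=0.5, coherence=0.30),   # reverberant energy
]
print([band_gain_db(b) for b in bands])
# → [0.0, -10.0, -10.0]
```

A hard pass/attenuate decision like this one tends to produce audible artifacts when cue estimates are noisy, which is consistent with the loss of benefit reported at low SNRs. A graded attenuation would be a natural refinement.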
Temporal Processing and Speech in Competing Background Noise
Automatic speech recognition. By using an algorithm that automatically recognizes speech—using phonetic, linguistic, and statistical information, and acoustic properties of speech—a hearing aid can be programmed to resynthesize speech and deliver a clean speech signal without noise to the listener’s ear. Automatic speech recognition algorithms that are relatively robust to noise are currently available.15 Such algorithms need to be quick enough (processing time less than 40 ms) to allow sufficient auditory-visual synchronization of the speech signal.16
Directional microphones. A significant benefit in recognizing speech in noise can be obtained in laboratory settings with the use of directional microphones and beamforming arrays, especially when the speech and noise sources arrive from different locations. The magnitude of the benefit is approximately 10% to 30% in realistic environments, but the benefit may not be perceived by all listeners.17
For maximum benefit from directional microphones, the listeners need to face the signals of interest and need to be closer to the signal of interest, which may not be practical in all listening situations. In addition, some older listeners may not remember to face the signal of interest. Thus, it is not surprising that success with the use of directional microphones in everyday situations cannot be reliably predicted from the magnitude of directional advantage obtained in clinical test settings.18
Spectral enhancement. The perceived frequency of voicing is partly determined by the time pattern at the outputs of the auditory filters tuned close to the formant frequency. Background noise can obscure this time pattern, which may lead to less accuracy in recognizing these frequencies. If temporal patterns are disturbed by greater noise passing through an auditory filter broadened by hearing loss, then enhancing those portions of the spectrum where the SNR is highest (the peaks) and suppressing those portions where it is lowest (the valleys) should help.
Such spectrally enhanced speech appears to stand out more clearly against the background noise for normal listeners. For individuals with hearing loss, enhancement of major spectral prominences can decrease the response time significantly, suggesting reduced listening effort. It can also improve the SNR by about 0.8 dB.20
Digital noise reduction. Modern hearing aids try to detect the presence of noise or speech by detecting the modulations or synchrony in the temporal waveform. Digital noise reduction may not necessarily improve speech recognition since speech tends to be broadband, and when gain is reduced in a particular frequency band, speech energy in that band is also reduced. However, some individuals may perceive improved sound quality and may prefer the digital noise reduction for a more comfortable listening experience.
Modulation detection based noise reduction. In modulation detection based algorithms, the ongoing modulations of the input signal in each frequency band are evaluated. Speech in quiet has 15 dB or greater modulation depth and a modulation rate of approximately 3 Hz to 10 Hz, while many background sounds do not change so much over time.
If a frequency band appears to have fewer modulations than speech, then gain reduction is applied to that band. Overall, for listening comfort, most listeners prefer less gain in frequency bands where noise is more intrusive.21
Modulation detection based noise reduction can help individuals whose tolerance for background noise is relatively low. The decision rules for reducing gain can be based on variable factors such as the modulation depth, the overall level of the modulated signal, the modulation rate, or some other criteria. The frequency bands to which gain reduction is applied and the amount of gain reduction can also vary. For example, less gain reduction can be applied to those frequency bands that contribute most to speech intelligibility, and the applied gain reduction can vary from 5 dB to 16 dB.22 If modulation detection based noise reduction is combined with adaptive Wiener filtering, it can improve the quality of speech in slightly modulated noise.23
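One possible form of such a decision rule is sketched below. It is an assumption-laden illustration rather than any manufacturer's algorithm: modulation depth is estimated simply as the range of frame levels in a band, modulation rate is ignored, and the names `speech_depth_db`, `max_reduction_db`, and `importance` are hypothetical parameters.

```python
def modulation_depth_db(levels_db):
    """Crude modulation depth: range of short-term levels (dB) in one band."""
    return max(levels_db) - min(levels_db)


def gain_reduction_db(levels_db, speech_depth_db=15.0,
                      max_reduction_db=10.0, importance=0.0):
    """Return the gain reduction (dB, positive number) for one band.

    A band whose depth reaches speech_depth_db is treated as speech-like and
    left alone. Shallower modulation gives proportionally more reduction, up
    to max_reduction_db. importance (0 to 1) scales the reduction down in
    bands assumed to matter most for intelligibility. All values are
    illustrative assumptions.
    """
    depth = modulation_depth_db(levels_db)
    shortfall = max(0.0, 1.0 - depth / speech_depth_db)  # 0 = speech-like, 1 = steady
    return max_reduction_db * shortfall * (1.0 - importance)


speech_like_band = [50, 68, 52, 70]  # 20 dB depth
steady_noise_band = [60, 63, 61, 60]  # 3 dB depth
print(gain_reduction_db(speech_like_band))                    # no reduction
print(gain_reduction_db(steady_noise_band))                   # close to 8 dB
print(gain_reduction_db(steady_noise_band, importance=0.5))   # close to 4 dB
```

The `importance` weight shows how less reduction can be applied in bands that carry more speech information while steadier, less important bands are turned down further.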
Synchrony detection based noise reduction. The pattern of energy in different frequency regions in speech is precisely timed with the periodic action of the vocal folds or the fundamental frequency of the speaker’s voice. The synchrony detector algorithm searches for this synchronous pattern of energy in the higher frequencies by examining the ongoing correlations of instantaneous amplitudes across frequency regions. Whenever speech is detected, the system provides full amplification and compression characteristics as prescribed. When it detects a lack of synchrony, greater compression is applied to reduce the loudness of background noise.24
Such algorithms work well for individuals whose tolerance for background noise is high and who are motivated to hear and understand speech in noisy surroundings. Generally, if the goal is to have optimum speech intelligibility in noise, more gain is preferred by listeners.21 Listeners with hearing loss appear to be able to use relatively gross temporal asynchrony cues, which should aid in separating speech from background noise as long as both are audible.25
Reverberant speech may not have high synchrony due to better absorption of high frequency sounds. The ability of synchrony detection algorithms to detect reverberant speech needs to be evaluated.
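The core of a synchrony detector, correlating instantaneous amplitudes across frequency regions, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the commercial algorithm: band envelopes are given as short lists of amplitudes, synchrony is taken as the mean pairwise Pearson correlation, and the `threshold` value is arbitrary.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def speech_detected(band_envelopes, threshold=0.6):
    """Return True when envelopes in different bands rise and fall together.

    band_envelopes holds at least two amplitude sequences, one per band.
    threshold is an illustrative assumption.
    """
    count = len(band_envelopes)
    pairs = [(i, j) for i in range(count) for j in range(i + 1, count)]
    mean_r = sum(pearson(band_envelopes[i], band_envelopes[j]) for i, j in pairs) / len(pairs)
    return mean_r >= threshold


# Voiced speech: every band pulses in step with the vocal fold period.
voiced = [[1, 5, 1, 5, 1, 5], [2, 6, 2, 6, 2, 6], [0, 3, 0, 3, 0, 3]]
# Unrelated noise: band envelopes fluctuate independently.
noise = [[1, 5, 1, 5, 1, 5], [5, 1, 5, 1, 5, 1], [3, 3, 4, 2, 3, 3]]
print(speech_detected(voiced), speech_detected(noise))
# → True False
```

In a device, the True/False outcome would select between the prescribed amplification and the stronger compression applied when synchrony is absent.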
Binaural Temporal Processing
Binaural temporal processing requires the processing of stimuli over time by both ears. In listeners with normal hearing, stimuli presented to the two ears are compared at some central location in the auditory system. For listeners with hearing loss, digital technology can allow communication between the left and right hearing aids and thus compare the input to the two ears for interaural intensity and time differences. Frequency bands showing values of the interaural time and intensity differences close to the desired reference values can be passed through unchanged, whereas frequency bands with undesired values can be attenuated, thus providing a better signal-to-noise ratio for signals coming from the front of the listener.14
Accurate sound localization appears possible for steady-state and continuous signals with the implementation of new algorithms that incorporate spatial image re-synthesis into array signal processing. In such algorithms, super-directive beamformers are used to estimate the direction of sound. The beam is then steered in the direction of the sound. The spatial sound image is then reconstructed by filtering the array output with corresponding head-related transfer functions.26
The Impact of Compression on Temporal Processing
Effects of compression on the temporal waveform. Depending on the attack and release times and the type and degree of compression employed, compression can have the following effects on the temporal waveforms:
1. Decrease in the modulation depth.
2. Distortion in the temporal envelope (eg, overshoots or undershoots). The overshoots can reduce word recognition performance. In some cases, clipping of the overshoots can improve performance.27
3. Changes (reduction or increase in amplitude) in the fine structure of the temporal waveform, which are not present in the original signal. More specifically, the modulation depth decreases over the duration of the attack time and the modulation depth of the weaker components increases over the duration of the release time. Such changes may have an effect only when the attack and release times are relatively long.
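The dependence of modulation depth on attack and release times can be illustrated with a simplified level-domain compressor. The sketch below is an assumption-based illustration, not a model of any particular hearing aid: it works on frame levels in dB, uses one-pole smoothing coefficients in place of true attack and release times, and the `ratio` and `threshold_db` values are arbitrary.

```python
def compress(levels_db, ratio=3.0, threshold_db=50.0, attack=1.0, release=1.0):
    """Apply compression to a sequence of frame levels (dB).

    attack and release are smoothing coefficients in (0, 1]; 1.0 means the
    level estimate follows the input instantly (very fast compression),
    small values mean slow-acting compression. Values are illustrative.
    """
    estimate = levels_db[0]
    output = []
    for level in levels_db:
        coeff = attack if level > estimate else release
        estimate += coeff * (level - estimate)
        over = max(0.0, estimate - threshold_db)
        gain = -over * (1.0 - 1.0 / ratio)  # dB of gain reduction above threshold
        output.append(level + gain)
    return output


def steady_state_depth(levels_db):
    """Modulation depth over the final two frames, after the estimate settles."""
    return abs(levels_db[-1] - levels_db[-2])


# Input alternates between 60 and 80 dB: a 20 dB modulation depth.
signal = [60.0, 80.0] * 100
fast = compress(signal, attack=1.0, release=1.0)
slow = compress(signal, attack=0.05, release=0.05)
print(round(steady_state_depth(fast), 2))  # about 6.67 dB, ie, 20 dB divided by the ratio
print(round(steady_state_depth(slow), 2))  # just under 20 dB
```

With instantaneous time constants the 20 dB modulation shrinks by the full compression ratio, while the slow-acting setting leaves the depth nearly intact once the level estimate has settled. While the slow estimate is still settling, its gain drifts, which is one source of the overshoots mentioned above.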
The Impact of Compression on Speech Recognition
Improvement in speech recognition. When the noise has temporal fluctuations, there are more opportunities to hear speech during the dips in the noise, and thus speech perception can be enhanced. The benefit of fast WDRC may be obvious in temporally fluctuating noises.28 This is especially true if the individual has poor speech discrimination scores.7 Generally, individuals with sloping audiograms, reduced dynamic range, different dynamic ranges across frequency bands, and varied auditory lifestyles are likely to benefit from compression.29
Decrease in speech recognition. When the hearing loss is severe or profound, the dynamic range is limited. Such a narrow dynamic range requires the use of high compression ratios. Use of high compression ratios along with fast-acting compression can adversely affect temporal processing by reducing relative intensity cues and by introducing distortion to the temporal envelope. The temporal characteristics of speech appear to carry important cues for listeners with severe and profound losses, and alteration of the temporal waveform due to compression may lead to poorer performance than can be obtained with linear amplification.30
Increased temporal-envelope distortion of the compressed signal as apparent in acoustic analysis is associated with reduced recognition of some individual phonemes.31 Also, when individuals have generally good discrimination scores, compression tends to decrease the score.7 In addition, linear fittings appear to be better for individuals with flatter audiograms, wider dynamic ranges, similar dynamic ranges across various frequency bands and restricted auditory lifestyles.29
Application of Conservative Gain Changes
Although, on average, fast compression is considered optimum for speech intelligibility, for a significant number of listeners the best choice may be a slow-acting automatic volume control fitting or a linear fitting.32 For many listeners, it is beneficial to reduce and increase the gain in the hearing aids only when necessary.
Many current instruments strive to achieve this goal in a variety of ways. For example, the Adaptive Dynamic Range Optimization (ADRO) algorithm33 determines the need for gain adjustments based on a statistical sampling of signal levels. The algorithm monitors a distribution of amplified sound levels in different frequency bands and compares this distribution against a desired output range in that band depending on the dynamic range of the listener.
The lower boundary of the desired output range is based on soft but audible speech, and the higher boundary is based on comfort criterion. The rules specify that 90% of all output levels must fall below the comfort criterion and no more than 30% of the output levels can fall below the audibility criterion. Stated differently, when more than 10% of output levels exceed comfort, gain is reduced, and when more than 30% of the levels fall below the audibility criterion, gain is increased.
The rate of gain adjustment can be 3 dB/sec or 6 dB/sec depending on the preference of the listeners. These slow gain adjustments are expected to minimize the distortions in the temporal waveform. The algorithm relies on the maximum output levels to prevent sudden loud discomforting sounds. A potential advantage of the application of such an algorithm is that the hearing aid functions as a linear amplifier unless either the audibility or comfort criterion is not met.33
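The comfort and audibility rules can be expressed as a short sketch. It follows the percentages stated above but is otherwise an illustration under assumptions, not the ADRO implementation: the function name, the use of a simple list of recent output levels for one band, the priority given to the comfort rule, and the example criterion levels are all hypothetical.

```python
def adro_gain_step(output_levels_db, comfort_db, audibility_db, step_db):
    """Return the gain change (dB) for one band at one update.

    output_levels_db: recent amplified output levels sampled in this band.
    comfort_db, audibility_db: upper and lower bounds of the desired range.
    step_db: size of one slow gain adjustment (for example, the adjustment
    rate in dB/sec multiplied by the update interval in seconds).
    """
    count = len(output_levels_db)
    fraction_above = sum(1 for level in output_levels_db if level > comfort_db) / count
    fraction_below = sum(1 for level in output_levels_db if level < audibility_db) / count

    # Assumption: the comfort rule is checked first when both rules are violated.
    if fraction_above > 0.10:
        return -step_db
    if fraction_below > 0.30:
        return step_db
    return 0.0  # both criteria met: behave as a linear amplifier


too_loud = [90, 90] + [70] * 8   # 20% of levels above comfort
too_soft = [40] * 4 + [70] * 6   # 40% of levels below audibility
in_range = [70] * 10

for levels in (too_loud, too_soft, in_range):
    print(adro_gain_step(levels, comfort_db=85, audibility_db=50, step_db=0.3))
# → -0.3, then 0.3, then 0.0
```

Because the gain changes only when one of the two rules is violated, and then only in small steps, the output between adjustments is a linearly amplified copy of the input.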
Conclusions
Several amplification strategies are available that can address a variety of deficits in temporal processing. These need to be applied only when a listener has poor aided speech recognition in quiet and/or noisy backgrounds. For example, when speech discrimination scores for unprocessed speech are good, fast syllabic compression that can reduce the potential effects of forward masking tends to have a deleterious effect on speech discrimination.7
Substantial variability in the benefit derived by listeners from the use of various strategies has been noted in the literature. This is probably because the type and degree of temporal processing deficits vary between patients. Application of appropriate strategies based on the specific deficits experienced by a particular listener is likely to yield more benefit.
References
1. Moore BC, Glasberg BR, Alcantara JI, Lauber S, Kuehnel V. Effects of slow-and fast-acting compression on the detection of gaps in narrow bands of noise. Brit J Audiol. 2001;35(6):365-374.
2. Mekata T, Yoshizumi Y, Kato Y, Noguchi E, Yamada Y. Development of a portable multi-function digital hearing aid. Seminar at: Intl Conf Spoken Lang Processing. Japan; 1994.
3. Picheny MA, Durlach NI, Braida LD. Speaking clearly for the hard of hearing. II: Acoustic characteristics of clear and conversational speech. J Speech Hear Res. 1986;29(4)[Dec]:434-446.
4. Montgomery A, Edge R. Evaluation of two speech enhancement techniques to improve intelligibility for hearing-impaired adults. J Speech Hear Res. 1988;31(3):386-393.
5. Revoile S, Holden-Pitt L, Pickett J. Perceptual cues to the voiced-voiceless distinction of final fricatives for listeners with impaired or normal hearing. J Acoust Soc Am. 1985;77(3):1263-1265.
6. Revoile S, Holden-Pitt L, Edward D, Pickett J. Some rehabilitative considerations for future speech-processing hearing aids. J Rehab Res Dev. 1986;23(1):89-94.
7. Verschuure H, Benning FJ, Van Cappellen M, Dreschler WA, Boeremans PP. Speech intelligibility in noise with fast compression hearing aids. Audiology. 1998;37(3):127-150.
8. Buus S, Florentine M, Poulsen T. Temporal integration of loudness in listeners with hearing losses of primarily cochlear origin. J Acoust Soc Am. 1999;105(6):3464-3480.
9. Nakamura A, Seiyama N, Ikezawa R, Takagi T, Miyasaka E. Real time speech rate converting system for elderly people. Found in: Proceedings of the IEEE Intl Conference on Acoustic, Speech, and Signal Processing (ICASSP); Adelaide, Australia; 1994;II:225-228.
10. Nejme Y, Artisuka T, Ifukube T, Matsushima J. A portable digital speech-rate converter for hearing impairment. IEEE Trans Rehab Eng. 1996;4:73-83.
11. Nejme Y, Moore B. Evaluation of the effect of speech-rate slowing on speech intelligibility in noise using a simulation of cochlear hearing loss. J Acoust Soc Am. 1998;103(1):572-576.
12. Hoffman M, Trine T, Buckley K, Van Tasell D. Robust adaptive microphone array processing for hearing aids: Realistic speech enhancement. J Acoust Soc Am. 1994;96(2): 759-770.
13. Fabry DA, Tchorz J. A hearing system that can bounce back from reverberation. The Hearing Review. 2005;12(10):48,50.
14. Kollmeier B, Peissig J, Hohmann V. Real-time multiband dynamic compression and noise reduction for binaural hearing aids. J Rehabil Res Dev. 1993;30(1):82-94.
15. Cui X, Alwan A. Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR. IEEE Transaction of Speech and Audio Processing. 2005;13(6):1161-1172.
16. McGrath M, Summerfield Q. Intermodal timing relations and audio-visual speech recognition by normal-hearing adults. J Acoust Soc Am. 1985;77(2):678-685.
17. Ricketts TA. Directional hearing aids: Then and now. J Rehab Res Dev. 2005;Suppl 2, 42(4):133-144.
18. Cord MT, Surr RK, Walden BE, Dyrlund O. Relationship between laboratory measures of directional advantage and everyday success with directional microphone hearing aids. J Am Acad Audiol. 2004;15(5):353-364.
19. Leek MR, Dorman MF, Summerfield AQ. Minimum spectral contrast for vowel identification by normal-hearing and hearing-impaired listeners. J Acoust Soc Am. 1987;81:148-154.
20. Baer T, Moore BCJ, Gatehouse S. Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: Effect on intelligibility, quality and response times. J Rehab Res Dev. 1993;30:95-109.
21. Keidser G, Brew C, Brewer S, Dillon H, Grant F, Storey L. The preferred response slopes and two-channel compression ratios in twenty listening conditions by hearing impaired and normal-hearing listeners and their relationship to the acoustic input. Intl J Audiol. 2005;44(11):656-670.
22. Mueller HG, Ricketts TA. Digital noise reduction: Much ado about something? Hear Jour. 2005;58(1):10-17.
23. Ricketts TA, Hornsby BWY. Sound quality measures for speech in noise through a commercial hearing aid implementing digital noise reduction. J Am Acad Audiol. 2005;16(5):270-277.
24. Schum DJ. Noise reduction circuitry in hearing aids: Goals and current strategies. Hear Jour. 2003;56(6):32-40.
25. Grose JH, Hall JW III. Cochlear hearing loss and the processing of modulation: Effects of temporal asynchrony. J Acoust Soc Am. 1996;100(1):519-527.
26. Bai MR, Lin C. Microphone array signal processing with application of three-dimensional spatial hearing. J Acoust Soc Am. 2005;117(4pt1): 2112-2121.
27. Nabelek IV. Performance of hearing-impaired listeners under various types of amplitude compression. J Acoust Soc Am. 1983;74(3): 776-791.
28. Gatehouse S, Naylor G, Elberling C. Benefits from hearing aids in relation to the interaction between the user and the environment. Intl J Audiol. 2003;42:S77-S85.
29. Gatehouse S, Naylor G, Elberling C. Linear and nonlinear hearing aid fittings 2. Patterns of candidature. Intl Jour Audiol. 2006;45(3):153-171.
30. Souza PE, Jenstad LM, Folino R. Using multichannel wide-dynamic range compression in severely hearing-impaired listeners: effects of speech recognition and quality. Ear Hear. 2005;26(2): 120-131.
31. Jenstad LM, Souza PE. Quantifying the effect of compression hearing aid release time on speech acoustics and intelligibility. J Speech Lang Hear Res. 2005;48(3):651-667.
32. Gatehouse S, Naylor G, Elberling C. Linear and nonlinear hearing aid fittings 1. Patterns of benefit. Intl J Audiol. 2006;45(3): 130-152.
33. Blamey PJ, Martin LFA, Fiket HJ. A digital processing strategy to optimize hearing aid outputs directly. J Am Acad Audiol. 2004;15(10):716-728.