SoundLoc_1_2012
Sound localization psychophysics
Eric Young
10/11/12

A good reference: B.C.J. Moore, An Introduction to the Psychology of Hearing, Chapter 7, Space Perception. Elsevier, Amsterdam, pp. 233-267 (2004).

Sound localization: what is it good for?

Where's the bird?

What are the cues for sound localization?

There are two binaural cues:
ITD: interaural differences in the time of arrival of the sound
ILD: interaural differences in the loudness (sound level) of the sound

Note that ITD can be separated into two cues: 1) the ITD of the envelope of the sound (ITD above) and 2) the ITD of the details of the waveform (the fine structure, IPD at right). ITD and IPD are numerically approximately equal. However, IPD cues are much stronger perceptually.

Interaural time differences are primarily a cue for azimuth and can be predicted approximately from a simple geometric head-shadow model. [Figure: data and model. Woodworth, 1962]

There is a small dependence of ITD on frequency. [Figure: human and cat panels; axes in ms and Hz.] Dashed lines are predictions of the Woodworth model (?? for cat). (Kuhn, 1977 and Roth et al. 1980)

ITDs are primarily a cue for azimuth in that they vary little with elevation. (Middlebrooks and Green, 1990)

ILDs are also mainly a cue for azimuth, although at higher frequencies they provide additional information. (Middlebrooks and Green, 1990)

How are different cues integrated?

ITDs are used at low frequencies and ILDs at high frequencies. Blue curves show the cues measured at the ears for a speaker at the MAA (minimum audible angle). Green curves show the minimum detectable ITD and ILD, based on headphone experiments. [Figure annotations: ITD matches; ILD matches; spectral cues here?; IPD!] (Mills, 1972)

What is the interaction of cues when they don't correspond? Are ITDs and ILDs perceptually equivalent?

It is possible for subjects to adjust the ITD of a sound (ordinate) so as to center the image of a sound presented with a certain ILD (abscissa).
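As an aside on the geometric head model cited for ITD (Woodworth, 1962): a minimal sketch of the usual spherical-head form, ITD = (a/c)(θ + sin θ), is given here. The head radius (8.75 cm), the speed of sound (343 m/s), and the function name are assumptions chosen for illustration and are not taken from the slides; the formula applies to a distant source at azimuths within ±90˚.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound_m_s=343.0):
    """ITD in seconds for a rigid spherical head and a distant source.

    azimuth_deg: source azimuth, 0 = straight ahead, valid for -90..+90.
    head_radius_m and speed_of_sound_m_s are assumed illustrative values.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("this form of the model covers azimuths in [-90, 90] degrees")
    theta = math.radians(azimuth_deg)
    # Extra path to the far ear: an arc around the head (a * theta)
    # plus a straight segment (a * sin(theta)).
    return (head_radius_m / speed_of_sound_m_s) * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 15, 30, 45, 60, 90):
        print(f"{azimuth:3d} deg -> {woodworth_itd(azimuth) * 1e6:6.1f} us")
```

With these assumed values the ITD grows from 0 straight ahead to roughly 0.65 ms at 90˚. The model has no frequency term, so it cannot by itself reproduce the small frequency dependence noted above.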
Note that ITDs are more effective at low frequencies. (Harris, 1960)

But that doesn't mean that ITD and ILD produce equivalent percepts. Here subjects discriminated a 500 Hz tone with 0 ITD from a similar tone with one of five ITDs as the ILD of the second tone varied. The discriminability (d') is plotted versus the ILD. Perfect trading would have resulted in d' minima near 0. Note that as the ITD increases, the "best" ILD (the minimum of the curve) gives a stimulus with higher and higher discriminability. So ITD and ILD are not perceptually equivalent, despite the trading ratio experiment. (Hafter and Carrier, 1972)

What are the cues for elevation and front/back?

Because the head is approximately symmetrical, locations along a cone of confusion all produce roughly the same binaural cues, ITD and ILD. Thus ITD and ILD provide little information about elevation, and there is confusion about front vs. back.

The ambiguity is resolved by spectral cues produced by the external ear. The amplitude of sound at the eardrum is modified by reflections (interference patterns) in the pinna. The pattern of modification, plotted below, varies with the direction of the sound source. [Figure: transfer functions at 30˚, 0˚, and -15˚; two sound paths through the pinna. Shaw]

The notch in HRTFs can be predicted by a parabolic reflector model, in which the reflector represents the posterior wall of the concha. [Figure: directional gains at various frequencies.] The model transfer functions (HRTFs) are simpler than in real ears, but capture the general features of the notches.

Evidence that pinna acoustics are important for localization in elevation: occluding the cavities of the pinnae decreases performance in a sound source elevation task. (Gardner and Gardner, 1973)

In order to use spectral cues accurately, the stimuli must be broadband. With narrowband stimuli (1/6 octave noiseband), the percept of elevation depends on the frequency content of the stimulus, and not its source direction.
6 kHz noise sounds like above and 8 kHz noise sounds like below in this subject. (Middlebrooks, 1992)

The place pointed to seems to correspond to peaks in the HRTF. The auditory system does its best with inadequate information. [Figure: HRTFs at the places pointed to for narrowband stimuli centered at 6, 8, 10, and 12 kHz (the stars); axes az, el.]

Fitting HRTF to spectrum: a 12 kHz sound presented from -40, +40 is localized at -145, -17, where the HRTF better matches the spectrum. The subject's response can be predicted from the spatial correlation of the stimulus spectrum and the HRTF. Contours of correlation are shown. [Figure labels: actual position; subject's response.] (Middlebrooks, 1992)

Cue trading revisited: broadband noise is presented over headphones that simulate virtual space by incorporating HRTFs. The result is good localization with all cues present. However, when ITD cues are set to 0 or to -45˚ or 90˚ (??), it is clear that the ITD cues dominate the others. That is, the localization in azimuth follows the ITD cues and the elevation performance is degraded. This result occurs only if the stimulus contains low-frequency energy. (Wightman and Kistler)

How might localization be represented in the brain?

The P(τ) model. Assume a neural display with ITD on one axis and BF (or CF) on the other. For a 0.5 kHz tone with an ITD of 0.5 ms, there are extra peaks one period of 0.5 kHz away (±2 ms). These are eliminated with a centering function, e.g. the number of neurons with each ITD, giving a good reading of the actual ITD. [Figure: 0.5 kHz, 0.5 ms] (Stern, Bernstein, and Trahiotis)

A challenge to the model: broadband noise (actually 500 Hz BW centered at 500 Hz) with ITD = -1.5 ms. The P(τ) function gives ambiguous cues that vary across frequency. [Figure: noise, -1.5 ms; subject's localization (ignore curves other than filled squares).]

Narrowband noise gives the same answer as a tone, due to the centering function.
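The effect of the centering function for a single frequency channel can be sketched numerically. This is an illustration only, not the Stern, Bernstein, and Trahiotis implementation: the cross-correlation of a pure tone is written as a cosine in the internal delay, the exponential weight and its 2 ms width are hypothetical stand-ins for "the number of neurons with each ITD", and all function names are invented for the example.

```python
import math


def tone_crosscorrelation(lag_ms, freq_khz, itd_ms):
    """Normalized interaural cross-correlation of a pure tone at one internal delay."""
    return math.cos(2.0 * math.pi * freq_khz * (lag_ms - itd_ms))


def centrality_weight(lag_ms, width_ms=2.0):
    """Hypothetical centering function: fewer coincidence detectors at large |lag|."""
    return math.exp(-abs(lag_ms) / width_ms)


def estimate_itd(freq_khz, itd_ms, max_lag_ms=3.0, step_ms=0.01):
    """Internal delay (ms) with the largest centrality-weighted activity."""
    n_steps = int(round(2.0 * max_lag_ms / step_ms))
    lags = [-max_lag_ms + i * step_ms for i in range(n_steps + 1)]

    def weighted_activity(lag_ms):
        # Half-wave rectify so that anti-correlated delays contribute nothing.
        activity = max(0.0, tone_crosscorrelation(lag_ms, freq_khz, itd_ms))
        return activity * centrality_weight(lag_ms)

    return max(lags, key=weighted_activity)


if __name__ == "__main__":
    # 0.5 kHz tone, 0.5 ms ITD: unweighted peaks at -1.5, 0.5, and 2.5 ms.
    print(estimate_itd(0.5, 0.5))
    # 0.5 kHz channel, -1.5 ms ITD: the same peaks, so the same estimate.
    print(estimate_itd(0.5, -1.5))
```

Both calls return an estimate near +0.5 ms (pulled slightly toward zero by the weight), because at 0.5 kHz an ITD of -1.5 ms is exactly one period away from +0.5 ms. A single channel with a centering function therefore reports the wrong sign for the -1.5 ms stimulus.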
But at full bandwidth, the noise is perceived as having a negative ITD, even though centered weighting doesn't give the right answer in that case. [Figure: noise, -1.5 ms] (Stern, Bernstein, and Trahiotis)

The answer is a second level of cross-BF coincidence analysis, straightness weighting (black lines and dots). This amplifies P(τ) where the ITD cue is the same across frequency. Now the estimated ITD is closer to correct. (Stern, Bernstein, and Trahiotis)

Comparison of the model to data. (Stern, Bernstein, and Trahiotis)

Binaural unmasking: using localization to reduce interference or masking (the cocktail party effect).

Binaural masking level differences are a part of the explanation. The results are from tone detection experiments in a noise masker. The relative phases in the two ears of the tone (S) and noise (N) are given by the subscripts. Noise with a different interaural phase (different location) is less effective in masking a tone (by up to 15 dB!). The effect is strongest at low frequencies, but continues at high frequencies (ordinate is N0S0 / NπS0). This corresponds roughly to the strength of phase-locking in the auditory nerve.

An important deficit in hearing impairment is the loss of binaural unmasking. Note that the speech reception thresholds (SRT, the signal/noise ratio at threshold for speech intelligibility) are worse in impaired listeners for various noise conditions. (Bronkhorst and Plomp, 1989)

Using localization to suppress echoes: the precedence effect

With reverberation, the first sound that arrives (black Xs) is more accurate than subsequent sound (gray dots). [Figure: direction to which ITDs point for a 580 Hz tone at three directions.] The first few ms of information are more accurate in the cases with reverberation. Data from an MSO model with 4 ms sequential analysis bins. (Shinn-Cunningham et al. 2003)

Precedence decreases the information about location for the second of two stimuli, presumed to be an echo.
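Returning to the binaural masking level differences described earlier: the size of the unmasking is simply the difference, in dB, between the tone threshold in a reference condition and the threshold when the noise has a different interaural phase. The arithmetic is sketched here with hypothetical threshold values that are not read from the slides; the function names are also invented for the example.

```python
import math


def masking_level_difference_db(reference_threshold_db, test_threshold_db):
    """Unmasking in dB: how far the tone threshold drops in the test condition."""
    return reference_threshold_db - test_threshold_db


def power_ratio_to_db(power_ratio):
    """Express a ratio of threshold signal powers as a level difference in dB."""
    return 10.0 * math.log10(power_ratio)


if __name__ == "__main__":
    # Hypothetical thresholds: 60 dB SPL in the reference condition,
    # 45 dB SPL when the noise has a different interaural phase.
    print(masking_level_difference_db(60.0, 45.0))
    # The same difference expressed from a threshold power ratio of about 31.6.
    print(power_ratio_to_db(10 ** 1.5))
```

A 15 dB difference, the upper end quoted above, corresponds to a roughly 32-fold reduction in the signal power needed for detection.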