8.2 Tuning parameters of sound source localization

Problem

How should I adjust the parameters when sound source localization is suboptimal?

Solution

The solution is given for each sound localization problem.

Q.1) Localization directions are indicated messily or are not indicated at all.

 

When a localization result is displayed in DisplayLocalization, the localization directions may be indicated messily in some cases. This happens because low-power sounds have been localized as sound sources. When no directions are indicated at all, the cause is the opposite: even the target sound is not being accepted as a sound source.

  • Change THRESH of SourceTracker   
    This parameter directly sets the power threshold above which a direction is treated as a sound source. Adjust it so that only the peaks of the target sound sources are captured.

  • Make NUM_SOURCE of LocalizeMUSIC equal to the number of target sounds  
    LocalizeMUSIC enhances the peaks in the target sound directions using the null space, and the number of peaks that are enhanced depends on the setting of NUM_SOURCE (number of sound sources). When this setting is wrong, performance deteriorates: a peak may be localized in the direction of noise, or no peak may appear in the direction of the target. (In actual localization, only the sharpness of the peaks is degraded, so they can still be used for localization.) If there is only one speaker, performance will improve when NUM_SOURCE is set to 1.
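
The effect of THRESH can be illustrated with a minimal Python sketch. This is not HARK code: the function detect_peaks, the 30-degree grid, and the power values are hypothetical, chosen only to show how a threshold that is too low admits low-power peaks and one that is too high removes every direction.

```python
def detect_peaks(power, thresh):
    """Return indices of local maxima whose power exceeds thresh.

    The direction axis is treated as circular (-180 and 180 degrees meet).
    """
    n = len(power)
    peaks = []
    for i in range(n):
        left = power[(i - 1) % n]
        right = power[(i + 1) % n]
        if power[i] > thresh and power[i] >= left and power[i] > right:
            peaks.append(i)
    return peaks


# Hypothetical spatial spectrum, one power value per 30 degrees.
angles = list(range(-180, 180, 30))
power = [20.1, 21.0, 20.3, 22.5, 20.2, 20.0, 31.0, 20.4, 20.1, 23.0, 20.2, 20.0]


def directions(thresh):
    """Directions (in degrees) that survive the given threshold."""
    return [angles[i] for i in detect_peaks(power, thresh)]


print(directions(20.5))  # threshold too low: [-150, -90, 0, 90]
print(directions(27.0))  # only the strong source remains: [0]
print(directions(35.0))  # threshold too high: []
```

In this toy data only the source at 0 degrees is a real target, so a threshold between the weak peaks and the strong peak gives the clean result.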

Q.2) Only one peak appears even though there are multiple sound sources

 

  • Change MIN_SRC_INTERVAL of SourceTracker
    When sound sources are close to each other (e.g. two sound sources only 10 degrees apart), it may be necessary to set MIN_SRC_INTERVAL sufficiently small (less than 10 degrees in this example). When the set value is greater than the angle difference, the two sound sources are localized as one sound source.

  • Make NUM_SOURCE of LocalizeMUSIC equal to the number of target sounds
    This is the same adjustment as the NUM_SOURCE item under Q.1. Note that if the volume is loud enough and the sounds are far enough apart (more than 40 degrees), localization usually works well even if the parameter is misconfigured.
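
The merging behavior controlled by MIN_SRC_INTERVAL can be sketched as follows. This is a simplified illustration in Python, not the SourceTracker implementation: the function names are hypothetical, and the rule used here (a direction is a new source only if it is farther than the interval from every known source) is an assumption made for the example.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def group_sources(directions, min_src_interval):
    """Keep a direction as a new source only if it is farther than
    min_src_interval degrees from every source kept so far (assumed rule)."""
    sources = []
    for d in directions:
        if all(angular_diff(d, s) > min_src_interval for s in sources):
            sources.append(d)
    return sources


# Two hypothetical sources 10 degrees apart.
print(group_sources([30, 40], 20))  # interval too large, merged: [30]
print(group_sources([30, 40], 5))   # interval small enough: [30, 40]
```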

Q.3) Non-vocal sound is used

 
LOWER_BOUND_FREQUENCY and UPPER_BOUND_FREQUENCY of LocalizeMUSIC
Sound source localization is processed for each frequency bin in the band bounded by these two frequencies. Therefore, setting a band that is totally different from that of the target sound will result in a wider peak. Use frequencies that correspond to those of the target sound sources.
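
To see which frequency bins a given band covers, the mapping from frequency to bin index can be sketched as below. The helper bins_in_band is hypothetical, and the sampling rate of 16000 Hz and FFT length of 512 are assumptions for the example; substitute the values used in your own network.

```python
def bins_in_band(lower_hz, upper_hz, sampling_rate=16000, fft_length=512):
    """Return the FFT bin indices whose center frequency lies in
    [lower_hz, upper_hz]. Sampling rate and FFT length are assumed values."""
    bin_width = sampling_rate / fft_length  # Hz per bin
    n_bins = fft_length // 2 + 1            # bins from 0 Hz up to Nyquist
    return [k for k in range(n_bins) if lower_hz <= k * bin_width <= upper_hz]


# Example band of 500 Hz to 2800 Hz with 31.25 Hz per bin.
bins = bins_in_band(500, 2800)
print(bins[0], bins[-1], len(bins))  # 16 89 74
```

A narrower band means fewer bins contribute to the result, so the band should be chosen to overlap the frequencies where the target sound actually has energy.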

Q.4) I can assume that the sound does NOT come from a certain range.

 
MIN_DEG and MAX_DEG of LocalizeMUSIC
Sound source localization is performed only for the range determined by these two values. To perform localization over the full 360 degrees, make sure to designate -180 degrees for MIN_DEG and 180 degrees for MAX_DEG.
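
The restriction of the search range can be pictured with the short sketch below. It is only an illustration: the function candidate_directions and the 5-degree grid are assumptions, since the actual set of directions depends on the transfer function being used.

```python
def candidate_directions(min_deg, max_deg, step_deg=5):
    """Directions (degrees) on an assumed regular grid that fall inside
    [min_deg, max_deg]; only these would be searched for peaks."""
    return [a for a in range(-180, 180, step_deg) if min_deg <= a <= max_deg]


front_half = candidate_directions(-90, 90)
full_circle = candidate_directions(-180, 180)
print(len(front_half), len(full_circle))  # 37 72
```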

Discussion

The solutions above all tune the parameters of sound source localization. However, if the reverberation of your room differs significantly from that of the room in which the transfer function was recorded, you need to re-measure the transfer function. See the HARK web page for an instructional video on transfer function measurement.

See Also

See LocalizeMUSIC in the HARK document for a detailed description of the MUSIC algorithm and parameters.