3.3 Sound source localization fails

Problem

My sound source localization system does not work well.

Solution

This recipe introduces the usual debugging procedures for sound source localization systems.

Confirm the recording

 
Make sure the system can record sound. See Sound recording fails.

Offline localization

 
Try offline localization using recorded wave files. Walk around the microphone array while speaking and record the sound. This is your test file.
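Before feeding the test file to the network, it can help to confirm that it has the channel count and sampling rate you expect. The following is a minimal sketch using only Python's standard wave module; the function name describe_wave and the file name test.wav are hypothetical and not part of HARK.

```python
import wave


def describe_wave(path):
    """Return basic properties of a WAV file (illustrative helper, not a HARK tool)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return {
            "channels": wf.getnchannels(),            # should match your microphone array
            "sample_rate_hz": rate,
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate,              # length of the recording in seconds
        }


if __name__ == "__main__":
    # "test.wav" is a placeholder for your own recording.
    print(describe_wave("test.wav"))
```

If the reported channel count differs from the number of microphones in your array, fix the recording before debugging localization.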

Replace the node AudioStreamFromMic with AudioStreamFromWave in your network for offline localization. When DisplayLocalization and SourceTracker are connected, the localization result can be visualized.

If this step fails, set the DEBUG parameter of LocalizeMUSIC to ON and check the MUSIC spectrum. The spectrum value should be larger when sound is present than when it is absent. If it is not, the transfer function file may be wrong. If the spectrum looks reasonable, adjust the THRESH parameter of SourceTracker so that it is larger than the spectrum values where no sound is present and smaller than the values where sound is present.
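The rule for THRESH can be sketched as follows. This assumes you have noted down a few MUSIC spectrum values by hand from the debug output, some from periods without sound and some from periods with sound. The function name suggest_thresh and the sample numbers are hypothetical. The midpoint is only one possible starting value, not a HARK recommendation.

```python
def suggest_thresh(silence_powers, sound_powers):
    """Suggest a starting THRESH between the no-sound and sound spectrum values.

    Returns None when the two sets of values overlap, in which case no single
    threshold separates them (the transfer function file may need checking).
    """
    noise_ceiling = max(silence_powers)  # largest value seen without sound
    sound_floor = min(sound_powers)      # smallest value seen with sound
    if noise_ceiling >= sound_floor:
        return None
    # Midpoint of the gap: above all no-sound values, below all sound values.
    return (noise_ceiling + sound_floor) / 2.0


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    print(suggest_thresh([20.1, 21.0], [27.5, 30.2]))
```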

You may also want to adjust other parameters:

  1. Set NUM_SOURCE to match the conditions the system is intended for.
    For example, if there are up to two speakers, set NUM_SOURCE to 2.

  2. Set MIN_DEG and MAX_DEG.
    If localization results appear in directions where no sound source is located, the cause may be reflections from a wall or a noise source (e.g. PC fans). If you can assume that the target signals do not come from these directions, use the parameters MIN_DEG and MAX_DEG to exclude those directions from the localization output.

    For example, if no sound will come from behind the robot, then the localization target can be set to only the front side by setting MIN_DEG to -90 and MAX_DEG to 90.
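The effect of restricting the direction range can be illustrated with a short sketch. It assumes azimuths in degrees with 0 at the front and inclusive boundaries; both are assumptions made for this example. The function filter_by_direction is hypothetical and only mimics the outcome. It is not how LocalizeMUSIC applies MIN_DEG and MAX_DEG internally.

```python
def filter_by_direction(azimuths_deg, min_deg=-90, max_deg=90):
    """Keep only the azimuths that fall within [min_deg, max_deg]."""
    return [a for a in azimuths_deg if min_deg <= a <= max_deg]


if __name__ == "__main__":
    # Made-up localization results: two of them come from behind the robot.
    detected = [-170, -90, 0, 45, 90, 135]
    print(filter_by_direction(detected))
```

With the front-only range from the example above, results at -170 and 135 degrees are dropped and the rest are kept.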

Online localization

 
To check the entire system, use the original network file containing AudioStreamFromMic and run your program. You may need to retune THRESH of SourceTracker because the talker’s position or voice volume differs from the test recording. If the system is too sensitive and reports sources when there is no sound, increase THRESH; if it misses sounds, decrease it. Changing it by around 0.1 to 0.2 at a time is a reasonable step.

Discussion

It may be appropriate to change parameters depending on the reverberation in the room and the volume of the speaker’s voice. To improve the localization performance of the system, it is best to tune it in the location where it will actually be used. Since THRESH is the most important parameter here, adjusting it alone is the first thing to try.

See Also

If you are new to building a sound source localization system, refer to Learning sound localization. The chapter Sound source localization includes further recipes for localization. For details of the algorithms, see LocalizeMUSIC and SourceTracker in the HARK document.