4.4.2 Source

Source is a data type that expresses sound source localization information. In HARK, the chain of nodes that runs from sound source localization to speech recognition, such as LocalizeMUSIC (output), SourceTracker (input/output), and GHDSS (input), passes this information in a Map<int, ObjectRef> whose ObjectRef points to a Source.

Source holds the following information:

  1. ID: int  type. Sound source ID

  2. Power: float  type. Power of localized direction.

  3. Coordinate: float type, 3-dimensional array. The Cartesian coordinates of the point on the unit sphere that corresponds to the sound source direction.

  4. Duration: double type. The number of frames remaining before a localized sound source is terminated. This value decreases while the sound is not detected. When it reaches 0, the sound source is terminated. This is an internal variable used only by the SourceTracker node.

  5. TF index: int  type. Indicates the index of the localized direction within the Transfer Function file.
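The five fields above can be pictured with the following minimal C++ sketch. It is an illustration only and not HARK's actual Source class: the names SourceSketch, unit_direction, and update_duration are hypothetical, the axis convention (x forward, y left, z up) is an assumption, and the one-frame decrement step is an assumption about how the countdown might proceed.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the fields listed above; NOT HARK's actual class.
struct SourceSketch {
    int id;                   // 1. sound source ID
    float power;              // 2. power of the localized direction
    std::array<float, 3> x;   // 3. Cartesian coordinates on the unit sphere
    double remaining;         // 4. frames left before the source is terminated
    int tf_index;             // 5. index of the direction in the transfer function file
};

// Convert azimuth/elevation (degrees) to a point on the unit sphere.
// Assumed convention: x forward, y left, z up.
std::array<float, 3> unit_direction(float azimuth_deg, float elevation_deg) {
    const float pi = 3.14159265358979f;
    const float az = azimuth_deg * pi / 180.0f;
    const float el = elevation_deg * pi / 180.0f;
    return {std::cos(el) * std::cos(az),
            std::cos(el) * std::sin(az),
            std::sin(el)};
}

// Advance the countdown by one frame and report whether the source is still alive.
// Assumption: the value drops by 1 per undetected frame and is left unchanged
// on a detected frame.
bool update_duration(SourceSketch& s, bool detected) {
    assert(s.remaining >= 0.0);
    if (!detected) {
        s.remaining -= 1.0;
    }
    return s.remaining > 0.0;
}
```

For example, a source created with remaining set to 2.0 stays alive after one undetected frame and is reported as terminated after the second.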

——–

Problem

Read this to learn more about nodes that use Map$<\cdot,\cdot>$ for input/output, such as MFCCExtraction and SpeechRecognitionClient.

Solution

The Map data type is a set of pairs, each consisting of a key and the data that corresponds to that key. For example, in simultaneous speech recognition of three speakers, features are separated for each speaker. Each feature is then assigned a key based on the ID of that speaker's utterance. The key and the data are handled as a pair, which makes it possible to distinguish each speaker and each utterance.
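The key/data pairing described above can be sketched in standard C++ as follows. This is a rough analogy that uses std::map and std::shared_ptr in place of HARK's Map and ObjectRef. The names Feature, FeatureMap, put_feature, and find_feature are hypothetical, and the feature values are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for one separated utterance's feature vector in one frame.
using Feature = std::vector<float>;

// Rough analogue of Map<int, ObjectRef>: key = utterance ID, value = shared reference.
using FeatureMap = std::map<int, std::shared_ptr<Feature>>;

// Store a feature under its ID. In this sketch each ID appears at most once per frame.
void put_feature(FeatureMap& frame, int id, Feature f) {
    assert(frame.count(id) == 0);
    frame[id] = std::make_shared<Feature>(std::move(f));
}

// Look up the feature for one ID; returns nullptr when the ID is absent.
std::shared_ptr<Feature> find_feature(const FeatureMap& frame, int id) {
    auto it = frame.find(id);
    if (it == frame.end()) {
        return nullptr;
    }
    return it->second;
}

// Build one frame holding the features of three simultaneous utterances.
FeatureMap make_example_frame() {
    FeatureMap frame;
    put_feature(frame, 0, Feature{0.1f, 0.2f});
    put_feature(frame, 1, Feature{0.3f, 0.4f});
    put_feature(frame, 2, Feature{0.5f, 0.6f});
    return frame;
}
```

A downstream consumer can iterate over the map and use the key to keep each utterance's data separate, or call find_feature with a specific ID to retrieve a single utterance.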