4.4.2 Source

This type represents sound source localization information. In HARK, it is the object that each ObjectRef in a Map<int, ObjectRef> points to throughout the processing chain from sound source localization to sound source separation, namely LocalizeMUSIC (output), SourceTracker (input and output), and GHDSS (input). The Source type contains the following information.

  1. ID: int type. The ID of the sound source.

  2. Power: float type. The power in the localized direction.

  3. Coordinate: float array of length 3. The Cartesian coordinates of the point on the unit sphere that corresponds to the localized source direction.

  4. Duration: double type. The number of frames remaining before a localized sound source is terminated. The value decreases whenever the corresponding sound is not detected, and the sound source is terminated when it reaches 0. This is an internal variable used only in SourceTracker.

  5. TF index: int type. The serial number of the transfer function corresponding to the localized direction in the transfer-function file.
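
The five fields listed above can be pictured with the following minimal C++ sketch. It is not HARK's actual class definition: the struct name SourceSketch, its member names, the helper functions, the axis convention (x forward, y left, z up), and the decrement of one frame per missed detection are all assumptions made for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the Source type; names are illustrative, not HARK's.
struct SourceSketch {
    int id;                   // 1. ID of the sound source
    float power;              // 2. power in the localized direction
    std::array<float, 3> x;   // 3. Cartesian coordinates on the unit sphere
    double remaining;         // 4. frames remaining before termination
    int tf_index;             // 5. serial number in the transfer-function file
};

// Convert azimuth/elevation in degrees to a unit vector.
// Assumed axis convention: x forward, y left, z up.
inline std::array<float, 3> unit_direction(float azimuth_deg, float elevation_deg) {
    const float kPi = 3.14159265358979f;
    const float az = azimuth_deg * kPi / 180.0f;
    const float el = elevation_deg * kPi / 180.0f;
    return {std::cos(el) * std::cos(az),
            std::cos(el) * std::sin(az),
            std::sin(el)};
}

// Illustrative countdown for a frame in which the source was NOT detected.
// Assumption: the counter drops by one frame per miss.
// Returns true while the source is still alive, false once it is terminated.
inline bool update_missed_frame(SourceSketch& s) {
    assert(s.remaining >= 0.0);
    if (s.remaining > 0.0) {
        s.remaining -= 1.0;
    }
    return s.remaining > 0.0;
}
```

A source created with `SourceSketch s{0, 30.0f, unit_direction(0.0f, 0.0f), 2.0, 5};` points straight ahead under the assumed convention, and two consecutive calls to `update_missed_frame(s)` bring its counter to 0, at which point a tracker would drop it.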

——–

Problem

Read this section if you want to understand the data type (Map<·, ·>) used for the inputs and outputs of nodes such as MFCCExtraction or SpeechRecognitionClient.

Solution

The Map type is a collection of keys and the data associated with each key. For example, when recognizing three simultaneous speakers, the features used for speech recognition must be kept separate for each speaker. The key is therefore an ID that identifies which of the three speakers a feature belongs to and which of that speaker's utterances it comes from. Treating each key and its data as a pair keeps speakers and utterances distinct.
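
The key-to-data idea can be sketched in plain C++ with std::map standing in for HARK's Map type. This is only an illustration: the alias FeatureMap, the helper functions, and the feature values are hypothetical, and in HARK itself the data would be held behind an ObjectRef rather than stored directly as a vector.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical alias: per-frame features keyed by source (speaker) ID.
using FeatureMap = std::map<int, std::vector<float>>;

// Return a pointer to the features stored under `id`, or nullptr if absent.
inline const std::vector<float>* find_features(const FeatureMap& m, int id) {
    const auto it = m.find(id);
    return it == m.end() ? nullptr : &it->second;
}

// Build one frame containing made-up features for three simultaneous speakers.
inline FeatureMap make_example_frame() {
    FeatureMap m;
    m[0] = {0.1f, 0.2f};  // speaker with ID 0
    m[1] = {0.3f, 0.4f};  // speaker with ID 1
    m[2] = {0.5f, 0.6f};  // speaker with ID 2
    return m;
}
```

Because each feature vector travels with its key, a downstream consumer can call `find_features(frame, 1)` to get the features of speaker 1 only, and receives a null pointer for an ID that is not present in the frame.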