HARK Document Version 3.4.0. (Revision: 9509) : TransferFunction

4.4.3 TransferFunction

TransferFunction is a data type that expresses transfer function information. In HARK, a sequence of nodes such as EstimateTF (output), LocalizeMUSIC (input), and GHDSS (input) uses this type to apply the estimated transfer function to localization and separation.

TransferFunction contains the following information:

Informationtype: string type. The kind of file. "transfer function" means that the file includes all information. "partial transfer function" means that the file includes only the updated information.
Source position: Vector<Position> type. Source localization information for the transfer function. When Informationtype is "partial transfer function", this parameter is empty.
Neighbor information for source positions: Vector<Neighbor> type. Neighbor information for the sources. When Informationtype is "partial transfer function", this parameter is empty.
Microphone position: Vector<Position> type. Microphone information for the transfer function. When Informationtype is "partial transfer function", this parameter is empty.
Transfer function for localization: Map<ID, Matrix<complex<float>>> type. The transfer function for localization.
Transfer function for separation: Map<ID, Matrix<complex<float>>> type. The transfer function for separation.
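
The difference between a full and a partial transfer function can be sketched as a small container. This is an illustration only, not HARK's actual class layout: the class name, the Python field names, and the `is_consistent` helper are all hypothetical, and a matrix is approximated by a nested list of complex numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical stand-in for Matrix<complex<float>>: a list of rows of complex numbers.
Matrix = List[List[complex]]

FULL = "transfer function"
PARTIAL = "partial transfer function"


@dataclass
class TransferFunctionSketch:
    """Illustrative container only; field names are assumptions, not HARK identifiers."""
    information_type: str
    source_positions: list = field(default_factory=list)
    neighbors: list = field(default_factory=list)
    microphone_positions: list = field(default_factory=list)
    localization_tf: Dict[int, Matrix] = field(default_factory=dict)  # keyed by ID
    separation_tf: Dict[int, Matrix] = field(default_factory=dict)    # keyed by ID

    def is_consistent(self) -> bool:
        """A partial file must carry no position or neighbor data."""
        if self.information_type == PARTIAL:
            return not (self.source_positions
                        or self.neighbors
                        or self.microphone_positions)
        return self.information_type == FULL


# A partial update that only carries a new localization matrix for ID 3.
partial = TransferFunctionSketch(
    information_type=PARTIAL,
    localization_tf={3: [[1 + 0j, 0.5 - 0.5j]]},
)
print(partial.is_consistent())
```

The check only encodes the rule listed above (position and neighbor entries are empty for a partial file); it makes no claim about how HARK itself validates the data.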

4.4.3.1 Position type

Position type, which expresses the microphone and source positions of the transfer function, contains the following information:

ID: int type. ID of the position.
Coordinate type: Coordinate type. The coordinate type.
Coordinate: float type. Coordinate of the position.
Path: string type. Path of the wave file.
Matrix data: Matrix<complex<float>> type. Matrix data. This parameter is not used.
Enable the channel set information: int type. Enable the channel set information. This parameter is not used.
Channel set information: Vector<int> type. Channel set information. This parameter is not used.

4.4.3.2 Neighbor type

Neighbor type, which expresses the neighbor information of the source, contains the following information:

ID: Vector<int> type. IDs of the positions that have neighbors.
Neighbor( ID ): Vector<Vector<int>> type. Neighbor information by ID.
Neighbor( Position ): Vector<Vector<Position>> type. Neighbor information by Position type.
Algorithm: NeighborAlgorithm type. The search algorithm for the neighbors.
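
One way to read the ID and Neighbor( ID ) fields is as two parallel vectors. The sketch below assumes that the i-th entry of the neighbor list belongs to the i-th ID; the source does not state this alignment, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NeighborSketch:
    """Illustrative only; assumes ids[i] owns neighbor_ids[i]."""
    ids: List[int]                 # "ID": IDs that have neighbors
    neighbor_ids: List[List[int]]  # "Neighbor( ID )": one list of neighbor IDs per entry

    def neighbors_of(self, source_id: int) -> List[int]:
        # list.index raises ValueError if the ID is unknown.
        index = self.ids.index(source_id)
        return self.neighbor_ids[index]


# Four sources arranged in a ring, each with its two adjacent sources as neighbors.
ring = NeighborSketch(
    ids=[0, 1, 2, 3],
    neighbor_ids=[[3, 1], [0, 2], [1, 3], [2, 0]],
)
print(ring.neighbors_of(2))  # [1, 3]
```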

4.4.3.3 Config type

Config type, which expresses the configuration information, contains the following information:

Comment: string type. The description of the file. Any string is acceptable.
Synchronous Average: int type. The number of repetitions of the signal used for transfer function measurement (TSP signal).
Path: string type. The path of the audio file for transfer function measurement (TSP signal).
Offset: int type. The offset during transfer function calculation.
Length: int type. The length, in samples, of the signal for transfer function measurement (TSP signal).
Begin index for peak search: int type. The start index when searching for the peak of the direct sound during transfer function calculation.
End index for peak search: int type. The end index when searching for the peak of the direct sound during transfer function calculation.
FFT length: int type. The length of the Fourier transform during transfer function calculation.
Sampling rate: int type. The sampling rate.
Signal max: int type. The maximum amplitude of the recorded signal for transfer function measurement.
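
The Config fields can be mirrored in a small record to make their units explicit. This is a sketch under assumptions: the class, the Python field names, the two helper methods, the file name `tsp.wav`, and every numeric value are illustrative and are not taken from HARK.

```python
from dataclasses import dataclass


@dataclass
class ConfigSketch:
    """Illustrative mirror of the Config fields; names are assumptions."""
    comment: str
    synchronous_average: int  # number of TSP repetitions
    path: str                 # audio file with the TSP signal
    offset: int
    length: int               # TSP length in samples
    peak_search_begin: int
    peak_search_end: int
    fft_length: int
    sampling_rate: int        # samples per second
    signal_max: int

    def tsp_duration_seconds(self) -> float:
        # Length is given in samples, so dividing by the rate yields seconds.
        return self.length / self.sampling_rate

    def peak_search_range_is_valid(self) -> bool:
        # The search window must be non-negative and ordered.
        return 0 <= self.peak_search_begin <= self.peak_search_end


# Example values chosen for illustration only.
cfg = ConfigSketch(
    comment="example measurement",
    synchronous_average=16,
    path="tsp.wav",
    offset=0,
    length=16384,
    peak_search_begin=1,
    peak_search_end=250,
    fft_length=512,
    sampling_rate=16000,
    signal_max=32767,
)
print(f"{cfg.tsp_duration_seconds():.3f} s")  # 1.024 s
```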

——–

Problem

Read this to learn more about nodes that use the Map type (Map<·, ·>) for input/output, such as MFCCExtraction and SpeechRecognitionClient.

Solution

The Map data type is a set of pairs, each consisting of a key and the data that corresponds to that key. For example, in simultaneous speech recognition of three speakers, features are separated for each speaker. Each feature is then assigned a key based on the speaker’s speech index ID. The key and data are handled as a pair, so that each speaker and each utterance can be distinguished.
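
The three-speaker case described above can be sketched with an ordinary dictionary. This is not HARK code: the variable name, the `describe` helper, and the feature values are made up for illustration, and a real feature vector would be much longer.

```python
from typing import Dict, List

# Keys: the ID assigned to each separated utterance.
# Values: the feature vector extracted for that utterance (toy values).
features_by_id: Dict[int, List[float]] = {
    0: [0.12, -0.40, 0.88],
    1: [0.05, 0.31, -0.27],
    2: [-0.63, 0.09, 0.44],
}


def describe(features: Dict[int, List[float]]) -> List[str]:
    """Return one summary line per key, in ascending key order."""
    return [
        f"ID {key}: {len(vector)} feature values"
        for key, vector in sorted(features.items())
    ]


for line in describe(features_by_id):
    print(line)
```

Because each feature vector travels together with its key, a downstream consumer can tell which utterance a result belongs to without relying on arrival order.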