HARK Document Version 3.2.0. (Revision: 9448) : MultiAudioStreamFromMic

6.1.2 MultiAudioStreamFromMic

6.1.2.1 Outline of the node

Takes multi-channel speech waveform data from multiple microphone arrays. This node is an enhanced version of AudioStreamFromMic that supports multiple devices. When data is received over a long period of time, MultiAudioStreamFromMic corrects the difference in the number of frames between the microphone devices.

6.1.2.2 Necessary file

No files are required.

6.1.2.3 Usage

When to use

This node is used when multi-channel speech waveform data from multiple microphone arrays serves as the input to a HARK system. Note that all microphone arrays must be the same model, or must have the same specifications.

Typical connection

Figure 6.7 shows a connection example of MultiAudioStreamFromMic.

$\includegraphics[width=0.8\textwidth ]{fig/modules/MultiAudioStreamFromMic}$

Figure 6.7: Example of the MultiAudioStreamFromMic in LOOP0

6.1.2.4 Input-output and property of the node

Table 6.3: Parameter list of MultiAudioStreamFromMic

Parameter name	Type	Default value	Unit	Description
LENGTH	`int`	512	[pt]	Frame length as a fundamental unit for processing.
ADVANCE	`int`	160	[pt]	Frame shift length.
CHANNEL_COUNT	`int`	8	[ch]	Number of microphone input channels of each device to use.
SAMPLING_RATE	`int`	16000	[Hz]	Sampling frequency of audio waveform data loaded.
DEVICETYPE	`string`	WS		Type of device to be used.
GAIN	`string`	0dB		Gain value used with RASP device.
DEVICE	`string`	/dev/null		A list of identification names required to access the devices.
FRAME_COUNT_SKEW_TOLERANCE	`float`	5.0	[sec]	Tolerated difference in the number of frames between devices, expressed in seconds.
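
To make the relationship between LENGTH, ADVANCE and SAMPLING_RATE concrete, the following minimal Python sketch computes the frame duration, the frame shift and the overlap between consecutive frames for the default values in Table 6.3. The helper names frame_timing and frame_bounds are hypothetical, and the frame layout (frame k starting at sample k × ADVANCE) is an assumption made for illustration, not a statement about the node's internal implementation.

```python
# Default values taken from Table 6.3.
LENGTH = 512           # samples per frame
ADVANCE = 160          # samples between consecutive frame starts
SAMPLING_RATE = 16000  # Hz


def frame_timing(length, advance, sampling_rate):
    """Return (frame duration [ms], frame shift [ms], overlap [samples])."""
    duration_ms = 1000.0 * length / sampling_rate
    shift_ms = 1000.0 * advance / sampling_rate
    overlap = length - advance
    return duration_ms, shift_ms, overlap


def frame_bounds(index, length, advance):
    """Sample range [start, stop) of frame `index` (assumed layout)."""
    start = index * advance
    return start, start + length


duration_ms, shift_ms, overlap = frame_timing(LENGTH, ADVANCE, SAMPLING_RATE)
print(duration_ms, shift_ms, overlap)
print(frame_bounds(2, LENGTH, ADVANCE))
```

With the defaults, each frame spans 32 ms, a new frame starts every 10 ms, and consecutive frames share 352 samples.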

Input

None.

Output

AUDIO0: : Matrix<float> type. Indexed, multichannel audio waveform data with rows as channels and columns as samples. The number of columns is equal to the parameter LENGTH. This output terminal is hidden by default. The output terminal corresponding to each device needs to be added manually.
NOT_EOF: : bool type. Indicates whether there is still waveform input to be processed. It is used as the ending flag when processing waveforms in a loop. When it is true, waveforms are still being loaded, and when it is false, reading is complete. This node outputs true continuously.

See Figure 6.8 for the steps to add output terminals to the node.

$\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output1}$
Step 1: Right-click MultiAudioStreamFromMic, then click Add Output

$\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output2}$
Step 2: Type AUDIO0 in the Name input field of Add Output, then click Add

$\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output3}$
Step 3: The output terminal, AUDIO0, is added to the node

$\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output4}$
Step 4: Repeat Steps 1 to 3 to add further output terminals as needed

Figure 6.8: Steps to add output terminals

Parameter

LENGTH, ADVANCE, CHANNEL_COUNT, SAMPLING_RATE, DEVICETYPE, GAIN: : Refer to the parameters of the AudioStreamFromMic node. These values are the same for every device.
DEVICE: : string type. A list of identification names required to access the devices. Specify the identification names of the devices separated by space characters. Each identification name has the same form as DEVICE of AudioStreamFromMic.
FRAME_COUNT_SKEW_TOLERANCE: : float type. Tolerated difference in the number of frames between devices, expressed in seconds.
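
As an illustration of the space-separated DEVICE format, the following Python sketch splits a DEVICE string and pairs each entry with an output terminal name. The function map_devices_to_terminals is hypothetical, the ALSA-style identifiers are example values only, and the numbering from AUDIO0 in list order is an assumption based on the terminal name shown in Figure 6.8.

```python
def map_devices_to_terminals(device_param):
    """Split a DEVICE string on whitespace and pair each entry with AUDIO<n>."""
    names = device_param.split()
    return {f"AUDIO{i}": name for i, name in enumerate(names)}


# Hypothetical device identifiers, used only for illustration.
terminals = map_devices_to_terminals("plughw:1,0 plughw:2,0")
for terminal, device in terminals.items():
    print(terminal, "<-", device)
```

Each terminal listed this way still has to be added to the node manually, as described for the Output section.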

6.1.2.5 Details of the node

This node is an enhanced version of AudioStreamFromMic that handles multiple devices. The devices listed in the DEVICE parameter, separated by blank spaces, correspond one by one to the AUDIO output terminals in ascending order of serial number, starting with AUDIO0 for the first device. The FRAME_COUNT_SKEW_TOLERANCE parameter specifies when the difference in the number of frames is corrected. When the difference between the maximum and the minimum amount of speech waveform data received from the devices reaches the value specified by FRAME_COUNT_SKEW_TOLERANCE, data is deleted from the device holding the maximum, by an amount equivalent to the value specified in the FRAME_COUNT_SKEW_TOLERANCE parameter.
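
The correction rule can be sketched in Python as follows. This is a simplified model under stated assumptions, not the node's actual implementation: it tracks only a frame count per device instead of buffered audio, the function names are hypothetical, and the conversion of the tolerance from seconds to frames (seconds × SAMPLING_RATE / ADVANCE) is assumed rather than taken from the HARK source.

```python
def tolerance_in_frames(tolerance_sec, sampling_rate, advance):
    """Convert a tolerance in seconds to a frame count (assumed conversion)."""
    return int(tolerance_sec * sampling_rate / advance)


def correct_skew(frame_counts, tolerance_frames):
    """Apply one correction step to per-device frame counts.

    If max - min has reached the tolerance, drop `tolerance_frames` frames
    from the device that is furthest ahead. Returns (new_counts, index of
    the corrected device or None when no correction was needed).
    """
    ahead = max(range(len(frame_counts)), key=lambda i: frame_counts[i])
    if frame_counts[ahead] - min(frame_counts) < tolerance_frames:
        return list(frame_counts), None
    corrected = list(frame_counts)
    corrected[ahead] -= tolerance_frames
    return corrected, ahead


# Defaults from Table 6.3: 5.0 s at 16000 Hz with ADVANCE = 160.
tol = tolerance_in_frames(5.0, 16000, 160)
print(tol)
print(correct_skew([10500, 10000], tol))  # device 0 has reached the tolerance
print(correct_skew([10499, 10000], tol))  # still within the tolerance
```

Under these assumptions the default tolerance corresponds to 500 frames, so a correction only takes place after the devices have drifted apart by a full 5 seconds of data.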