6.1.2 MultiAudioStreamFromMic

6.1.2.1 Outline of the node

Captures multi-channel speech waveform data from multiple microphone arrays. This node is an enhanced version of AudioStreamFromMic that supports multiple devices. MultiAudioStreamFromMic corrects the difference in the number of frames between microphone devices that builds up when data is received over a long period of time.

6.1.2.2 Necessary file

No files are required.

6.1.2.3 Usage

When to use

This node is used when multi-channel speech waveform data from multiple microphone arrays is the input to the HARK system. Note that all microphone arrays must be the same model or have the same specifications.

Typical connection

Figure 6.7 shows a connection example of MultiAudioStreamFromMic.

\includegraphics[width=0.8\textwidth ]{fig/modules/MultiAudioStreamFromMic}
Figure 6.7: Example of the MultiAudioStreamFromMic in LOOP0

6.1.2.4 Input-output and property of the node

Table 6.3: Parameter list of MultiAudioStreamFromMic 

Parameter name              Type    Default value  Unit   Description
LENGTH                      int     512            [pt]   Frame length as a fundamental unit for processing.
ADVANCE                     int     160            [pt]   Frame shift length.
CHANNEL_COUNT               int     8              [ch]   Number of microphone input channels of each device to use.
SAMPLING_RATE               int     16000          [Hz]   Sampling frequency of the audio waveform data loaded.
DEVICETYPE                  string  WS                    Type of device to be used.
GAIN                        string  0dB                   Gain value used with a RASP device.
DEVICE                      string  /dev/null             A list of identification names required to access the devices.
FRAME_COUNT_SKEW_TOLERANCE  float   5.0            [sec]  Tolerance, in seconds, for the difference in the number of frames between devices.
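As a worked example of the default values in Table 6.3, the following sketch converts LENGTH and ADVANCE into time units. The function name frame_timing is hypothetical and is not part of HARK. The conversion of FRAME_COUNT_SKEW_TOLERANCE into a number of frames by way of the frame shift is an assumption made for illustration; the source only states that the tolerance is given in seconds.

```python
# Default values taken from Table 6.3.
LENGTH = 512                      # [pt] samples per frame
ADVANCE = 160                     # [pt] samples between frame starts
SAMPLING_RATE = 16000             # [Hz]
FRAME_COUNT_SKEW_TOLERANCE = 5.0  # [sec]


def frame_timing(length, advance, sampling_rate, tolerance_sec):
    """Return (frame duration in ms, frame shift in ms, tolerance in frames)."""
    frame_duration_ms = 1000.0 * length / sampling_rate
    frame_shift_ms = 1000.0 * advance / sampling_rate
    # Assumption: one frame is produced every `advance` samples, so a
    # tolerance in seconds maps to this many frames.
    tolerance_frames = round(tolerance_sec * sampling_rate / advance)
    return frame_duration_ms, frame_shift_ms, tolerance_frames


timing = frame_timing(LENGTH, ADVANCE, SAMPLING_RATE, FRAME_COUNT_SKEW_TOLERANCE)
print(timing)  # (32.0, 10.0, 500)
```

With the defaults, each frame covers 32 ms of audio and a new frame starts every 10 ms.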

Input

None.

Output

AUDIO0

: Matrix<float>  type. Indexed, multichannel audio waveform data with rows as channels and columns as samples. The number of columns is equal to the parameter LENGTH. This output terminal is hidden by default, so the output terminal corresponding to each device needs to be added manually.

NOT_EOF

: bool  type. Indicates whether there is still waveform input to be processed. It is used as the termination flag when waveforms are processed in a loop. true means that waveforms are being loaded, and false means that reading is complete. This node outputs true continuously.

See Figure 6.8 for the steps to add output terminals to the node.

\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output1}
Step 1: Right-click MultiAudioStreamFromMic, then click Add Output
\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output2}
Step 2: Type AUDIO0 in the Name field of Add Output, then click Add
\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output3}
Step 3: The output terminal AUDIO0 is added to the node
\includegraphics[width=\linewidth ]{fig/modules/MultiAudioStreamFromMic_output4}
Step 4: Repeat Steps 1 to 3 to add further output terminals as needed
Figure 6.8: Steps to add output terminals

Parameter

LENGTH, ADVANCE, CHANNEL_COUNT, SAMPLING_RATE, DEVICETYPE, GAIN

: Refer to the parameters of the AudioStreamFromMic node. The same values apply to every device.

DEVICE

: string  type. A list of identification names required to access the devices. Specify the identification names of multiple devices separated by space characters. Each identification name in the list has the same format as DEVICE of AudioStreamFromMic.
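The space-separated format of DEVICE can be illustrated with a minimal sketch that splits the value and pairs each name with an output terminal. The function map_devices_to_terminals is hypothetical and not part of HARK, and the device names in the usage line are placeholder values chosen only for illustration. The sketch assumes that terminals are numbered from AUDIO0 in the order the devices are listed.

```python
def map_devices_to_terminals(device_param):
    """Pair each device name in a DEVICE string with an AUDIO<n> terminal."""
    # str.split() with no argument splits on any run of whitespace,
    # so repeated or trailing spaces do not produce empty names.
    names = device_param.split()
    # Assumption: terminals are numbered from 0 in listing order.
    return {f"AUDIO{i}": name for i, name in enumerate(names)}


# Placeholder device names, not taken from the HARK documentation.
mapping = map_devices_to_terminals("plughw:1,0 plughw:2,0")
for terminal, device in mapping.items():
    print(terminal, "<-", device)
```

Each terminal name produced this way still has to be added to the node by hand, because the output terminals are hidden by default.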

FRAME_COUNT_SKEW_TOLERANCE

: float  type. Tolerance, in seconds, for the difference in the number of frames between devices. When the difference reaches this value, it is corrected.

6.1.2.5 Details of the node

This node is an enhanced version of AudioStreamFromMic that handles multiple devices. The devices listed in the DEVICE parameter, separated by spaces, are assigned to the output terminals AUDIO0, AUDIO1, and so on, in ascending order of the terminal number. The FRAME_COUNT_SKEW_TOLERANCE parameter specifies when the difference in the number of frames between devices is corrected. When the difference between the maximum and the minimum amount of speech waveform data received from the devices reaches the value specified by FRAME_COUNT_SKEW_TOLERANCE, an amount of data equivalent to that value is deleted from the device that holds the maximum.
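The correction rule described above can be sketched as follows. This is one reading of the description, not the node's actual implementation: correct_skew is a hypothetical function, the per-device amounts are modeled as plain frame counts, and the tolerance is assumed to have already been converted from seconds to frames.

```python
def correct_skew(frame_counts, tolerance_frames):
    """Return the per-device frame counts after one correction step.

    frame_counts: number of frames received so far from each device.
    tolerance_frames: tolerance expressed in frames (assumed conversion).
    """
    counts = list(frame_counts)
    # Index of the device that has delivered the most data.
    lead = max(range(len(counts)), key=counts.__getitem__)
    # Correct only once the spread between devices reaches the tolerance.
    if counts[lead] - min(counts) >= tolerance_frames:
        # Discard a tolerance-sized amount from the leading device.
        counts[lead] -= tolerance_frames
    return counts


print(correct_skew([1500, 1000, 1200], 500))  # [1000, 1000, 1200]
print(correct_skew([1499, 1000, 1200], 500))  # [1499, 1000, 1200]
```

In the first call the spread equals the tolerance, so the leading device loses 500 frames. In the second call the spread is one frame short of the tolerance and nothing is discarded.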