6.2.4 CMMakerFromFFT

Module Overview

Generates the sound source correlation matrix at a fixed period from the multi-channel complex spectrum output by the MultiFFT  node.

Required Files

None.

Usage

In what case is the node used?

To suppress a specific sound source such as noise during sound source localization with the LocalizeMUSIC  node, a correlation matrix that contains the noise information must be prepared beforehand. This node generates a correlation matrix at a fixed period from the multi-channel complex spectrum output by the MultiFFT  node. Connecting the output of this node to the NOISECM input terminal of the LocalizeMUSIC  node suppresses that source, under the assumption that the signal observed during the preceding period is always noise.

Typical Examples

Figure 6.13 shows a usage example of the CMMakerFromFFT  node.

The INPUT terminal is connected to the complex spectrum of the input signal calculated by a MultiFFT  node. Its type is Matrix<complex<float> > . From this spectrum, the node calculates and outputs the correlation matrix between channels for each frequency bin. The output type is also Matrix<complex<float> > ; because the correlation matrices for all frequency bins form a three-dimensional complex array, they are converted to a two-dimensional complex array before being output.

\includegraphics[width=.6\textwidth ]{fig/modules/CMMakerFromFFT.eps}
Figure 6.13: Network Example using CMMakerFromFFT 

I/O and property setting of the node

Table 6.17: Parameter list of CMMakerFromFFT 

| Parameter | Type | Default | Unit | Description |
|---|---|---|---|---|
| NB_CHANNELS | int | 8 | | Number of channels $M$ of input signal |
| LENGTH | int | 512 | | Frame length $NFFT$ |
| PERIOD | int | 50 | | Number of average smoothed frames for the correlation matrix |
| ENABLE_DEBUG | bool | false | | ON/OFF of debugging information output |

Input

INPUT

Matrix<complex<float> >  type. The complex spectrum representation of the input signal, of size $M \times (NFFT/2 + 1)$.

Output

OUTPUT

Matrix<complex<float> >  type. A correlation matrix for each frequency bin. Each correlation matrix is an $M$-th order complex square matrix, and $NFFT/2 + 1$ of them are output. In the Matrix<complex<float> > , the rows correspond to frequency bins ($NFFT/2 + 1$ rows) and the columns hold the elements of the complex correlation matrix ($M \times M$ columns).
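The following is a minimal sketch of the output layout described above, written in plain Python rather than taken from the HARK source. The function name `flatten_correlation` is hypothetical, and the row-major ordering of the $M \times M$ elements within a row is an assumption, since the ordering is not stated here. With the default values (NB_CHANNELS = 8, LENGTH = 512), the same layout gives 257 rows and 64 columns.

```python
def flatten_correlation(R_per_bin):
    """Flatten a list of M x M matrices (one per frequency bin)
    into a 2-D array with one row of M*M values per bin.

    Assumption: elements are stored in row-major order within each row.
    """
    return [[value for row in R for value in row] for R in R_per_bin]


# Small illustrative sizes (not the node defaults).
M, NFFT = 2, 4
n_bins = NFFT // 2 + 1  # number of frequency bins

# Dummy data: the real part encodes the bin, the imaginary part the element index.
R_per_bin = [
    [[complex(k, i * M + j) for j in range(M)] for i in range(M)]
    for k in range(n_bins)
]

out = flatten_correlation(R_per_bin)
print(len(out), len(out[0]))  # rows = n_bins, columns = M * M
```

Element $(i, j)$ of the matrix for bin $k$ then sits at `out[k][i * M + j]` under the row-major assumption.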

Parameter

NB_CHANNELS

int  type. Default value is 8. Number of channels in the input signal. Equivalent to the order of the correlation matrix. Must match the number of channels used in the preceding stage.

LENGTH

int  type. Default value is 512. Number of FFT points used for the Fourier transform. Must match the number of FFT points used in the preceding stage.

PERIOD

int  type. Default value is 50. Specifies the number of frames that are averaged when calculating the correlation matrix. The node generates a correlation matrix for each frame from the complex spectrum of the input signal and outputs a new correlation matrix obtained by averaging over the number of frames specified in PERIOD. Between updates, the most recently calculated correlation matrix is output. Increasing this value stabilizes the correlation matrix but increases the calculation load.
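The update behavior described for PERIOD can be sketched as follows. This is an illustration in plain Python, not the node's implementation: the class name `PeriodicAverager` is hypothetical, and returning `None` before the first period has completed is an assumption, since the output during that interval is not specified here.

```python
class PeriodicAverager:
    """Average per-frame matrices over `period` frames and hold the
    latest average until the next one is ready."""

    def __init__(self, period):
        self.period = period
        self.count = 0
        self.acc = None     # running sum for the current period
        self.latest = None  # assumption: no output until one period completes

    def push(self, R):
        """Add one per-frame square matrix R and return the current output."""
        if self.acc is None:
            self.acc = [[0j] * len(R) for _ in R]
        for i, row in enumerate(R):
            for j, value in enumerate(row):
                self.acc[i][j] += value
        self.count += 1
        if self.count == self.period:
            # A full period has been accumulated: publish the average.
            self.latest = [[v / self.period for v in row] for row in self.acc]
            self.acc = None
            self.count = 0
        return self.latest


# 1 x 1 "matrices" keep the arithmetic easy to follow.
avg = PeriodicAverager(period=2)
outputs = [avg.push([[complex(v)]]) for v in (1, 3, 5, 7)]
print(outputs)
```

In this run the output changes only on every second frame and keeps its previous value in between, which matches the hold behavior described above.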

ENABLE_DEBUG

bool  type. Default value is false. When true, the frame number is output to the standard output while generating the correlation matrix.

Module Description

The complex spectrum of the input signal, output from a MultiFFT  node, is represented as follows.

  \begin{equation} \label{eqCMMakerFromFFT_X} \bm{X}(\omega,f) = [X_1(\omega,f), X_2(\omega,f), X_3(\omega,f), \cdots, X_M(\omega,f)]^T \end{equation}

Here, $\omega$ is the frequency bin number, $f$ is the frame number used in HARK, and $M$ is the number of input channels.

The correlation matrix of the input signal $\bm{X}(\omega,f)$ can be defined as follows for every frequency and frame.

  \begin{equation} \label{eqCMMakerFromFFT_R} \bm{R}(\omega,f) = \bm{X}(\omega,f)\bm{X}^*(\omega,f) \end{equation}
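As a small worked instance of the definition $\bm{R} = \bm{X}\bm{X}^*$ for a single frequency bin and frame, the sketch below uses plain Python complex numbers. The function name `correlation_matrix` and the sample values are illustrative only and are not part of HARK.

```python
def correlation_matrix(x):
    """Return the M x M matrix x x^H for one frequency bin.

    x is a list of M complex values, one per channel; element (i, j)
    of the result is x[i] times the complex conjugate of x[j].
    """
    return [[xi * xj.conjugate() for xj in x] for xi in x]


# Three channels, one frequency bin, one frame (made-up values).
x = [1 + 1j, 2 - 1j, 0.5j]
R = correlation_matrix(x)

for row in R:
    print(row)
```

The diagonal holds the per-channel power $|X_m|^2$, and the matrix is Hermitian, that is, element $(i, j)$ is the complex conjugate of element $(j, i)$.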

Here, $()^*$ denotes the complex conjugate transpose operator. $\bm{R}(\omega,f)$ can be used as it is in subsequent processing, but in practice HARK uses a time average, shown below, to obtain a stable correlation matrix.

  \begin{equation} \label{eqCMMakerFromFFT_Rn} \bm{R}'(\omega,f) = \frac{1}{{\rm PERIOD}}\sum_{i=0}^{{\rm PERIOD}-1}\bm{R}(\omega,f+i) \end{equation}

$\bm{R}'(\omega,f)$ is output from the OUTPUT terminal of the CMMakerFromFFT  node.
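The time average can be sketched for a single frequency bin as below. This is an illustration in plain Python with hypothetical helper names (`outer`, `averaged_correlation`, `det2`) and made-up two-channel data, not the node's code. It also shows one way to see why averaging helps: a single-frame matrix is an outer product and so has rank at most one (zero determinant in the 2 x 2 case), while the average over differing frames need not be.

```python
def outer(x):
    """Per-frame correlation matrix x x^H for one frequency bin."""
    return [[a * b.conjugate() for b in x] for a in x]


def averaged_correlation(frames, f, period):
    """Mean of the per-frame matrices for frames f .. f + period - 1."""
    m = len(frames[f])
    acc = [[0j] * m for _ in range(m)]
    for i in range(period):
        R = outer(frames[f + i])
        for r in range(m):
            for c in range(m):
                acc[r][c] += R[r][c]
    return [[v / period for v in row] for row in acc]


def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


# Two channels, two frames, one frequency bin (made-up values).
frames = [[1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]]

R_single = outer(frames[0])
R_avg = averaged_correlation(frames, f=0, period=2)

print(det2(R_single))  # singular single-frame matrix
print(det2(R_avg))     # averaged matrix is non-singular for this data
```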