6.1.5 HarkDataStreamSender

6.1.5.1 Outline of the node

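This node sends acoustic signal processing results to an external program by socket communication. The data that can be sent correspond to the input terminals listed in “Input-output and property of the node”.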

6.1.5.2 Necessary file

No files are required.

6.1.5.3 Usage

When to use

This node is used to send the above data to a system external to HARK using TCP/IP communication.

Typical connection

In the example in Figure 6.10, all input terminals are connected. It is also possible to leave input terminals open depending on the transmitted data. To learn about the relation between the connection of the input terminals and transmitted data, see “Details of the node”.

Figure 6.10: Connection example of HarkDataStreamSender

6.1.5.4 Input-output and property of the node

Table 6.6: Parameter list of HarkDataStreamSender

| Parameter name | Type | Default value | Unit | Description |
|---|---|---|---|---|
| HOST | string | localhost | | Host name or IP address of the server to which data is sent |
| PORT | int | 8890 | | Port number for outbound network communication |
| ADVANCE | int | 160 | [pt] | Shift length of a frame |
| BUFFER_SIZE | int | 512 | | Size of the float buffer for socket communication (BUFFER_SIZE * 1024 elements) |
| FRAMES_PER_SEND | int | 1 | [frm] | Interval of socket communication, in frames |
| TIMESTAMP_TYPE | string | GETTIMEOFDAY | | Type of time stamp attached to the sent data |
| SAMPLING_RATE | int | 16000 | [Hz] | Sampling frequency |
| DEBUG_PRINT | bool | false | | Enables or disables output of debugging information |
| SOCKET_ENABLE | bool | true | | Flag that determines whether the socket output is used |

Input

MIC_WAVE

: Matrix<float>  type. Acoustic signal (matrix of the number of channels $\times $ the STFT window length, holding the windowed signal of each channel)

MIC_FFT

: Matrix<complex<float> >  type. Frequency spectrum (The number of channels $\times $ spectrum of each channel)

SRC_INFO

: Vector<ObjectRef>  type. Source information on the source localization results of several sound sources

SRC_WAVE

: Map<int, ObjectRef>  type. A sound source ID and acoustic signal (Vector<float>  type) data pair.

SRC_FFT

: Map<int, ObjectRef>  type. A sound source ID and frequency spectrum (Vector<complex<float> >  type) data pair.

SRC_FEATURE

: Map<int, ObjectRef>  type. A sound source ID and acoustic feature (Vector<float>  type) data pair.

SRC_RELIABILITY

: Map<int, ObjectRef>  type. A sound source ID and mask vector (Vector<float>  type) data pair.

TEXT

:  type. Arbitrary texts.

MATRIX

: Matrix<float> or Matrix<complex<float> > type. Arbitrary matrices.

VECTOR

: Vector<float> or Vector<complex<float> > type. Arbitrary vectors.

TIMESTAMP

: TimeStamp type. Time stamp in the sent data.

Output

OUTPUT

: ObjectRef  type. Same output as the input.

Parameter

HOST

: string  type. Host name or IP address of the host to which data is transmitted. It is ignored when SOCKET_ENABLE is set to false.

PORT

: int  type. Port number. It is ignored when SOCKET_ENABLE is set to false.

ADVANCE

: int  type. Shift length of a frame. It must be equal to the value set in previous processing.

BUFFER_SIZE

: int  type. Buffer size secured for socket communication.

FRAMES_PER_SEND

: int  type. Interval of socket communication, in frames. With the default value of 1, data is sent every frame.

TIMESTAMP_TYPE

: string  type. Type of time stamp attached to the sent data. If TIMESTAMP_TYPE=GETTIMEOFDAY, the time obtained by gettimeofday is stamped. If TIMESTAMP_TYPE=CONSTANT_INCREMENT, the time stamp is advanced at every frame by the frame duration calculated from SAMPLING_RATE.

SAMPLING_RATE

: int  type. Sampling frequency of the input signal. This is valid only when TIMESTAMP_TYPE=CONSTANT_INCREMENT.

DEBUG_PRINT

: bool  type. Enables or disables debug output to standard output.

SOCKET_ENABLE

: bool  type. Data is transferred to the socket when true and not transferred when false.
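
As a rough illustration of TIMESTAMP_TYPE=CONSTANT_INCREMENT, the sketch below advances a (seconds, microseconds) time stamp by one frame per call. It assumes that one frame lasts ADVANCE / SAMPLING_RATE seconds. This section only states that the increment is calculated from SAMPLING_RATE, so the use of ADVANCE and the function name next_timestamp are assumptions and not the node's actual code.

```python
def next_timestamp(tv_sec, tv_usec, advance=160, sampling_rate=16000):
    """Advance a (tv_sec, tv_usec) time stamp by one frame shift.

    Assumption: one frame lasts advance / sampling_rate seconds.
    """
    frame_usec = advance * 1_000_000 // sampling_rate  # 10000 us with the defaults
    total_usec = tv_usec + frame_usec
    # Carry whole seconds out of the microsecond field.
    return tv_sec + total_usec // 1_000_000, total_usec % 1_000_000


stamp = next_timestamp(0, 995_000)
print(stamp)
```

With the default values the increment is 10 ms per frame, so the microsecond field rolls over into the seconds field every 100 frames.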

6.1.5.5 Details of the node

For HOST, specify the host name or IP address of the machine running the external program that receives the data. For PORT, specify the network port number used for data transmission. ADVANCE is the shift length of a frame and must be equal to the value set in the preceding processing.

BUFFER_SIZE is the size of the buffer reserved for socket communication. A float array of BUFFER_SIZE * 1024 elements is allocated at initialization, and it must be larger than the transmitted data. FRAMES_PER_SEND is the interval of socket communication in frames. The default value of 1, which sends data every frame, is sufficient in most cases. To reduce the amount of socket communication, increase this value.

TIMESTAMP_TYPE selects the time stamp attached to the sent data. SAMPLING_RATE is the sampling frequency of the input signal. DEBUG_PRINT specifies whether debug information is printed to standard output. It prints part of the transmitted data; see the “Debug” column in Table 6.13 for details. When SOCKET_ENABLE is set to false, data is not sent to external systems. This can be used to check the operation of a HARK network without running an external program.
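
The buffer requirement can be checked with simple arithmetic. The sketch below computes the buffer capacity in float elements for a given BUFFER_SIZE and compares it with the number of floats in one MIC_FFT frame (real part plus imaginary part). The channel count and the number of frequency bins are illustrative values, and how the node accounts for headers or for FRAMES_PER_SEND greater than 1 is not covered here.

```python
def buffer_capacity_floats(buffer_size=512):
    """Number of float elements reserved at initialization (BUFFER_SIZE * 1024)."""
    return buffer_size * 1024


def mic_fft_floats(nch, nbins):
    """Floats needed for one MIC_FFT frame: real part plus imaginary part."""
    return 2 * nch * nbins


capacity = buffer_capacity_floats()          # default BUFFER_SIZE of 512
needed = mic_fft_floats(nch=8, nbins=257)    # hypothetical 8-channel, 257-bin input
print(capacity, needed, needed < capacity)
```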

(B-1) Structure for data transmission

Data is transmitted frame by frame, and the data for each frame is divided into several parts. The structures defined for data transmission are listed below.

Table 6.7: Member of HD_Header

| Variable name | Type | Description |
|---|---|---|
| type | int | Bit flag that indicates the structure of the transmitted data. For relations between each bit and data to be transmitted, see Table 6.8. |
| advance | int | Shift length of a frame |
| count | int | Frame number of HARK |
| tv_sec | int64 | Time stamp of HARK in seconds |
| tv_usec | int64 | Time stamp of HARK in microseconds |
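
On the receiving side, the header fields of Table 6.7 could be unpacked roughly as follows. This is a minimal sketch: the byte order (little-endian), the 32-bit width of int, and the absence of padding between fields are assumptions, and the names HD_HEADER and parse_hd_header are hypothetical.

```python
import struct

# Assumed wire layout: little-endian, 32-bit int, 64-bit int64, no padding.
# If the sender writes a raw C struct, 4 padding bytes may precede tv_sec;
# in that case "<iii4xqq" would be the matching format string.
HD_HEADER = struct.Struct("<iiiqq")


def parse_hd_header(buf):
    """Unpack the fields of Table 6.7 from the start of buf."""
    type_, advance, count, tv_sec, tv_usec = HD_HEADER.unpack_from(buf, 0)
    return {"type": type_, "advance": advance, "count": count,
            "tv_sec": tv_sec, "tv_usec": tv_usec}


# Build a sample header locally instead of reading it from a socket.
sample = HD_HEADER.pack(0x0005, 160, 42, 1700000000, 250000)
header = parse_hd_header(sample)
print(header["count"], header["tv_sec"], header["tv_usec"])
```

Checking the received size against the actual sender before relying on this layout is advisable, since struct padding depends on the compiler and platform.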

Table 6.8: Bits of the type field of HD_Header and the corresponding transmitted data

| Bit | Related input terminal | Transmitted data |
|---|---|---|
| First bit | MIC_WAVE | Acoustic signal |
| Second bit | MIC_FFT | Frequency spectrum |
| Third bit | SRC_INFO | Source information of source localization results |
| Fourth bit | SRC_INFO, SRC_WAVE | Source information of source localization results + acoustic signal for each sound source ID |
| Fifth bit | SRC_INFO, SRC_FFT | Source information of source localization results + frequency spectrum for each sound source ID |
| Sixth bit | SRC_INFO, SRC_FEATURE | Source information of source localization results + acoustic feature for each sound source ID |
| Seventh bit | SRC_INFO, SRC_RELIABILITY | Source information of source localization results + missing feature mask for each sound source ID |
| Eighth bit | TEXT | Arbitrary texts |
| Ninth bit | MATRIX | Arbitrary matrices |
| Tenth bit | VECTOR | Arbitrary vectors |
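
A receiver could decode the type field along the following lines. The sketch assumes that the first entry of Table 6.8 corresponds to the least significant bit (value 0x0001) and that later entries occupy successively higher bits; the names TERMINALS and decode_type are hypothetical.

```python
# Terminal names in the order of Table 6.8.
# Assumption: the first entry maps to the least significant bit (0x0001).
TERMINALS = ["MIC_WAVE", "MIC_FFT", "SRC_INFO", "SRC_WAVE", "SRC_FFT",
             "SRC_FEATURE", "SRC_RELIABILITY", "TEXT", "MATRIX", "VECTOR"]


def decode_type(type_flags):
    """Return the names of the terminals whose bit is set in HD_Header.type."""
    return [name for bit, name in enumerate(TERMINALS) if type_flags & (1 << bit)]


print(decode_type(0b0000100101))
```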

In HarkDataStreamSender, the transmitted data depends on which input terminals are connected. On the receiving end, the transmitted data can be interpreted according to the type field. The structures used are given below. Further details on the transmitted data are given in (B-2).

Table 6.9: Member of HDH_MicData

| Variable name | Type | Description |
|---|---|---|
| nch | int | Number of microphone channels |
| length | int | Data length (number of columns of the two-dimensional array to be transmitted) |
| data_bytes | int | Number of bytes of data to be transmitted. In the case of a float type matrix, nch * length * sizeof(float). |

Table 6.10: Member of HDH_SrcInfo

| Variable name | Type | Description |
|---|---|---|
| src_id | int | Sound source ID |
| x[3] | float | Three-dimensional position of sound source |
| power | float | Power of the MUSIC spectrum calculated in LocalizeMUSIC |

Table 6.11: Member of HDH_SrcData

| Variable name | Type | Description |
|---|---|---|
| length | int | Data length (number of one-dimensional array elements to be transmitted) |
| data_bytes | int | Number of bytes of transmitted data. In the case of a float type vector, length * sizeof(float). |
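
The three structures above could be described to a receiver as fixed binary layouts. The sketch below does this and reads one HDH_MicData header followed by its float matrix. The little-endian byte order, 4-byte int and float, lack of padding, and row-major (channel by channel) ordering of the matrix are all assumptions, and the names HDH_MIC_DATA, HDH_SRC_INFO, HDH_SRC_DATA and read_mic_block are hypothetical.

```python
import struct

# Assumed layouts: little-endian, 32-bit int and float, no padding.
HDH_MIC_DATA = struct.Struct("<iii")    # nch, length, data_bytes (Table 6.9)
HDH_SRC_INFO = struct.Struct("<i3ff")   # src_id, x[3], power (Table 6.10)
HDH_SRC_DATA = struct.Struct("<ii")     # length, data_bytes (Table 6.11)


def read_mic_block(buf, offset=0):
    """Read an HDH_MicData header followed by its nch x length float matrix."""
    nch, length, data_bytes = HDH_MIC_DATA.unpack_from(buf, offset)
    offset += HDH_MIC_DATA.size
    if data_bytes != nch * length * 4:
        raise ValueError("data_bytes does not match nch * length * sizeof(float)")
    flat = struct.unpack_from(f"<{nch * length}f", buf, offset)
    # Assumption: the matrix is stored row by row, one row per channel.
    rows = [list(flat[r * length:(r + 1) * length]) for r in range(nch)]
    return rows, offset + data_bytes


# A locally built 2-channel, 3-sample block stands in for received data.
payload = HDH_MIC_DATA.pack(2, 3, 2 * 3 * 4) + struct.pack("<6f", 1, 2, 3, 4, 5, 6)
matrix, end = read_mic_block(payload)
print(matrix, end)
```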

(B-2) Transmitted data

Table 6.12: Data list in order of sending and connection input terminal (The data with the $\circ $ symbol is transmitted. $\circ ^*$ indicates the data that are not transmitted when the SRC_INFO terminal is not connected)

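The first three columns give the details of the transmitted data. The remaining columns show, for each input terminal, whether the data is transmitted.

| | Type | Size | MIC_WAVE | MIC_FFT | SRC_INFO | SRC_WAVE | SRC_FFT | SRC_FEATURE | SRC_RELIABILITY | TEXT | MATRIX | VECTOR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (a) | HD_Header | sizeof(HD_Header) | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ |
| (b) | HDH_MicData | sizeof(HDH_MicData) | $\circ $ | | | | | | | | | |
| (c) | float[] | HDH_MicData.data_bytes | $\circ $ | | | | | | | | | |
| (d) | HDH_MicData | sizeof(HDH_MicData) | | $\circ $ | | | | | | | | |
| (e) | float[] | HDH_MicData.data_bytes | | $\circ $ | | | | | | | | |
| (f) | float[] | HDH_MicData.data_bytes | | $\circ $ | | | | | | | | |
| (g) | int | 1 * sizeof(int) | | | $\circ $ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | | | |
| (h) | HDH_SrcInfo | sizeof(HDH_SrcInfo) | | | $\circ $ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | | | |
| (i) | HDH_SrcData | sizeof(HDH_SrcData) | | | | $\circ ^*$ | | | | | | |
| (j) | short int[] | HDH_SrcData.data_bytes | | | | $\circ ^*$ | | | | | | |
| (k) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | $\circ ^*$ | | | | | |
| (l) | float[] | HDH_SrcData.data_bytes | | | | | $\circ ^*$ | | | | | |
| (m) | float[] | HDH_SrcData.data_bytes | | | | | $\circ ^*$ | | | | | |
| (n) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | $\circ ^*$ | | | | |
| (o) | float[] | HDH_SrcData.data_bytes | | | | | | $\circ ^*$ | | | | |
| (p) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | | $\circ ^*$ | | | |
| (q) | float[] | HDH_SrcData.data_bytes | | | | | | | $\circ ^*$ | | | |
| (r) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | | | $\circ ^*$ | | |
| (s) | char[] | HDH_SrcData.data_bytes | | | | | | | | $\circ ^*$ | | |
| (t) | HDH_MicData | sizeof(HDH_MicData) | | | | | | | | | $\circ ^*$ | |
| (u) | float[] | HDH_MicData.data_bytes | | | | | | | | | $\circ ^*$ | |
| (v) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | | | | | $\circ ^*$ |
| (w) | float[] | HDH_SrcData.data_bytes | | | | | | | | | | $\circ ^*$ |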

Table 6.13: Details of the transmitted data
 

| | Description | Debug |
|---|---|---|
| (a) | Transmitted data header. See Table 6.7. | $\circ $ |
| (b) | Structure of acoustic signals (number of microphones, frame length, byte count for transmission). See Table 6.9. | $\circ $ |
| (c) | Acoustic signal (float type matrix of number of microphones $\times $ frame length) | |
| (d) | Structure of frequency spectra (number of microphones, number of frequency bins, byte count for transmission). See Table 6.9. | $\circ $ |
| (e) | Real part of the frequency spectrum (float type matrix of number of microphones $\times $ number of frequency bins) | |
| (f) | Imaginary part of the frequency spectrum (float type matrix of number of microphones $\times $ number of frequency bins) | |
| (g) | Number of sound sources detected | $\circ $ |
| (h) | Source information of a source localization result. See Table 6.10. | $\circ $ |
| (i) | Structure of the acoustic signal for each sound source ID (frame length, byte count for transmission). See Table 6.11. | $\circ $ |
| (j) | Acoustic signal for each sound source ID (float type linear array of frame length) | |
| (k) | Structure of the frequency spectrum for each sound source ID (number of frequency bins, byte count for transmission). See Table 6.11. | $\circ $ |
| (l) | Real part of the frequency spectrum for each sound source ID (float type linear array of number of frequency bins) | |
| (m) | Imaginary part of the frequency spectrum for each sound source ID (float type linear array of number of frequency bins) | |
| (n) | Structure of the acoustic feature for each sound source ID (number of feature dimensions, byte count for transmission). See Table 6.11. | $\circ $ |
| (o) | Acoustic feature for each sound source ID (float type linear array of number of feature dimensions) | |
| (p) | Structure of the MFM (missing feature mask) for each sound source ID (number of feature dimensions, byte count for transmission). See Table 6.11. | $\circ $ |
| (q) | MFM for each sound source ID (float type linear array of number of feature dimensions) | |
| (r) | Text information (number of characters, byte count for transmission). See Table 6.11. | $\circ $ |
| (s) | Text (char type linear array of number of characters) | |
| (t) | Structure of the sent matrix (number of rows and columns and byte count). See Table 6.9. | $\circ $ |
| (u) | Data of the matrix (float type matrix) | |
| (v) | Structure of the sent vector (size of the vector and byte count). See Table 6.11. | $\circ $ |
| (w) | Data of the vector (float type linear array) | |

Transmitted data is divided for each frame as shown in (a)-(w) of Tables 6.12 and 6.13. Table 6.12 shows the relation between the transmitted data (a)-(w) and the input terminal connected, and Table 6.13 describes the transmitted data.

  calculate{
    Send (a)
    IF MIC_WAVE is connected
      Send (b)
      Send (c)
    ENDIF
    IF MIC_FFT is connected
      Send (d)
      Send (e)
      Send (f)
    ENDIF
    IF SRC_INFO is connected
      Send (g)
      (Let src_num be the number of sound sources.)
      FOR i = 1 to src_num (This loop runs once per sound source ID.)
        Send (h)
        IF SRC_WAVE is connected
          Send (i)
          Send (j)
        ENDIF
        IF SRC_FFT is connected
          Send (k)
          Send (l)
          Send (m)
        ENDIF
        IF SRC_FEATURE is connected
          Send (n)
          Send (o)
        ENDIF
        IF SRC_RELIABILITY is connected
          Send (p)
          Send (q)
        ENDIF
      ENDFOR
    ENDIF
    IF TEXT is connected
      Send (r)
      Send (s)
    ENDIF
    IF MATRIX is connected
      Send (t)
      Send (u)
    ENDIF
    IF VECTOR is connected
      Send (v)
      Send (w)
    ENDIF
  }
  
(B-3) Transmission algorithm

The pseudocode above shows the part of the algorithm that runs in a loop while the HARK network file is executed. Here, (a)-(w) in the code correspond to (a)-(w) in Tables 6.12 and 6.13.
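
A receiving program can mirror this send order. The following is a minimal sketch of such a parser for one frame held in memory, not a reference implementation. It assumes little-endian packed structures without padding, that the first bit of the type field is the least significant bit, and that MATRIX and VECTOR carry real float data. Because Tables 6.12 and 6.13 describe the element type of (j) differently, the per-source acoustic signal is kept as raw bytes. All names (Reader, parse_frame, and the constants) are hypothetical.

```python
import struct

HD_HEADER = struct.Struct("<iiiqq")    # (a) type, advance, count, tv_sec, tv_usec
HDH_MIC = struct.Struct("<iii")        # nch, length, data_bytes
HDH_SRC_INFO = struct.Struct("<i3ff")  # src_id, x[3], power
HDH_SRC = struct.Struct("<ii")         # length, data_bytes
INT = struct.Struct("<i")
# Assumed bit assignment: first entry of Table 6.8 = least significant bit.
(MIC_WAVE, MIC_FFT, SRC_INFO, SRC_WAVE, SRC_FFT,
 SRC_FEATURE, SRC_RELIABILITY, TEXT, MATRIX, VECTOR) = (1 << b for b in range(10))


class Reader:
    """Cursor over the bytes of one received frame."""

    def __init__(self, buf):
        self.buf, self.pos = buf, 0

    def take(self, layout):
        values = layout.unpack_from(self.buf, self.pos)
        self.pos += layout.size
        return values

    def raw(self, nbytes):
        chunk = self.buf[self.pos:self.pos + nbytes]
        self.pos += nbytes
        return chunk

    def floats(self, nbytes):
        return list(struct.unpack(f"<{nbytes // 4}f", self.raw(nbytes)))


def parse_frame(buf):
    r = Reader(buf)
    flags, advance, count, tv_sec, tv_usec = r.take(HD_HEADER)        # (a)
    frame = {"count": count, "advance": advance,
             "time": (tv_sec, tv_usec), "sources": []}
    if flags & MIC_WAVE:
        _, _, nbytes = r.take(HDH_MIC)                                # (b)
        frame["mic_wave"] = r.floats(nbytes)                          # (c)
    if flags & MIC_FFT:
        _, _, nbytes = r.take(HDH_MIC)                                # (d)
        frame["mic_fft"] = (r.floats(nbytes), r.floats(nbytes))       # (e), (f)
    if flags & SRC_INFO:
        (src_num,) = r.take(INT)                                      # (g)
        for _ in range(src_num):
            src_id, x, y, z, power = r.take(HDH_SRC_INFO)             # (h)
            src = {"id": src_id, "x": (x, y, z), "power": power}
            if flags & SRC_WAVE:
                _, nbytes = r.take(HDH_SRC)                           # (i)
                src["wave"] = r.raw(nbytes)                           # (j), raw bytes
            if flags & SRC_FFT:
                _, nbytes = r.take(HDH_SRC)                           # (k)
                src["fft"] = (r.floats(nbytes), r.floats(nbytes))     # (l), (m)
            if flags & SRC_FEATURE:
                _, nbytes = r.take(HDH_SRC)                           # (n)
                src["feature"] = r.floats(nbytes)                     # (o)
            if flags & SRC_RELIABILITY:
                _, nbytes = r.take(HDH_SRC)                           # (p)
                src["reliability"] = r.floats(nbytes)                 # (q)
            frame["sources"].append(src)
    if flags & TEXT:
        _, nbytes = r.take(HDH_SRC)                                   # (r)
        frame["text"] = r.raw(nbytes)                                 # (s)
    if flags & MATRIX:
        _, _, nbytes = r.take(HDH_MIC)                                # (t)
        frame["matrix"] = r.floats(nbytes)                            # (u)
    if flags & VECTOR:
        _, nbytes = r.take(HDH_SRC)                                   # (v)
        frame["vector"] = r.floats(nbytes)                            # (w)
    return frame


# Locally built frame: one source with SRC_INFO and a 2-dimensional SRC_FEATURE.
demo = (HD_HEADER.pack(SRC_INFO | SRC_FEATURE, 160, 7, 0, 0)
        + INT.pack(1)
        + HDH_SRC_INFO.pack(3, 1.0, 0.0, 0.0, 30.0)
        + HDH_SRC.pack(2, 8) + struct.pack("<2f", 0.5, 0.25))
print(parse_frame(demo))
```

In a real receiver the bytes would come from a TCP connection, and the reader would need to wait until each structure and its payload have fully arrived before unpacking.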