HARK Document Version 2.1.0. (Revision: 7522) : SpeechRecognitionClient

6.6.1 SpeechRecognitionClient

6.6.1.1 Outline of the node

This node sends acoustic features to a speech recognition node via a network connection.

6.6.1.2 Necessary file

No files are required.

When to use

This node is used to send acoustic features to software outside HARK. For example, it sends them to the large vocabulary continuous speech recognition software Julius $^{(1)}$ to perform speech recognition.

Typical connection

Figure 6.81: Connection example of SpeechRecognitionClient

6.6.1.3 Input-output and property of the node

Table 6.70: Parameter list of SpeechRecognitionClient

Parameter name	Type	Default value	Unit	Description
MFM_ENABLED	`bool`	`true`		Select whether or not to send out missing feature masks
HOST	`string`	127.0.0.1		Host name or IP address of the server on which Julius/Julian is running
PORT	`int`	5530		Network port number on the server to which data is sent
SOCKET_ENABLED	`bool`	`true`		The flag that determines whether or not to output to the socket

Input

FEATURES: Map<int, ObjectRef> type. A data pair consisting of a sound source ID and a feature vector of type Vector<float>.
MASKS: Map<int, ObjectRef> type. A data pair consisting of a sound source ID and a mask vector of type Vector<float>.
SOURCES: Vector<ObjectRef> type.

Output

OUTPUT: Vector<ObjectRef> type.

Parameter

MFM_ENABLED: bool type. When true is selected, MASKS is transmitted. When false is selected, the MASKS input is ignored and a mask of all 1's is transmitted instead.
HOST: string type. Host name or IP address of the server to which acoustic parameters are transmitted. It is not used when SOCKET_ENABLED is set to false.
PORT: int type. The network port number to which acoustic parameters are transmitted. It is not used when SOCKET_ENABLED is set to false.
SOCKET_ENABLED: bool type. When true, acoustic parameters are transmitted to the socket; when false, they are not transmitted.
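
The effect of MFM_ENABLED on the transmitted masks can be illustrated with a minimal Python sketch. This is not HARK's implementation: the function name `select_masks` and the use of plain dictionaries and lists in place of Map<int, ObjectRef> and Vector<float> are assumptions made only for illustration.

```python
from typing import Dict, List, Optional


def select_masks(
    features: Dict[int, List[float]],
    masks: Optional[Dict[int, List[float]]],
    mfm_enabled: bool,
) -> Dict[int, List[float]]:
    """Return the mask vector to transmit for each sound source ID.

    Hypothetical helper: mirrors the documented behavior of MFM_ENABLED,
    not the actual HARK source code.
    """
    selected: Dict[int, List[float]] = {}
    for source_id, feature in features.items():
        if mfm_enabled:
            # MFM_ENABLED = true: pass the MASKS input through unchanged.
            selected[source_id] = list(masks[source_id])
        else:
            # MFM_ENABLED = false: ignore MASKS and mark every feature
            # dimension as reliable (mask value 1).
            selected[source_id] = [1.0] * len(feature)
    return selected
```

With one source whose feature vector has three elements, `select_masks({0: [0.5, 0.2, 0.1]}, {0: [1.0, 0.0, 1.0]}, False)` yields `{0: [1.0, 1.0, 1.0]}`, while passing `True` returns the input mask as is.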

6.6.1.4 Details of the node

When both MFM_ENABLED and SOCKET_ENABLED are set to true, this node sends acoustic features and mask vectors to the speech recognition module via the network port. When false is selected for MFM_ENABLED, normal speech recognition not based on the missing feature theory is performed. In practice, mask vectors are still sent, but with all elements set to 1; in other words, all acoustic features are treated as reliable. When false is selected for SOCKET_ENABLED, the features are not sent to the speech recognition node. This setting is used to check the HARK network file without running the external speech recognition engine. For HOST, specify the IP address of the host on which the external program that receives the vectors runs. For PORT, specify the network port number to which the vectors are sent.
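
The role of SOCKET_ENABLED, HOST and PORT can be sketched in Python as follows. The byte layout used here (little-endian source ID, vector length, then float32 feature and mask values) is purely an assumption for illustration; the actual wire format exchanged between HARK and the recognizer is not described in this section. The names `pack_frame` and `send_frame` are hypothetical, and opening one connection per call is a simplification.

```python
import socket
import struct
from typing import Sequence


def pack_frame(source_id: int, feature: Sequence[float], mask: Sequence[float]) -> bytes:
    """Serialize one feature/mask pair (illustrative layout, not HARK's)."""
    if len(feature) != len(mask):
        raise ValueError("feature and mask must have the same length")
    n = len(feature)
    # Two int32 header fields, then n float32 features and n float32 masks.
    return struct.pack(f"<ii{n}f{n}f", source_id, n, *feature, *mask)


def send_frame(payload: bytes, host: str = "127.0.0.1", port: int = 5530,
               socket_enabled: bool = True) -> bool:
    """Send payload to host:port; return False if sending is disabled."""
    if not socket_enabled:
        # SOCKET_ENABLED = false: nothing is sent, so HOST and PORT are
        # unused and no recognition server needs to be running.
        return False
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
    return True
```

Calling `send_frame(payload, socket_enabled=False)` returns immediately without opening a connection, which corresponds to checking a network file without the external speech recognition engine.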

6.6.1.5 References:

(1) http://julius.sourceforge.jp/en_index.php