10.2 Selecting the threshold for Missing Feature Mask

Problem

Read this section if you do not know how to set the parameters of the MFMGeneration module.

Solution

MFMGeneration has a parameter, THRESHOLD, that affects speech recognition performance. If THRESHOLD is set to 0.0, masking is effectively disabled and recognition does not rely on missing feature theory. If it is set to 1.0, every feature is masked, so recognition is performed without any features. Find a suitable value experimentally: run recognition repeatedly while changing THRESHOLD in increments of 0.1, and keep the value that gives the best result.
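The search described above can be sketched as a simple sweep. This is only an illustration, not part of HARK: `sweep_thresholds` and `toy_accuracy` are hypothetical names, and `toy_accuracy` is a made-up placeholder with no relation to real recognition results. In practice the `evaluate` callback would run your recognition experiment with MFMGeneration's THRESHOLD set to the given value and return a score such as word accuracy.

```python
def sweep_thresholds(evaluate, step=0.1):
    """Try THRESHOLD = 0.0, step, ..., 1.0 and return (best value, all scores).

    `evaluate` is a caller-supplied function (hypothetical here) that takes a
    threshold and returns a score where higher is better.
    """
    n = round(1.0 / step)
    # Round each candidate to avoid float noise such as 0.30000000000000004.
    candidates = [round(i * step, 10) for i in range(n + 1)]
    scores = {t: evaluate(t) for t in candidates}
    best = max(scores, key=scores.get)
    return best, scores


def toy_accuracy(threshold):
    # Placeholder only: stands in for a real recognition run.
    return -(threshold - 0.3) ** 2


best, scores = sweep_thresholds(toy_accuracy)
print("best THRESHOLD:", best)
```

Because the result depends on the recording conditions and the acoustic model, the sweep should be repeated on data that resembles the target environment.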

Discussion

MFMGeneration computes the mask with the following equation. The reliability is compared against THRESHOLD to produce a hard mask, that is, a mask that takes only the two values 0.0 (unreliable) and 1.0 (reliable).

$\displaystyle m(f,p) = \left\{ \begin{array}{ll} 1.0, & r(p) > \mathrm{THRESHOLD} \\ 0.0, & r(p) \leq \mathrm{THRESHOLD} \end{array} \right. \qquad (1)$

where $f$, $p$, $m(f, p)$, and $r(p)$ denote the frame index, the feature dimension, the mask, and the reliability of the feature, respectively.
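As a minimal sketch of Equation (1), the thresholding for one frame can be written as follows. The function name `hard_mask` and the sample reliability values are assumptions made for illustration; they are not taken from the MFMGeneration implementation.

```python
def hard_mask(reliability, threshold):
    """Apply Equation (1) to one frame.

    `reliability` holds r(p) for each feature dimension p. The result is 1.0
    (reliable) where r(p) > threshold and 0.0 (unreliable) otherwise.
    """
    return [1.0 if r > threshold else 0.0 for r in reliability]


# Illustrative reliability values for a five-dimensional feature.
reliability = [0.05, 0.2, 0.5, 0.9, 1.0]

mask_low = hard_mask(reliability, 0.0)   # every value exceeds 0.0
mask_mid = hard_mask(reliability, 0.2)   # 0.2 itself is not above the threshold
mask_high = hard_mask(reliability, 1.0)  # nothing exceeds 1.0

print(mask_low)
print(mask_mid)
print(mask_high)
```

Note that the comparison is strict, so a feature whose reliability equals THRESHOLD is treated as unreliable.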