5 File Formats

This chapter describes the types of files used in HARK and their formats. Previous versions of HARK used many different and complex file formats, which made them difficult to understand as a whole. Starting with HARK 2.1, they are simplified into three main formats plus standard formats.

The design policy of the new format is twofold:

  1. Use fewer HARK-specific formats and more standard formats.

  2. Provide rich file I/O APIs.

Following this policy, we define a simple binary format for matrix representation, **Matrix binary**, and use combinations of standard formats such as XML and Zip for the other formats. We also provide a file I/O library, **libharkio3**, so that developers can write file I/O code easily.

HARK mainly uses the following three formats:

  1. XML: This format is used for files that represent positions. The extension is .xml

  2. Matrix binary: This format is used for files that represent a matrix. The extension is .mat

  3. Zip: This format is used for files that represent a compound format, such as transfer functions. The extension is .zip

Other file formats used in previous versions are integrated into the formats above or changed to a standard format.
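The extension-to-format mapping listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `classify` and `looks_valid` are hypothetical and are not part of libharkio3, and the check is limited to the container level using the Python standard library. It does not validate HARK-specific content, and because the Matrix binary layout is not covered at this point, .mat files are only checked for existence.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Extension -> format name, taken from the list of three main formats.
HARK_FORMATS = {
    ".xml": "XML",
    ".mat": "Matrix binary",
    ".zip": "Zip",
}


def classify(path):
    """Return the format name implied by the file extension, or None."""
    return HARK_FORMATS.get(Path(path).suffix.lower())


def looks_valid(path):
    """Container-level sanity check (hypothetical helper, not a HARK API)."""
    fmt = classify(path)
    if fmt == "Zip":
        # True only if the file exists and has a valid Zip signature.
        return zipfile.is_zipfile(path)
    if fmt == "XML":
        try:
            ET.parse(path)  # well-formedness only, no schema check
            return True
        except (ET.ParseError, OSError):
            return False
    if fmt == "Matrix binary":
        # Layout not checked here; only confirm the file exists.
        return Path(path).is_file()
    return False
```

Because XML and Zip are standard formats, a check like this needs no HARK-specific code, which is one practical consequence of the design policy.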

Table 5.1 lists the nodes that perform file I/O.

Table 5.1: HARK nodes that use file I/O

| Node name | Where to use | File type | New format | Old format |
|---|---|---|---|---|
| SaveRawPCM | Output | Raw Audio file | Raw Audio | No change |
| SaveWavePCM | Output | Wave file | PCM Wave | No change |
| LocalizeMUSIC | Property | TF(*) for localization | Zip | HGTF binary |
| SaveSourceLocation | Output | Localization result | XML | Localization result text |
| LoadSourceLocation | Property | Localization result | XML | Localization result text |
| GHDSS | Property | TF(*) for separation | Zip | HGTF binary |
| | Property | Microphone positions | XML | HARK text |
| | Property | Stationary noise positions | XML | HARK text |
| | Property | Initial separation matrix | Zip | HGTF binary |
| | Output | Separation matrix | Zip | HGTF binary |
| SaveFeatures | Output | Features | Matrix binary | float binary |
| SaveHTKFeatures | Output | Features | HTK format | No change |
| DataLogger | Output | Map data | XML | Map text |
| CMSave | Property | Correlation matrix | Zip | Correlation matrix text |
| CMLoad | Output | Correlation matrix | Zip | Correlation matrix text |
| JuliusMFT | Commandline | Configuration | jconf (text) | No change |
| | in jconf file | AM(*), phoneme list | julius format | No change |
| | in jconf file | LM(*), dictionary | julius format | No change |
| harktool | harktool | Sound source positions list | XML | srcinf text |
| | harktool | Impulse response | PCM Wave | float binary |

(*) TF: Transfer Function, AM: Acoustic Model, LM: Language Model
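Table 5.1 can also be read as a migration checklist. The sketch below transcribes its rows into a Python list and reports which files changed format. The names `ROWS`, `changed_rows`, and `needs_conversion` are hypothetical and exist only for this illustration; the data itself comes from the table, with continuation rows filled in with the node name they belong to.

```python
# (node, where to use, file type, new format, old format), from Table 5.1.
ROWS = [
    ("SaveRawPCM", "Output", "Raw Audio file", "Raw Audio", "No change"),
    ("SaveWavePCM", "Output", "Wave file", "PCM Wave", "No change"),
    ("LocalizeMUSIC", "Property", "TF for localization", "Zip", "HGTF binary"),
    ("SaveSourceLocation", "Output", "Localization result", "XML", "Localization result text"),
    ("LoadSourceLocation", "Property", "Localization result", "XML", "Localization result text"),
    ("GHDSS", "Property", "TF for separation", "Zip", "HGTF binary"),
    ("GHDSS", "Property", "Microphone positions", "XML", "HARK text"),
    ("GHDSS", "Property", "Stationary noise positions", "XML", "HARK text"),
    ("GHDSS", "Property", "Initial separation matrix", "Zip", "HGTF binary"),
    ("GHDSS", "Output", "Separation matrix", "Zip", "HGTF binary"),
    ("SaveFeatures", "Output", "Features", "Matrix binary", "float binary"),
    ("SaveHTKFeatures", "Output", "Features", "HTK format", "No change"),
    ("DataLogger", "Output", "Map data", "XML", "Map text"),
    ("CMSave", "Property", "Correlation matrix", "Zip", "Correlation matrix text"),
    ("CMLoad", "Output", "Correlation matrix", "Zip", "Correlation matrix text"),
    ("JuliusMFT", "Commandline", "Configuration", "jconf (text)", "No change"),
    ("JuliusMFT", "in jconf file", "AM, phoneme list", "julius format", "No change"),
    ("JuliusMFT", "in jconf file", "LM, dictionary", "julius format", "No change"),
    ("harktool", "harktool", "Sound source positions list", "XML", "srcinf text"),
    ("harktool", "harktool", "Impulse response", "PCM Wave", "float binary"),
]


def changed_rows():
    """Rows whose old format differs from the new one."""
    return [row for row in ROWS if row[4] != "No change"]


def needs_conversion(node, file_type):
    """True if files of this type for this node used a different old format."""
    for name, _where, ftype, _new, old in ROWS:
        if name == node and ftype == file_type:
            return old != "No change"
    raise KeyError((node, file_type))


if __name__ == "__main__":
    for name, _where, ftype, new, old in changed_rows():
        print(f"{name}: {ftype}: {old} -> {new}")
```

For example, `needs_conversion("GHDSS", "Microphone positions")` is true because that file moved from HARK text to XML, while the SaveRawPCM output is unchanged.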

The rest of this chapter describes the three formats in detail. The standard formats are not described here. For the Julius formats, including the jconf format, see the Julius documentation. For the Raw Audio and PCM Wave formats, see their standard format descriptions.