NAME: Barkfib

SYNOPSIS: Barkfib [options] audiofile

DESCRIPTION

Extract filterbank features from an audio file. Bark, mel or linear scaling can be chosen. The filters are implemented using triangular windows on the short-time FFT amplitude spectrum. By setting the "-m" switch, and selecting HTK output format with "-O htk", parameter files compatible with the HTK toolkit can be generated.
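The triangular-window filterbank described above can be sketched as follows. This is a minimal illustration and not Barkfib's actual source: the mel warping formula, the equal spacing of filter edges on the warped scale, and the function names (triangular_filterbank, apply_filterbank) are all assumptions made for the example.

```python
import math

def hz_to_mel(f):
    # Widely used mel formula; assumed here, Barkfib's exact warping is not documented.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(num_filters, n_bins, sample_rate, floor_hz, ceiling_hz):
    """Return num_filters rows of weights over n_bins FFT bins (0 Hz .. Nyquist)."""
    lo, hi = hz_to_mel(floor_hz), hz_to_mel(ceiling_hz)
    # num_filters + 2 edge frequencies, equally spaced on the warped scale.
    edges = [mel_to_hz(lo + (hi - lo) * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for k in range(1, num_filters + 1):
        left, centre, right = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if left < f <= centre:        # rising slope of the triangle
                w = (f - left) / (centre - left)
            elif centre < f < right:      # falling slope of the triangle
                w = (right - f) / (right - centre)
            else:
                w = 0.0
            row.append(w)
        bank.append(row)
    return bank

def apply_filterbank(bank, amplitude_spectrum):
    # One filterbank activity per filter: weighted sum of FFT amplitudes.
    return [sum(w * a for w, a in zip(row, amplitude_spectrum)) for row in bank]
```

With the documented defaults (16 filters, 0-8000 Hz) and a hypothetical 512-point FFT at 16 kHz, triangular_filterbank(16, 257, 16000, 0.0, 8000.0) yields a 16 x 257 weight matrix.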

OPTIONS

-S Changes the interpretation of the argument "audiofile". With the -S option specified, "audiofile" is a script file and the audio data is read from the files named on its rows. Otherwise "audiofile" is itself the name of an audio file.

-T level Sets the level of output printed to stdout. 'level' can be set to 1, 2 or 3; the default is 0.

-F format Specifies audio file format. See the Streams reference section for a list of audio file formats.

-x ext Sets the file extension for output feature files. The default extension is: "fib".

-d dir Sets the directory for output feature files. The default is the current directory.

-q ext Specifies the file extension of input files (no default value).

-p dir Specifies the directory of input files (no default value).

-O format Sets the output file format of the extracted parameter files. The default format is "binary". See the Streams reference section for a list of parameter file formats.

-c N Output cepstrum coefficients instead of the filterbank activities. By default the output starts with the 1st cepstrum coefficient; the 0th coefficient is included only if the -0 option is specified.

-0 Include the 0th cepstrum coefficient. This option can only be used together with the -c option.

-L lift Apply liftering to the cepstrum coefficients with liftering parameter 'lift'. Effective only together with the -c option.

-m Use Mel scale instead of the default Bark scale.

-r tau Subtract the mean cepstrum, computed as a moving average, from the parameters. 'tau' controls the time constant of the moving average computation. Reasonable values are between zero and one.
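The cepstrum-related options above (-c, -0, -L and -r) can be illustrated with a short sketch. Nothing here is taken from Barkfib's source: the DCT-II scaling and the sinusoidal lifter follow the HTK convention, the position of the 0th coefficient in the output vector is chosen arbitrarily, and the recursive form of the moving average is only one plausible reading of 'tau'. All function names are hypothetical.

```python
import math

def cepstrum(fbank, num_ceps, include_c0=False):
    """DCT-II of the log filterbank activities (assumed HTK-style scaling)."""
    n = len(fbank)
    logs = [math.log(max(v, 1e-10)) for v in fbank]   # floor avoids log(0)
    start = 0 if include_c0 else 1                    # -0 adds the 0th coefficient
    scale = math.sqrt(2.0 / n)
    return [scale * sum(logs[j] * math.cos(math.pi * i * (j + 0.5) / n)
                        for j in range(n))
            for i in range(start, num_ceps + 1)]

def lifter(ceps, lift, first_index=1):
    """Assumed HTK-style lifter: c_i * (1 + lift/2 * sin(pi * i / lift))."""
    return [c * (1.0 + (lift / 2.0) * math.sin(math.pi * (first_index + k) / lift))
            for k, c in enumerate(ceps)]

def subtract_running_mean(frames, tau):
    """Assumed recursion: mean = tau * mean + (1 - tau) * frame, per coefficient."""
    if not frames:
        return []
    mean = list(frames[0])            # initialise the mean with the first frame
    out = []
    for frame in frames:
        mean = [tau * m + (1.0 - tau) * x for m, x in zip(mean, frame)]
        out.append([x - m for x, m in zip(frame, mean)])
    return out
```

Under this reading, a value of 'tau' close to one gives a slowly adapting mean (long time constant), and a value close to zero makes the mean track each frame almost immediately.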

Options specifying the filterbank characteristics

-e Add log energy as the last parameter of the feature vector.

-n num_filters Specifies the number of filters. The default is 16.

-R floor ceiling Specifies the lower and upper cutoff frequencies of the filterbank. The default is 0-8000 Hz, i.e., 'floor' = 0 and 'ceiling' = 8000.

-P c Apply pre-emphasis to the input speech data with coefficient 'c'. The default is c=0.97.

-l length Set frame length in ms. The default is 10 ms.

-w length Set analysis window length in ms. Experience shows that this length should be longer than the frame length. The default is 25 ms.
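To make the relation between -P, -l and -w concrete, the sketch below applies first-order pre-emphasis and then cuts the signal into overlapping analysis windows. It is an illustration under stated assumptions, not Barkfib's implementation: the first sample is passed through unchanged, a trailing partial window is dropped, and the function names are hypothetical.

```python
def pre_emphasize(samples, c=0.97):
    """First-order pre-emphasis: y[n] = x[n] - c * x[n-1]."""
    if not samples:
        return []
    out = [samples[0]]                 # first sample kept as-is (an assumption)
    for n in range(1, len(samples)):
        out.append(samples[n] - c * samples[n - 1])
    return out

def frame_signal(samples, sample_rate=16000, frame_ms=10.0, window_ms=25.0):
    """Cut the signal into windows of window_ms that advance by frame_ms."""
    step = int(round(sample_rate * frame_ms / 1000.0))     # 160 samples by default
    width = int(round(sample_rate * window_ms / 1000.0))   # 400 samples by default
    frames = []
    start = 0
    while start + width <= len(samples):   # drop a trailing partial window
        frames.append(samples[start:start + width])
        start += step
    return frames
```

With the defaults at 16 kHz, consecutive windows overlap by 240 samples (15 ms), which is why the window length is normally chosen longer than the frame length.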

Options used to complement/override the file header information

-h size Specifies the size of the audio file header.

-f sample-freq Specifies the sample frequency of the input audio data. The default is 16 kHz if not specified in the file header.

-b Swap the byte order of the input data.
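The effect of -h and -b on raw input can be sketched as below. The example assumes 16-bit signed PCM samples and a header size given in bytes, neither of which is stated in this page; decode_pcm16 is a hypothetical helper, not part of Barkfib.

```python
import array

def decode_pcm16(raw, header_size=0, swap_bytes=False):
    """Skip header_size bytes, then read 16-bit signed samples in native order."""
    payload = raw[header_size:]
    payload = payload[:len(payload) - len(payload) % 2]   # drop a stray odd byte
    samples = array.array("h")
    if samples.itemsize != 2:          # 'h' is 2 bytes on mainstream platforms
        raise RuntimeError("expected a 2-byte short")
    samples.frombytes(payload)
    if swap_bytes:
        samples.byteswap()             # e.g. big-endian data on a little-endian host
    return list(samples)
```

Byte swapping is needed when the file was written on a machine with the opposite endianness; without it, the samples decode as large, noise-like values.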

SEE ALSO

Cepitch, MakeCep, MakeDiff

KNOWN BUGS

The "-a" switch is still experimental.
