SYNOPSIS: Barkfib [options] audiofile
DESCRIPTION
Extract filterbank features from an audio file. Bark, mel or linear scaling can be chosen. The filters are implemented using triangular windows on the short-time FFT amplitude spectrum. Specifying the "-m" switch together with the HTK output format ("-O htk") generates parameter files compatible with the HTK toolkit.
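As a rough illustration of the triangular-window filtering described above, the following Python sketch builds triangular weights over FFT bins and applies them to an amplitude spectrum. It is not Barkfib's source code: the mel formula (the common 2595*log10(1 + f/700) form), all function names, and the naive DFT are assumptions chosen to keep the example self-contained. Only the defaults of 16 filters and a 0-8000 Hz range are taken from the option list.

```python
import cmath
import math

def hz_to_mel(f):
    # Widely used mel warping; the warping Barkfib applies may differ.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(num_filters=16, num_bins=257, sample_freq=16000.0,
                          floor=0.0, ceiling=8000.0):
    """Return num_filters rows of triangular weights over num_bins FFT bins."""
    lo_mel, hi_mel = hz_to_mel(floor), hz_to_mel(ceiling)
    step = (hi_mel - lo_mel) / (num_filters + 1)
    # num_filters + 2 edge frequencies, equally spaced on the mel axis.
    edges = [mel_to_hz(lo_mel + i * step) for i in range(num_filters + 2)]
    # Frequency of each FFT bin; the last bin sits at sample_freq / 2.
    bin_hz = [k * (sample_freq / 2.0) / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for i in range(num_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        # Rising slope up to the centre, falling slope after it, zero outside.
        row = [max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
               for f in bin_hz]
        bank.append(row)
    return bank

def amplitude_spectrum(frame, num_bins):
    """Naive DFT amplitudes for bins 0..num_bins-1 (an FFT is used in practice)."""
    n_fft = 2 * (num_bins - 1)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                    for n, x in enumerate(frame)))
            for k in range(num_bins)]

def filterbank_features(frame, bank):
    """Weighted sum of the amplitude spectrum under each triangular filter."""
    spectrum = amplitude_spectrum(frame, len(bank[0]))
    return [sum(w * a for w, a in zip(row, spectrum)) for row in bank]
```

Each filter peaks at the upper edge of its lower neighbour, so adjacent triangles overlap by half. Swapping hz_to_mel and mel_to_hz for a Bark or identity mapping would give the other scalings under the same assumptions.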
OPTIONS
-S | Changes the interpretation of the argument "audiofile". With the -S option specified, "audiofile" is a script file and the audio data is read from the files listed on its rows. Otherwise "audiofile" is itself the name of an audio file. |
-T level | Level of output printed to stdout. 'level' is 0, 1, 2 or 3; the default is 0. |
-F format | Specifies audio file format. See the Streams reference section for a list of audio file formats. |
-x ext | Sets the file extension for output feature files. The default extension is: "fib". |
-d dir | Sets the directory for output feature files. The default is the current directory. |
-q ext | Specifies the file extension of input files (no default value). |
-p dir | Specifies the directory of input files (no default value). |
-O format | Sets the output file format of the extracted parameter files. The default format is "binary". See the Streams reference section for a list of parameter file formats. |
-c N | Output cepstrum coefficients instead of the filterbank activities. By default the output starts with the 1st cepstrum coefficient; the zeroth coefficient is included if the -0 option is specified. |
-0 | Include the 0th cepstrum coefficient. This option can only be used together with the -c option. |
-L lift | Apply liftering to the cepstrum coefficients with liftering parameter 'lift'. Effective only together with the -c option. |
-m | Use Mel scale instead of the default Bark scale. |
-r tau | Subtract a moving-average cepstral mean from the parameters. "tau" controls the time constant of the moving average computation. Reasonable values are between zero and one. |
-e | Add log energy as the last parameter of the feature vector. |
-n num_filters | Specifies the number of filters. The default is 16. |
-R floor ceiling | Specifies the lower and upper cutoff frequencies of the filterbank. The default is 0-8000 Hz, i.e., 'floor' = 0 and 'ceiling' = 8000. |
-P c | Apply pre-emphasis to the input speech data with coefficient 'c'. The default is c=0.97. |
-l length | Set frame length in ms. The default is 10 ms. |
-w length | Set the analysis window length in ms. Experience shows that this should be longer than the frame length. The default is 25 ms. |
-h size | Specifies the size of the audio file header. |
-f sample-freq | Specifies the sampling frequency of the input audio data. The default is 16 kHz if not specified in the file header. |
-b | Swap the byte order of the input data. |
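To show how the -P, -l, -w, -c and -0 options relate to each other, here is a minimal Python sketch of the corresponding processing steps. It rests on several assumptions and is not taken from Barkfib itself: the frame length is read as the step between successive analysis windows, the first sample is passed through pre-emphasis unchanged, the cepstrum is a DCT-II of the log filterbank outputs, N in "-c N" is treated as the number of coefficients, and the zeroth coefficient is placed first. All function names are hypothetical.

```python
import math

def preemphasize(samples, c=0.97):
    """First-order pre-emphasis y[n] = x[n] - c * x[n-1] (cf. -P, default 0.97)."""
    if not samples:
        return []
    # Assumption: the first sample has no predecessor and is kept as-is.
    return [samples[0]] + [samples[n] - c * samples[n - 1]
                           for n in range(1, len(samples))]

def split_frames(samples, sample_freq=16000.0, frame_ms=10.0, window_ms=25.0):
    """Cut overlapping analysis windows (cf. -l and -w).

    A new window starts every frame_ms; each window is window_ms long.
    Trailing samples that do not fill a whole window are dropped.
    """
    shift = int(round(sample_freq * frame_ms / 1000.0))
    width = int(round(sample_freq * window_ms / 1000.0))
    return [samples[s:s + width]
            for s in range(0, len(samples) - width + 1, shift)]

def cepstrum(filter_outputs, num_coeffs, include_zeroth=False):
    """DCT-II of the log filterbank outputs (cf. -c and -0).

    Returns coefficients 1..num_coeffs, preceded by coefficient 0 when
    include_zeroth is set. The ordering and scaling are assumptions.
    """
    n = len(filter_outputs)
    # Floor the outputs to avoid log(0) on silent input.
    logs = [math.log(max(v, 1e-10)) for v in filter_outputs]

    def coeff(i):
        return math.sqrt(2.0 / n) * sum(
            v * math.cos(math.pi * i * (j + 0.5) / n)
            for j, v in enumerate(logs))

    first = 0 if include_zeroth else 1
    return [coeff(i) for i in range(first, num_coeffs + 1)]
```

Under these assumptions and the default settings, one second of 16 kHz audio yields 98 complete windows of 400 samples each, which is why the window length is normally chosen longer than the frame length: neighbouring windows then overlap.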
KNOWN BUGS
The "-a" switch is still experimental.