tuneR (version 0.2-6)

MFCC: Mel Frequency Cepstral Coefficients

Description

Computation of MFCCs (Mel Frequency Cepstral Coefficients) for a Wave object.

Usage

MFCC(object, a = 0.1, HW.width = 0.025, HW.overlapping = 0.25, 
    T.number = 24, T.overlapping = 0.5, K = 12)

Arguments

object
Object of class Wave.
a
Coefficient for a first-order difference filter, which is used to pre-emphasize the signal in the first step of feature extraction.
HW.width
Width of Hamming window in seconds, which is used to divide the signal into frames.
HW.overlapping
Fraction of how much the Hamming windows should overlap.
T.number
Number of triangular channels on the Mel-scaled spectrum that are applied to the signal.
T.overlapping
Fraction of how much the triangular filters should overlap.
K
Number of desired output quefrencies of the inverse discrete cosine transformation.
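As a rough illustration of how the windowing arguments interact, the sketch below converts HW.width and HW.overlapping into a frame length and a hop size in samples. The sampling rate, the signal length, the use of floor(), and all variable names are assumptions made for this example; the rounding and edge handling inside MFCC may differ.

```r
# Illustrative only: how HW.width and HW.overlapping could map to samples.
samp.rate      <- 44100   # assumed sampling rate in Hz
n.samples      <- 5000    # assumed signal length in samples
HW.width       <- 0.025   # window width in seconds (default)
HW.overlapping <- 0.25    # overlap fraction (default)

# Samples covered by one Hamming window
frame.len <- floor(HW.width * samp.rate)
# Distance between the starts of two consecutive windows
hop <- floor(frame.len * (1 - HW.overlapping))
# Number of complete windows that fit into the signal
n.frames <- floor((n.samples - frame.len) / hop) + 1
```

Under these assumptions, n.frames would correspond to the number of rows of the returned matrix.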

Value

  • A matrix with one row per Hamming window and K+1 columns. The first column is the energy; the following K columns are the extracted MFCC features.

concept

  • MFCC
  • Mel
  • Cepstrum

Details

This function computes Mel Frequency Cepstral Coefficients (MFCCs) for an object of class Wave. In speech recognition, MFCCs are used to extract the stimulus of the vocal tract from speech. The process of creating the MFCC features consists of five steps. First, the signal from object is filtered with a finite impulse response (FIR) filter to pre-emphasize high frequencies. Only the left channel of object, i.e. a mono signal, is used for the extraction. The parameter a controls the FIR filter: the filtered signal $S.fil$ at time $t$ is obtained by $S.fil(t) = S(t) - a \cdot S(t-1)$. In the second step, the signal is divided into frames, each of length HW.width, with consecutive frames overlapping by the fraction HW.overlapping. A Hamming window is applied to each frame to avoid negative effects at the frame edges caused by the division. After a discrete Fourier transformation (DFT), the spectrum is mapped to the Mel scale filter bank. The filter bank consists of T.number triangular filters, which overlap by T.overlapping. This performs a perceptual weighting of frequencies. In the last step, an inverse discrete cosine transformation is applied. K controls the order up to which MFCC features are computed.
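The first three steps described above can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the random test signal, the frame length of 400 samples, the treatment of the first sample, and the helper hz2mel are all assumptions. The Mel formula shown is the one given in the HTK Book; whether MFCC uses exactly this mapping is not stated on this page.

```r
# Illustrative sketch of pre-emphasis, windowing and DFT (assumed details).
set.seed(1)
S <- rnorm(2000)   # stand-in for the left-channel samples of a Wave object
a <- 0.1           # pre-emphasis coefficient (default of MFCC)

# Step 1: S.fil(t) = S(t) - a * S(t - 1). The first sample has no
# predecessor and is kept unchanged here (an assumption).
S.fil <- c(S[1], S[-1] - a * S[-length(S)])

# Step 2: take one frame and taper it with a Hamming window.
frame.len <- 400   # assumed frame length in samples
n <- 0:(frame.len - 1)
hamming <- 0.54 - 0.46 * cos(2 * pi * n / (frame.len - 1))
frame <- S.fil[1:frame.len] * hamming

# Step 3: magnitude spectrum of the windowed frame (non-negative frequencies).
spec <- Mod(fft(frame))[1:(frame.len / 2 + 1)]

# Hypothetical helper: frequency in Hz to Mel, HTK Book convention.
hz2mel <- function(f) 2595 * log10(1 + f / 700)
```

The triangular filter bank and the cosine transformation would then operate on spec, with the filter centres spaced evenly on the scale returned by hz2mel.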

References

Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., and Woodland, P. (2005): The HTK Book (v 3.3), Cambridge University Engineering Dept., 59-61.

See Also

Wave

Examples

library(tuneR)
# Generate a 440 Hz sine wave as a 16-bit Wave object
obj <- sine(440, bit = 16, duration = 5000)
MFCC(obj)