Wave object.

MFCC(object, a = 0.1, HW.width = 0.025, HW.overlapping = 0.25,
     T.number = 24, T.overlapping = 0.5, K = 12)
The result has K+1 columns.
The first column is the energy; the following K columns are the extracted MFCC features.
In speech recognition, Mel frequency cepstral coefficients (MFCCs) are used to extract the stimulus of the vocal tract from speech.
The process to create the MFCC features consists of five steps.
First, the signal from object is filtered with a finite impulse response (FIR) filter to pre-amplify high frequencies.
Only the left channel of object, i.e. a mono signal, is used for the extraction.
The parameter a controls the FIR filter.
The filtered signal $S.fil$ at time $t$ is obtained by $S.fil(t) = S(t) - a \cdot S(t-1)$.
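The filter formula can be tried out in a few lines of base R. This is only a minimal sketch: the helper name pre_emphasis is hypothetical, and passing the first sample through unchanged is an assumption, since the text does not say how the function treats the sample that has no predecessor.

```r
# Pre-emphasis sketch: S.fil(t) = S(t) - a * S(t - 1).
# Assumption: the first sample has no predecessor and is kept as is.
pre_emphasis <- function(s, a = 0.1) {
  if (length(s) < 2) return(s)
  c(s[1], s[-1] - a * s[-length(s)])
}

s <- c(1, 2, 3, 4)
filtered <- pre_emphasis(s, a = 0.1)
print(filtered)
```

A larger value of a removes more of the slowly varying part of the signal, which raises the relative weight of high frequencies.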
In a second step, the signal is converted to frames, each of length HW.width.
A Hamming window is applied to each frame to reduce the edge effects that result from cutting the signal into frames.
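The framing step might look roughly like the base R sketch below. It rests on assumptions that the text does not confirm: the frame width is taken to be in seconds, the overlap is taken to be a fraction of the frame length, and the helper name frame_signal is hypothetical.

```r
# Framing sketch with a Hamming window.
# Assumptions: width is in seconds, overlapping is a fraction of the frame length.
frame_signal <- function(s, samp.rate, width = 0.025, overlapping = 0.25) {
  n <- round(width * samp.rate)                  # samples per frame
  step <- max(1, round(n * (1 - overlapping)))   # hop between frame starts
  if (n < 2 || length(s) < n) stop("signal shorter than one frame")
  starts <- seq(1, length(s) - n + 1, by = step)
  hamming <- 0.54 - 0.46 * cos(2 * pi * (0:(n - 1)) / (n - 1))
  # One row per frame, each frame multiplied by the Hamming window
  t(sapply(starts, function(i) s[i:(i + n - 1)] * hamming))
}

frames <- frame_signal(rep(1, 1000), samp.rate = 8000)
print(dim(frames))
```

Because the window tapers towards both ends of a frame, samples near the frame borders contribute less, which is why overlapping frames are commonly used.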
After a discrete Fourier transformation (DFT), the signal is mapped to the Mel scale filter bank.
The filter bank consists of T.number triangular filters, which overlap by T.overlapping.
This performs a perceptual weighting of frequencies.
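A common way to build such a filter bank is sketched below in base R. It is not taken from the function's source: the Mel formula 2595 * log10(1 + f / 700), the FFT length, the sampling rate, and the helper names are all assumptions. Each triangle here runs from the centre of its left neighbour to the centre of its right neighbour, which corresponds to an overlap of one half; how the function interprets other values of T.overlapping is not covered.

```r
# Mel filter bank sketch (assumed construction, half-overlapping triangles).
hz_to_mel <- function(f) 2595 * log10(1 + f / 700)
mel_to_hz <- function(m) 700 * (10^(m / 2595) - 1)

mel_filterbank <- function(n_filters = 24, n_fft = 512, samp.rate = 16000) {
  # n_filters + 2 edge points, equally spaced on the Mel scale
  edges <- mel_to_hz(seq(0, hz_to_mel(samp.rate / 2), length.out = n_filters + 2))
  freqs <- (0:(n_fft / 2)) * samp.rate / n_fft   # DFT bin frequencies in Hz
  fb <- matrix(0, nrow = n_filters, ncol = length(freqs))
  for (m in seq_len(n_filters)) {
    lo <- edges[m]; mid <- edges[m + 1]; hi <- edges[m + 2]
    rising  <- (freqs - lo) / (mid - lo)
    falling <- (hi - freqs) / (hi - mid)
    fb[m, ] <- pmax(0, pmin(rising, falling))    # triangular weights in [0, 1]
  }
  fb
}

# Usage: power spectrum of one frame, then one energy value per Mel filter
frame <- sin(2 * pi * 440 * (0:511) / 16000)
power <- Mod(fft(frame))[1:257]^2
fb <- mel_filterbank()
mel_energy <- as.vector(fb %*% power)
```

The filters are narrow at low frequencies and wide at high frequencies, which is what gives the perceptual weighting.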
In a last step, an inverse discrete cosine transformation is applied to the signal.
K controls the order up to which MFCC features are computed.
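The role of K can be illustrated with a small base R sketch. The cosine transform below uses the DCT-II form that appears in many MFCC descriptions, applied to the logarithm of the filter bank energies. Both the log step and the exact variant and scaling are assumptions, and the helper name cepstral_coefficients is hypothetical.

```r
# Cepstral coefficient sketch: cosine transform of log filter bank energies.
# Assumptions: energies are strictly positive; unscaled DCT-II form.
cepstral_coefficients <- function(mel_energy, K = 12) {
  M <- length(mel_energy)
  log_e <- log(mel_energy)
  sapply(seq_len(K), function(k) {
    sum(log_e * cos(pi * k * (seq_len(M) - 0.5) / M))
  })
}

# A flat spectrum carries no spectral shape, so all K coefficients are near zero
flat <- cepstral_coefficients(rep(2, 24), K = 12)
print(round(flat, 10))
```

Low-order coefficients describe the coarse shape of the spectrum, so a small K keeps the envelope and drops fine detail.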
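# sine() creates a Wave object; it is assumed here to come from the attached tuneR package
obj <- sine(440, bit = 16, duration = 5000)  # 440 Hz sine tone, 16 bit
MFCC(obj)                                    # energy plus K = 12 MFCC columns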