Internal soundgen function.
analyzeFrame(
frame,
autoCorrelation = NULL,
samplingRate = 44100,
scaleCorrection = 1,
trackPitch = TRUE,
pitchMethods = c("autocor", "cep", "spec", "dom"),
cutFreq = 6000,
domThres = 0.1,
domSmooth = 220,
autocorThres = 0.75,
autocorSmooth = NULL,
cepThres = 0.45,
cepSmooth = 3,
cepZp = 2^13,
specThres = 0.45,
specPeak = 0.8,
specSinglePeakCert = 0.6,
specSmooth = 100,
specHNRslope = 0.1,
specMerge = 1,
pitchFloor = 75,
pitchCeiling = 3500,
nCands = 1
)
the real part of the spectrum of a frame, as returned by fft
pre-calculated autocorrelation of the input frame (computationally more efficient than calculating it here)
sampling rate (Hz)
if TRUE, attempt to find F0 in this frame (FALSE if entropy is above some threshold, specified in analyze)
methods of pitch estimation to consider for determining pitch contour: 'autocor' = autocorrelation (~PRAAT), 'cep' = cepstral, 'spec' = spectral (~BaNa), 'dom' = lowest dominant frequency band ('' or NULL = no pitch analysis)
(2 * pitchCeiling to Nyquist, Hz) repeat the calculation of spectral descriptives after discarding all info above cutFreq. Recommended if the original sampling rate varies across different analyzed audio files. Note that "entropyThres" applies only to this frequency range, which also affects which frames will not be analyzed with pitchAutocor.
(0 to 1) to find the lowest dominant frequency band, we do short-term FFT and take the lowest frequency with amplitude at least domThres
the width of smoothing interval (Hz) for finding dom
(0 to 1) separate voicing thresholds for detecting pitch candidates with three different methods: autocorrelation, cepstrum, and BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.
the width of smoothing interval (in bins) for finding peaks in the autocorrelation function. Defaults to 7 for sampling rate 44100 and smaller odd numbers for lower values of sampling rate
(0 to 1) separate voicing thresholds for detecting pitch candidates with three different methods: autocorrelation, cepstrum, and BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.
the width of smoothing interval (Hz) for finding peaks in the cepstrum
zero-padding of the spectrum used for cepstral pitch detection (final length of spectrum after zero-padding in points, e.g. 2 ^ 13)
(0 to 1) separate voicing thresholds for detecting pitch candidates with three different methods: autocorrelation, cepstrum, and BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.
when looking for putative harmonics in
the spectrum, the threshold for peak detection is calculated as
specPeak * (1 - HNR * specHNRslope)
(0 to 1) if F0 is calculated based on a single
harmonic ratio (as opposed to several ratios converging on the same
candidate), its certainty is taken to be specSinglePeakCert
the width of window for detecting peaks in the spectrum, Hz
when looking for putative harmonics in
the spectrum, the threshold for peak detection is calculated as
specPeak * (1 - HNR * specHNRslope)
pitch candidates within specMerge semitones are merged with boosted certainty
absolute bounds for pitch candidates (Hz)
absolute bounds for pitch candidates (Hz)
maximum number of pitch candidates per method (except for dom, which returns at most one candidate per frame), normally 1...4
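As a rough illustration of two of the spectral parameters above, the sketch below evaluates the peak-detection threshold specPeak * (1 - HNR * specHNRslope) for a few HNR values and checks whether two pitch candidates lie within specMerge semitones of each other. The helpers peakThreshold and semitoneDist are hypothetical names introduced for this example and are not soundgen functions; the HNR values are arbitrary.

```r
# Hypothetical helpers for illustration only (not part of soundgen)

# Peak-detection threshold as described for specPeak and specHNRslope
peakThreshold <- function(HNR, specPeak = 0.8, specHNRslope = 0.1) {
  specPeak * (1 - HNR * specHNRslope)
}

# Distance between two frequencies (Hz) in semitones
semitoneDist <- function(f1, f2) {
  abs(12 * log2(f1 / f2))
}

HNR <- c(0, 2, 5)              # arbitrary example values
thres <- peakThreshold(HNR)    # 0.80 0.64 0.40

# Would two candidates be close enough to merge with specMerge = 1?
specMerge <- 1
closePair <- semitoneDist(440, 450) <= specMerge   # TRUE, about 0.39 semitones
farPair <- semitoneDist(440, 470) <= specMerge     # FALSE, about 1.14 semitones
```

With the default values, a higher HNR lowers the threshold, so weaker spectral peaks are accepted as putative harmonics in frames that already look strongly harmonic.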
Returns a list with two components: $pitchCands_frame contains pitch candidates for the frame, and $summaries contains other acoustic predictors like HNR, specSlope, etc.
This function performs the heavy lifting of pitch tracking and acoustic analysis in general: it takes the spectrum of a single fft frame as input and analyzes it.
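The kind of input described above can be sketched in base R. The example below builds a synthetic frame, computes its spectrum with fft and its autocorrelation with acf, and then picks the lowest spectral peak whose normalized amplitude reaches domThres. This is a simplified reading of the argument descriptions, not the internal soundgen code: the test signal, the Hann window, the use of the magnitude spectrum, and all variable names other than samplingRate, autoCorrelation and domThres are assumptions made for illustration, and no smoothing (domSmooth) is applied.

```r
# Minimal sketch in base R; not soundgen internals
samplingRate <- 44100
n <- 2048                                  # frame length in points
tm <- (0:(n - 1)) / samplingRate           # time stamps, s

# Synthetic frame: 200 Hz fundamental plus two weaker harmonics
sig <- sin(2 * pi * 200 * tm) +
  0.5 * sin(2 * pi * 400 * tm) +
  0.25 * sin(2 * pi * 600 * tm)

# Hann window to reduce spectral leakage
win <- 0.5 * (1 - cos(2 * pi * (0:(n - 1)) / (n - 1)))
frameSig <- sig * win

# Magnitude spectrum up to Nyquist (first half of the FFT output)
spec <- Mod(fft(frameSig))[1:(n / 2)]
freqs <- (0:(n / 2 - 1)) * samplingRate / n

# Autocorrelation of the frame for lags 0 to n - 1
autoCorrelation <- as.numeric(
  acf(frameSig, lag.max = n - 1, plot = FALSE)$acf
)

# Lowest local spectral peak with normalized amplitude >= domThres
domThres <- 0.1
specNorm <- spec / max(spec)
peaks <- which(diff(sign(diff(specNorm))) == -2) + 1
dom <- freqs[peaks[specNorm[peaks] >= domThres][1]]
```

For this signal the lowest qualifying peak falls in the frequency bin nearest to the 200 Hz fundamental; the bin width is samplingRate / n, or about 21.5 Hz here.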