getPitchZc: Zero-crossing rate

Description

A less precise, but very quick method of pitch tracking based on measuring zero-crossing rate in low-pass-filtered audio. Recommended for processing long recordings with typical pitch values well below the first formant frequency, such as speech. Calling this function is considerably faster than using the same pitch-tracking method in analyze. Note that, unlike analyze(), it returns the times of individual zero crossings (hopefully corresponding to glottal cycles) instead of pitch values at fixed time intervals.

Usage

getPitchZc(
  x,
  samplingRate = NULL,
  scale = NULL,
  from = NULL,
  to = NULL,
  pitchFloor = 50,
  pitchCeiling = 400,
  zcThres = 0.1,
  zcWin = 5,
  silence = 0.04,
  envWin = 5,
  summaryFun = c("mean", "sd"),
  reportEvery = NULL
)

Arguments

x

path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors

samplingRate

sampling rate of x (only needed if x is a numeric vector)

scale

maximum possible amplitude of input used for normalization of input vector (only needed if x is a numeric vector)

from, to

if NULL (default), analyzes the whole sound, otherwise from...to (s)

pitchFloor

absolute bounds for pitch candidates (Hz)

pitchCeiling

absolute bounds for pitch candidates (Hz)

zcThres

pitch candidates with certainty below this value are treated as noise and set to NA (0 = anything goes, 1 = pitch must be perfectly stable over zcWin)

zcWin

certainty in pitch candidates depends on how stable pitch is over zcWin glottal cycles (odd integer > 3)

silence

minimum root mean square (RMS) amplitude, below which pitch candidates are set to NA (NULL = don't consider RMS amplitude)

envWin

window length for calculating RMS envelope, ms

summaryFun

functions used to summarize each acoustic characteristic; see analyze

reportEvery

when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report)

Value

Returns a dataframe containing

time: times of individual zero crossings
pitch: pitch calculated from the time between consecutive zero crossings
cert: certainty in each pitch candidate calculated from local pitch stability, 0 to 1

Details

Algorithm: the audio is bandpass-filtered from pitchFloor to pitchCeiling, and the timing of all zero crossings is saved. This is not enough, however, because unvoiced sounds like white noise also have plenty of zero crossings. Accordingly, an attempt is made to detect voiced segments (or steady musical tones, etc.) by looking for stable regions, with several zero-crossings at relatively regular intervals (see parameters zcThres and zcWin). Very quiet parts of audio are also treated as not having a pitch.
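
The idea described above can be illustrated with a minimal base-R sketch. It is not the code used by getPitchZc: the helper name zcPitchSketch is hypothetical, the bandpass-filtering and RMS steps are skipped, only upward zero crossings are counted, and the certainty measure (1 minus the coefficient of variation of the period over zcWin cycles) is an assumption chosen for simplicity rather than the package's actual formula.

```r
# Hypothetical helper, for illustration only (not part of soundgen)
zcPitchSketch <- function(x, samplingRate, zcWin = 5) {
  # indices where the signal crosses zero going upward
  idx <- which(x[-length(x)] < 0 & x[-1] >= 0)
  # linear interpolation between the two samples around each crossing
  frac <- -x[idx] / (x[idx + 1] - x[idx])
  zcTime <- (idx - 1 + frac) / samplingRate   # crossing times, s
  period <- diff(zcTime)                      # duration of each cycle, s
  pitch <- 1 / period                         # Hz

  # assumed certainty: 1 - coefficient of variation of the period
  # in a centered window of zcWin cycles, clipped at 0
  n <- length(period)
  half <- (zcWin - 1) / 2
  cert <- rep(NA_real_, n)
  if (n >= zcWin) {
    for (i in (half + 1):(n - half)) {
      w <- period[(i - half):(i + half)]
      cert[i] <- max(0, 1 - sd(w) / mean(w))
    }
  }
  data.frame(time = zcTime[-length(zcTime)], pitch = pitch, cert = cert)
}

samplingRate <- 16000
t <- seq(0, 1 - 1 / samplingRate, by = 1 / samplingRate)

# a steady 150 Hz tone: regular crossings, high certainty
tone <- zcPitchSketch(sin(2 * pi * 150 * t + 0.3), samplingRate)
median(tone$pitch)
mean(tone$cert, na.rm = TRUE)

# white noise: plenty of crossings, but at irregular intervals
set.seed(1)
noise <- zcPitchSketch(rnorm(samplingRate), samplingRate)
mean(noise$cert, na.rm = TRUE)
```

In this sketch the tone produces near-constant periods and a certainty close to 1, while white noise produces many crossings with a much lower certainty. This is why a stability threshold such as zcThres is needed in addition to the zero crossings themselves.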

Examples

# NOT RUN {
data(sheep, package = 'seewave')
# spectrogram(sheep)
zc = getPitchZc(sheep, pitchCeiling = 250)
plot(zc$detailed[, c('time', 'pitch')], type = 'b')

# Convert to a standard pitch contour sampled at regular time intervals:
pitch = getSmoothContour(
  anchors = data.frame(time = zc$detailed$time, value = zc$detailed$pitch),
  len = 1000, NA_to_zero = FALSE, discontThres = 0)
spectrogram(sheep, extraContour = pitch, ylim = c(0, 2))

# }
# NOT RUN {
# process all files in a folder
zc = getPitchZc('~/Downloads/temp')
zc$summary
# }