audSpectrogram: Auditory spectrogram

Description

Produces an auditory spectrogram by extracting a bank of bandpass filters (work in progress). While tuneR::audspec is based on FFT, here we convolve the sound with a bank of filters. The main difference is that we don't window the signal and de factor get variable temporal resolution in different frequency channels, as with a wavelet transform. The filters are currently third-order Butterworth bandpass filters implemented in butter.

Usage

audSpectrogram(
  x,
  samplingRate = NULL,
  scale = NULL,
  from = NULL,
  to = NULL,
  step = 1,
  dynamicRange = 80,
  nFilters = 128,
  minFreq = 20,
  maxFreq = samplingRate/2,
  minBandwidth = 10,
  reportEvery = NULL,
  plot = TRUE,
  savePlots = NULL,
  osc = c("none", "linear", "dB")[2],
  heights = c(3, 1),
  ylim = NULL,
  yScale = c("bark", "mel", "log")[1],
  contrast = 0.2,
  brightness = 0,
  maxPoints = c(1e+05, 5e+05),
  padWithSilence = TRUE,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  extraContour = NULL,
  xlab = NULL,
  ylab = NULL,
  xaxp = NULL,
  mar = c(5.1, 4.1, 4.1, 2),
  main = NULL,
  grid = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  ...
)

Arguments

path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors

samplingRate

sampling rate of x (only needed if x is a numeric vector)

scale

maximum possible amplitude of input used for normalization of input vector (only needed if x is a numeric vector)

from

if NULL (default), analyzes the whole sound, otherwise from...to (s)

step

step, ms (determines time resolution). step = NULL means no downsampling at all (ncol of output = length of input audio)

dynamicRange

dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero

nFilters

the number of filters (determines frequency resolution)

minFreq, maxFreq

the range of frequencies to analyze

minBandwidth

minimum filter bandwidth, Hz (otherwise filters may become too narrow when nFilters is high)

reportEvery

when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report)

plot

should a spectrogram be plotted? TRUE / FALSE

savePlots

full path to the folder in which to save the plots (NULL = don't save, '' = same folder as audio)

osc

"none" = no oscillogram; "linear" = on the original scale; "dB" = in decibels

heights

a vector of length two specifying the relative height of the spectrogram and the oscillogram (including time axes labels)

ylim

frequency range to plot, kHz (defaults to 0 to Nyquist frequency). NB: still in kHz, even if yScale = bark or mel

yScale

scale of the frequency axis: 'linear' = linear, 'log' = logarithmic (musical), 'bark' = bark with hz2bark, 'mel' = mel with hz2mel

contrast

spectrum is exponentiated by contrast (any real number, recommended -1 to +1). Contrast >0 increases sharpness, <0 decreases sharpness

brightness

how much to "lighten" the image (>0 = lighter, <0 = darker)

maxPoints

the maximum number of "pixels" in the oscillogram (if any) and spectrogram; good for quickly plotting long audio files; defaults to c(1e5, 5e5)

padWithSilence

if TRUE, pads the sound with just enough silence to resolve the edges properly (only the original region is plotted, so the apparent duration doesn't change)

colorTheme

black and white ('bw'), as in seewave package ('seewave'), or any palette from palette such as 'heat.colors', 'cm.colors', etc

extraContour

a vector of arbitrary length scaled in Hz (regardless of yScale!) that will be plotted over the spectrogram (eg pitch contour); can also be a list with extra graphical parameters such as lwd, col, etc. (see examples)

xlab