.spectrogram: Spectrogram per sound

Description

Internal soundgen function called by spectrogram and analyze.

Usage

.spectrogram(
  audio,
  dynamicRange = 80,
  windowLength = 50,
  step = windowLength/2,
  overlap = NULL,
  specType = c("spectrum", "reassigned", "spectralDerivative")[1],
  logSpec = TRUE,
  rasterize = FALSE,
  wn = "gaussian",
  zp = 0,
  normalize = TRUE,
  smoothFreq = 0,
  smoothTime = 0,
  qTime = 0,
  percentNoise = 10,
  noiseReduction = 0,
  output = c("original", "processed", "complex", "all")[1],
  specManual = NULL,
  plot = TRUE,
  osc = c("none", "linear", "dB")[2],
  heights = c(3, 1),
  ylim = NULL,
  yScale = "linear",
  contrast = 0.2,
  brightness = 0,
  blur = 0,
  maxPoints = c(1e+05, 5e+05),
  padWithSilence = TRUE,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  col = NULL,
  extraContour = NULL,
  xlab = NULL,
  ylab = NULL,
  xaxp = NULL,
  mar = c(5.1, 4.1, 4.1, 2),
  main = NULL,
  grid = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  internal = NULL,
  ...
)

Arguments

audio: a list returned by readAudio
dynamicRange: dynamic range, dB. All values more than dynamicRange dB below the maximum are treated as zero
windowLength: length of FFT window, ms
step: FFT step, ms; if specified, it overrides overlap (NB: because digital audio is sampled at discrete time intervals of 1/samplingRate, the actual step and thus the time stamps of STFT frames may be slightly different, e.g. 24.98866 instead of 25.0 ms)
overlap: overlap between successive FFT frames, %
specType: plot the original FFT ('spectrum'), reassigned spectrogram ('reassigned'), or spectral derivative ('spectralDerivative')
logSpec: if TRUE, log-transforms the spectrogram
rasterize: (only applies if specType = 'reassigned') if TRUE, the reassigned spectrogram is rasterized before plotting: that is, it shows the density per time-frequency bin at the same resolution as an ordinary spectrogram
wn: window type accepted by ftwindow, currently gaussian, hanning, hamming, bartlett, blackman, flattop, rectangle
zp: window length after zero padding, points
normalize: if TRUE, scales input prior to FFT
smoothFreq, smoothTime: length of the window for median smoothing in frequency and time domains, respectively, points
qTime: the quantile to be subtracted for each frequency bin. For example, if qTime = 0.5, the median of each frequency bin (over the entire sound duration) will be calculated and subtracted from each frame (see examples)
percentNoise: percentage of frames (0 to 100%) used for calculating noise spectrum
noiseReduction: how much noise to remove (non-negative number, recommended 0 to 2). 0 = no noise reduction, 2 = strong noise reduction. The denoised spectrum is \(spectrum - (noiseReduction * noiseSpectrum)\), where noiseSpectrum is the average spectrum of frames with entropy exceeding the quantile set by percentNoise
output: specifies what to return: nothing ('none'), unmodified spectrogram ('original'), denoised and/or smoothed spectrogram ('processed'), unmodified spectrogram with the imaginary part giving phase ('complex'), or all of the above ('all')
specManual: manually calculated spectrogram-like representation in the same format as the output of spectrogram(): rows = frequency in kHz, columns = time in ms
plot: should a spectrogram be plotted? TRUE / FALSE
osc: "none" = no oscillogram; "linear" = on the original scale; "dB" = in decibels
heights: a vector of length two specifying the relative height of the spectrogram and the oscillogram (including time axes labels)
ylim: frequency range to plot, kHz (defaults to 0 to Nyquist frequency). NB: still in kHz, even if yScale = bark, mel, or ERB
yScale: scale of the frequency axis: 'linear' = linear, 'log' = logarithmic (musical), 'bark' = bark with hz2bark, 'mel' = mel with hz2mel, 'ERB' = Equivalent Rectangular Bandwidths with HzToERB
contrast: a number, recommended range -1 to +1. The spectrogram is raised to the power of exp(3 * contrast). Contrast >0 increases sharpness, <0 decreases sharpness
brightness: how much to "lighten" the image (>0 = lighter, <0 = darker)
blur: apply a Gaussian filter to blur or sharpen the image, two numbers: frequency (Hz), time (ms). A single number is interpreted as frequency, and a square filter is applied. NA / NULL / 0 means no blurring in that dimension. Negative numbers mean un-blurring (sharpening) the image by dividing instead of multiplying by the filter during convolution
maxPoints: the maximum number of "pixels" in the oscillogram (if any) and spectrogram; good for quickly plotting long audio files; defaults to c(1e5, 5e5)
padWithSilence: if TRUE, pads the sound with just enough silence to resolve the edges properly (only the original region is plotted, so the apparent duration doesn't change)
colorTheme: black and white ('bw'), as in the seewave package ('seewave'), matlab-type palette ('matlab'), or the name of any palette function such as 'heat.colors', 'cm.colors', etc. (see palette)
col: actual colors, e.g. rev(rainbow(100)); see ?hcl.colors for colors in base R (overrides colorTheme)
extraContour: a vector of arbitrary length scaled in Hz (regardless of yScale!) that will be plotted over the spectrogram (e.g. a pitch contour); can also be a list with extra graphical parameters such as lwd, col, etc. (see examples)
xlab, ylab, main, mar, xaxp: graphical parameters for plotting
grid: if numeric, adds n = grid dotted lines per kHz
width, height, units, res: graphical parameters for saving plots passed to png
internal: a long list of parameters for plotting pitch contours, passed internally by analyze()
...: other graphical parameters
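
The relationship between windowLength, overlap, and step, and the rounding mentioned under step, can be sketched in base R. The helper names below are hypothetical (they are not soundgen functions), and both the 44100 Hz sampling rate and the truncation to a whole number of samples are assumptions chosen to reproduce the 24.98866 ms figure quoted above.

```r
# Hypothetical helpers for illustration; not part of soundgen.

# Step (ms) implied by a window length (ms) and an overlap (%)
step_from_overlap <- function(windowLength, overlap) {
  windowLength * (1 - overlap / 100)
}

# Step actually achievable at a given sampling rate.
# Assumption: the step is truncated to a whole number of samples.
actual_step <- function(step, samplingRate) {
  step_points <- floor(step * samplingRate / 1000)
  step_points / samplingRate * 1000
}

step_ms <- step_from_overlap(windowLength = 50, overlap = 50)  # 25 ms
actual_ms <- actual_step(step_ms, samplingRate = 44100)        # 1102 samples
round(actual_ms, 5)  # 24.98866
```

At 44100 Hz a 25 ms step would span 1102.5 samples, which is why the realized step is slightly shorter than requested.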
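
The effect of dynamicRange and contrast can be illustrated on a toy matrix. This is a minimal sketch, not the package's implementation: the helper names are hypothetical, and it assumes a linear amplitude scale (so decibels are 20 * log10 of a ratio) and a spectrogram scaled to [0, 1] before the power transform.

```r
# Hypothetical helpers for illustration; not part of soundgen.

clip_dynamic_range <- function(spec, dynamicRange = 80) {
  # Assumption: spec holds linear amplitudes, so dB = 20 * log10(ratio)
  threshold <- max(spec) * 10^(-dynamicRange / 20)
  spec[spec < threshold] <- 0
  spec
}

apply_contrast <- function(spec, contrast = 0.2) {
  # Assumption: spec is scaled to [0, 1]; the exponent follows the
  # description of the contrast argument, exp(3 * contrast)
  spec^exp(3 * contrast)
}

spec <- matrix(c(1, 0.5, 1e-3, 1e-5), nrow = 2)
clipped <- clip_dynamic_range(spec, dynamicRange = 80)  # 1e-5 becomes 0
sharpened <- apply_contrast(clipped, contrast = 0.2)    # mid values shrink
```

With contrast = 0 the exponent is 1 and the values are unchanged; positive contrast pushes intermediate values toward zero while leaving the maximum at 1.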
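
The quantile subtraction controlled by qTime can be sketched as follows, assuming a spectrogram matrix with frequency bins in rows and frames in columns. The function name is hypothetical, and how the package treats negative values after subtraction is not stated here, so they are left as they are.

```r
# Hypothetical helper for illustration; not part of soundgen.
subtract_quantile <- function(spec, qTime = 0.5) {
  # One quantile per frequency bin (row), taken over all frames (columns)
  q <- unname(apply(spec, 1, quantile, probs = qTime))
  # Subtract each row's quantile from every frame of that row
  sweep(spec, 1, q, "-")
}

spec <- matrix(c(1, 2, 3,
                 10, 10, 10), nrow = 2, byrow = TRUE)
res <- subtract_quantile(spec, qTime = 0.5)
# Row 1 has median 2, row 2 has median 10, so a constant band is removed
```

A frequency bin with constant energy (such as steady background hum) is reduced to zero, while time-varying components survive.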
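
The interplay of percentNoise and noiseReduction can be sketched in the same way. This is an illustration under stated assumptions rather than the package's code: the helper names are hypothetical, Shannon entropy of the normalized spectrum stands in for whatever entropy measure is used internally, "exceeding the quantile" is read as the top percentNoise percent of frames by entropy, and negative values after subtraction are set to zero.

```r
# Hypothetical helpers for illustration; not part of soundgen.

# Assumption: Shannon entropy of the normalized spectrum of one frame
shannon_entropy <- function(x) {
  p <- x / sum(x)
  p <- p[p > 0]  # avoid 0 * log(0)
  -sum(p * log(p))
}

reduce_noise <- function(spec, percentNoise = 10, noiseReduction = 1) {
  entr <- apply(spec, 2, shannon_entropy)
  # Assumption: noisy frames are those above the (1 - percentNoise / 100)
  # quantile of entropy
  q <- quantile(entr, probs = 1 - percentNoise / 100)
  noisy <- which(entr >= q)
  noiseSpectrum <- rowMeans(spec[, noisy, drop = FALSE])
  # spectrum - (noiseReduction * noiseSpectrum), applied to every frame
  out <- spec - noiseReduction * noiseSpectrum
  out[out < 0] <- 0  # assumption: clip negative values at zero
  out
}

# Nine tonal frames (energy in one bin) and one flat, noise-like frame
spec <- cbind(matrix(rep(c(5, 0, 0), 9), nrow = 3), c(1, 1, 1))
denoised <- reduce_noise(spec, percentNoise = 10, noiseReduction = 1)
```

Only the flat frame is classified as noise here, so its spectrum is subtracted from every frame: the tonal peak drops from 5 to 4 and the noise-like frame is removed entirely.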