Learn R Programming

soundgen (version 1.7.0)

filterSoundByMS: Filter sound by modulation spectrum

Description

Manipulates the modulation spectrum (MS) of a sound so as to remove certain frequencies of amplitude modulation (AM) and frequency modulation (FM). Algorithm: produces a modulation spectrum with modulationSpectrum, modifies it with filterMS, converts the modified MS to a spectrogram with msToSpec, and finally inverts the spectrogram with invertSpectrogram, thus producing a sound with (approximately) the desired characteristics of the MS. Note that the last step of inverting the spectrogram introduces some noise, so the resulting MS is not precisely the same as the intermediate filtered version. In practice this means that some residual energy will still be present in the filtered-out frequency range (see examples).

Usage

filterSoundByMS(
  x,
  samplingRate = NULL,
  logSpec = FALSE,
  windowLength = 25,
  step = NULL,
  overlap = 80,
  wn = "hamming",
  zp = 0,
  amCond = NULL,
  fmCond = NULL,
  jointCond = NULL,
  action = c("remove", "preserve")[1],
  initialPhase = c("zero", "random", "spsi")[3],
  nIter = 50,
  play = FALSE,
  plot = TRUE,
  savePath = NA
)

Arguments

x

folder, path to a wav/mp3 file, a numeric vector representing a waveform, or a list of numeric vectors

samplingRate

sampling rate of x (only needed if x is a numeric vector, rather than an audio file). For a list of sounds, give either one samplingRate (the same for all) or as many values as there are input files

logSpec

if TRUE, the spectrogram is log-transformed prior to taking 2D FFT

windowLength

length of FFT window, ms

step

you can override overlap by specifying FFT step, ms

overlap

overlap between successive FFT frames, %

wn

window type: gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

zp

window length after zero padding, points

amCond

character strings with valid conditions on amplitude and frequency modulation (see examples)

fmCond

character strings with valid conditions on amplitude and frequency modulation (see examples)

jointCond

character string with a valid joint condition amplitude and frequency modulation

action

should the defined AM-FM region be removed ('remove') or preserved, while everything else is removed ('preserve')?

initialPhase

initial phase estimate: "zero" = set all phases to zero; "random" = Gaussian noise; "spsi" (default) = single-pass spectrogram inversion (Beauregard et al., 2015)

nIter

the number of iterations of the GL algorithm (Griffin & Lim, 1984), 0 = don't run

play

if TRUE, plays back the output

plot

if TRUE, produces a triple plot: original MS, filtered MS, and the MS of the output sound

savePath

if a valid path is specified, a plot is saved in this folder (defaults to NA)

Value

Returns the filtered audio as a numeric vector normalized to [-1, 1] with the same sampling rate as input.

See Also

invertSpectrogram filterMS

Examples

Run this code
# NOT RUN {
# Create a sound to be filtered
samplingRate = 16000
s = soundgen(pitch = rnorm(n = 20, mean = 200, sd = 25),
  amFreq = 25, amDep = 50, samplingRate = samplingRate,
  addSilence = 50, plot = TRUE, osc = TRUE)
# playme(s, samplingRate)

# Filter
s_filt = filterSoundByMS(s, samplingRate = samplingRate,
  amCond = 'abs(am) > 15', fmCond = 'abs(fm) > 5',
  action = 'remove', nIter = 10)
# playme(s_filt, samplingRate)
# }
# NOT RUN {
# Download an example - a bit of speech (sampled at 16000 Hz)
download.file('http://cogsci.se/soundgen/audio/speechEx.wav',
              destfile = '~/Downloads/speechEx.wav')  # modify as needed
target = '~/Downloads/speechEx.wav'
samplingRate = tuneR::readWave(target)@samp.rate
playme(target, samplingRate)
spectrogram(target, samplingRate = samplingRate, osc = TRUE)

# Remove AM above 3 Hz from a bit of speech (remove most temporal details)
s_filt1 = filterSoundByMS(target, samplingRate = samplingRate,
  amCond = 'abs(am) > 3', nIter = 15)
playme(s_filt1, samplingRate)
spectrogram(s_filt1, samplingRate = samplingRate, osc = TRUE)

# Remove slow AM/FM (prosody) to achieve a "robotic" voice
s_filt2 = filterSoundByMS(target, samplingRate = samplingRate,
  jointCond = 'am^2 + (fm*3)^2 < 300', nIter = 15)
playme(s_filt2, samplingRate)

## An alternative manual workflow w/o calling filterSoundByMS()
# This way you can modify the MS directly and more flexibly
# than with the filterMS() function called by filterSoundByMS()

# (optional) Check that the target spectrogram can be successfully inverted
spec = spectrogram(s, samplingRate, windowLength = 25, overlap = 80,
  wn = 'hanning', osc = TRUE, padWithSilence = FALSE)
s_rev = invertSpectrogram(spec, samplingRate = samplingRate,
  windowLength = 25, overlap = 80, wn = 'hamming', play = FALSE)
# playme(s_rev, samplingRate)  # should be close to the original
spectrogram(s_rev, samplingRate, osc = TRUE)

# Get modulation spectrum starting from the sound...
ms = modulationSpectrum(s, samplingRate = samplingRate, windowLength = 25,
  overlap = 80, wn = 'hanning', maxDur = Inf, logSpec = FALSE,
  power = NA, returnComplex = TRUE, plot = FALSE)$complex
# ... or starting from the spectrogram:
# ms = specToMS(spec)
image(x = as.numeric(colnames(ms)), y = as.numeric(rownames(ms)),
  z = t(log(abs(ms))))  # this is the original MS

# Filter as needed - for ex., remove AM > 10 Hz and FM > 3 cycles/kHz
# (removes f0, preserves formants)
am = as.numeric(colnames(ms))
fm = as.numeric(rownames(ms))
idx_row = which(abs(fm) > 3)
idx_col = which(abs(am) > 10)
ms_filt = ms
ms_filt[idx_row, ] = 0
ms_filt[, idx_col] = 0
image(x = as.numeric(colnames(ms_filt)), y = as.numeric(rownames(ms_filt)),
  t(log(abs(ms_filt))))  # this is the filtered MS

# Convert back to a spectrogram
spec_filt = msToSpec(ms_filt)
image(t(log(abs(spec_filt))))

# Invert the spectrogram
s_filt = invertSpectrogram(abs(spec_filt), samplingRate = samplingRate,
  windowLength = 25, overlap = 80, wn = 'hanning')
# NB: use the same settings as in "spec = spectrogram(s, ...)" above

# Compare with the original
playme(s, samplingRate)
spectrogram(s, samplingRate, osc = TRUE)
playme(s_filt, samplingRate)
spectrogram(s_filt, samplingRate, osc = TRUE)

ms_new = modulationSpectrum(s_filt, samplingRate = samplingRate,
  windowLength = 25, overlap = 80, wn = 'hanning', maxDur = Inf,
  plot = FALSE, returnComplex = TRUE)$complex
image(x = as.numeric(colnames(ms_new)), y = as.numeric(rownames(ms_new)),
  z = t(log(abs(ms_new))))
plot(as.numeric(colnames(ms)), log(abs(ms[nrow(ms) / 2, ])), type = 'l')
points(as.numeric(colnames(ms_new)), log(ms_new[nrow(ms_new) / 2, ]), type = 'l',
  col = 'red', lty = 3)
# AM peaks at 25 Hz are removed, but inverting the spectrogram adds a bit of noise
# }

Run the code above in your browser using DataLab