generateHarmonics: Generate harmonics

Description

Internal soundgen function.

Usage

generateHarmonics(
  pitch,
  glottis = 0,
  attackLen = 50,
  nonlinBalance = 0,
  nonlinDep = 50,
  nonlinRandomWalk = NULL,
  jitterDep = 0,
  jitterLen = 1,
  vibratoFreq = 5,
  vibratoDep = 0,
  shimmerDep = 0,
  shimmerLen = 1,
  creakyBreathy = 0,
  rolloff = -9,
  rolloffOct = 0,
  rolloffKHz = 0,
  rolloffParab = 0,
  rolloffParabHarm = 3,
  rolloff_perAmpl = 0,
  rolloffExact = NULL,
  temperature = 0.025,
  pitchDriftDep = 0.5,
  pitchDriftFreq = 0.125,
  amplDriftDep = 1,
  subDriftDep = 4,
  rolloffDriftDep = 3,
  randomWalk_trendStrength = 0.5,
  shortestEpoch = 300,
  subFreq = 100,
  subDep = 0,
  ampl = NA,
  normalize = TRUE,
  interpol = c("approx", "spline", "loess")[3],
  overlap = 75,
  samplingRate = 16000,
  pitchFloor = 75,
  pitchCeiling = 3500,
  pitchSamplingRate = 3500,
  dynamicRange = 80
)

Arguments

pitch

a contour of fundamental frequency (numeric vector). NB: for computational efficiency, provide the pitch contour at a reduced sampling rate pitchSamplingRate, e.g. 3500 points/s. The pitch contour will be upsampled before synthesis.
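
To illustrate the upsampling step, here is a minimal base-R sketch. It assumes plain linear interpolation with approx(); the interpolation method actually used inside soundgen is not stated here, and the variable names dur_s, n_out and pitch_up are made up for this illustration.

```r
# Sketch only: linear upsampling of a pitch contour (not soundgen's internal code)
pitchSamplingRate <- 3500                    # rate of the supplied contour, points/s
samplingRate <- 16000                        # audio sampling rate, Hz
pitch <- seq(400, 530, length.out = 1500)    # f0 contour, Hz

dur_s <- length(pitch) / pitchSamplingRate   # duration implied by the contour, s
n_out <- round(dur_s * samplingRate)         # number of audio samples needed
# approx() returns n_out equally spaced points spanning the original contour
pitch_up <- approx(x = seq_along(pitch), y = pitch, n = n_out)$y
```

A contour of 1500 points at 3500 points/s covers about 0.43 s, so the upsampled contour has one f0 value per audio sample over the same duration.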

glottis

anchors for specifying the proportion of a glottal cycle with closed glottis, % (0 = no modification, 100 = closed phase as long as open phase); numeric vector or dataframe specifying time and value (anchor format)

attackLen

duration of fade-in / fade-out at each end of syllables and noise (ms): a vector of length 1 (symmetric) or 2 (separately for fade-in and fade-out)
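
The effect of attackLen can be pictured with the following base-R sketch. It assumes a simple linear fade; soundgen may use a different fade shape, and the names gain, n_in, n_out and faded are hypothetical.

```r
# Sketch only: linear fade-in / fade-out (the fade shape is an assumption)
samplingRate <- 16000
attackLen <- rep_len(c(50, 100), 2)   # ms: fade-in, fade-out (length 1 is recycled)
wave <- sin(2 * pi * 440 * (0:7999) / samplingRate)   # 0.5 s test tone

n_in <- round(attackLen[1] / 1000 * samplingRate)     # samples in the fade-in
n_out <- round(attackLen[2] / 1000 * samplingRate)    # samples in the fade-out
gain <- rep(1, length(wave))
gain[seq_len(n_in)] <- seq(0, 1, length.out = n_in)
gain[length(wave) - n_out + seq_len(n_out)] <- seq(1, 0, length.out = n_out)
faded <- wave * gain                  # silent at both ends, untouched in the middle
```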

nonlinBalance

hyperparameter for regulating the (approximate) proportion of sound with different regimes of pitch effects (none / subharmonics only / subharmonics and jitter). 0% = no nonlinear effects; 100% = the entire sound has jitter + subharmonics. Ignored if temperature = 0

nonlinDep

hyperparameter for regulating the intensity of subharmonics and jitter, 0 to 100% (50% = jitter and subharmonics are as specified, <50% weaker, >50% stronger). Ignored if temperature = 0

nonlinRandomWalk

a numeric vector specifying the timing of nonlinear regimes: 0 = none, 1 = subharmonics, 2 = subharmonics + jitter + shimmer

jitterDep

cycle-to-cycle random pitch variation, semitones (anchor format)

jitterLen

duration of stable periods between pitch jumps, ms. Use a low value for harsh noise, a high value for irregular vibrato or shaky voice (anchor format)

vibratoFreq

the rate of regular pitch modulation, or vibrato, Hz (anchor format)

vibratoDep

the depth of vibrato, semitones (anchor format)
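
Since vibratoDep is given in semitones, one plausible reading is a sinusoidal modulation of f0 on a logarithmic scale. The base-R sketch below works under that assumption; it is not taken from the soundgen source, and time_s and f0_vib are illustrative names.

```r
# Sketch only: sinusoidal vibrato with depth in semitones (assumed formula)
samplingRate <- 16000
vibratoFreq <- 5                          # Hz
vibratoDep <- 0.5                         # semitones
time_s <- (0:15999) / samplingRate        # 1 s of audio-rate time stamps
f0 <- rep(440, length(time_s))            # flat pitch contour, Hz

# 12 semitones = 1 octave, so a shift of d semitones multiplies f0 by 2^(d / 12)
f0_vib <- f0 * 2 ^ (vibratoDep * sin(2 * pi * vibratoFreq * time_s) / 12)
```

With these values f0 swings between roughly 427 and 453 Hz five times per second.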

shimmerDep

random variation in amplitude between individual glottal cycles (0 to 100% of original amplitude of each cycle) (anchor format)

shimmerLen

duration of stable periods between amplitude jumps, ms. Use a low value for harsh noise, a high value for shaky voice (anchor format)

creakyBreathy

hyperparameter for a rough adjustment of voice quality from creaky (-1) to breathy (+1); 0 = no change

rolloff

basic rolloff from lower to upper harmonics, dB/octave (exponential decay). All rolloff parameters are in anchor format. See getRolloff for more details
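
As a rough illustration of what a rolloff in dB/octave means, the base-R sketch below computes harmonic strengths for the default rolloff = -9 alone. It ignores all other rolloff parameters and is not the getRolloff implementation; gain_dB and gain_lin are illustrative names.

```r
# Sketch only: basic exponential decay of harmonics, all other rolloff terms ignored
rolloff <- -9                       # dB/octave
h <- 1:8                            # harmonic numbers (h = 1 is f0)
gain_dB <- rolloff * log2(h)        # harmonic h lies log2(h) octaves above f0
gain_lin <- 10 ^ (gain_dB / 20)     # convert dB to linear amplitude
```

Each doubling of the harmonic number lowers the amplitude by a further 9 dB, so harmonics 2, 4 and 8 sit 9, 18 and 27 dB below f0.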

rolloffOct

basic rolloff changes from lower to upper harmonics (regardless of f0) by rolloffOct dB/oct. For example, we can get steeper rolloff in the upper part of the spectrum

rolloffKHz

rolloff changes linearly with f0 by rolloffKHz dB/kHz. For example, -6 dB/kHz gives a 6 dB steeper basic rolloff as f0 goes up by 1000 Hz

rolloffParab

an optional quadratic term affecting only the first rolloffParabHarm harmonics. The middle harmonic of the first rolloffParabHarm harmonics is amplified or dampened by rolloffParab dB relative to the basic exponential decay

rolloffParabHarm

the number of harmonics affected by rolloffParab

rolloff_perAmpl

as amplitude goes down from max to -dynamicRange, rolloff increases by rolloff_perAmpl dB/octave. The effect is to make loud parts brighter by increasing energy in higher frequencies

rolloffExact

user-specified exact strength of harmonics: a vector or matrix with one row per harmonic, scale 0 to 1 (overrides all other rolloff parameters)

temperature

hyperparameter for regulating the amount of stochasticity in sound generation

pitchDriftDep

scale factor regulating the effect of temperature on the amount of slow random drift of f0 (like jitter, but slower): the higher, the more f0 "wiggles" at a given temperature

pitchDriftFreq

scale factor regulating the effect of temperature on the frequency of random drift of f0 (like jitter, but slower): the higher, the faster f0 "wiggles" at a given temperature

randomWalk_trendStrength

a value from 0 to 1: the higher, the more likely the random walk is to be high in the middle and low at the beginning and end (i.e. the effect is strongest in the middle of a sound)

shortestEpoch

minimum duration of each epoch with unchanging subharmonics regime, in ms

subFreq

target frequency of subharmonics, Hz (lower than f0, adjusted dynamically so f0 is always a multiple of subFreq) (anchor format)
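
One way to keep f0 an integer multiple of the subharmonic frequency is to round the ratio f0 / subFreq and divide f0 by it. The base-R sketch below assumes this rounding rule, which may differ from what soundgen does internally; ratio and subFreq_adj are illustrative names.

```r
# Sketch only: adjust the target subFreq so that f0 is an integer multiple of it
f0 <- c(440, 470, 530)                   # Hz, three points of a pitch contour
subFreq <- 100                           # target subharmonic frequency, Hz
ratio <- pmax(1, round(f0 / subFreq))    # nearest whole number, at least 1
subFreq_adj <- f0 / ratio                # adjusted subharmonic frequency, Hz
```

The adjusted frequency stays close to the 100 Hz target while tracking f0, so the subharmonics always line up with the harmonic stack.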

subDep

the width of subharmonic band, Hz. Regulates how quickly the strength of subharmonics fades as they move away from harmonics in f0 stack (anchor format)

ampl

amplitude envelope (dB, 0 = max amplitude) (anchor format)

normalize

if TRUE, normalizes to -1...+1 prior to applying attack and amplitude envelope. Without this, sounds with stronger harmonics are louder

interpol

the method of smoothing envelopes based on provided anchors: 'approx' = linear interpolation, 'spline' = cubic spline, 'loess' (default) = polynomial local smoothing function. NB: this does not affect contours for "noise", "glottal", and the smoothing of formants

overlap

FFT window overlap, %. For allowed values, see istft

samplingRate

sampling frequency, Hz

pitchFloor

lower bound of f0, Hz

pitchCeiling

upper bound of f0, Hz

pitchSamplingRate

sampling frequency of the pitch contour only, Hz. Low values reduce processing time. Set to pitchCeiling for optimal speed or to samplingRate for optimal quality

dynamicRange

dynamic range, dB. Harmonics and noise more than dynamicRange under maximum amplitude are discarded to save computational resources
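
To give a feel for the cutoff, the base-R sketch below converts dynamicRange to a linear amplitude threshold and applies it to a plain -9 dB/octave rolloff. The rolloff formula is a simplification and the names threshold, gain_lin and keep are hypothetical.

```r
# Sketch only: which harmonics would survive an 80 dB dynamic range
dynamicRange <- 80                               # dB
threshold <- 10 ^ (-dynamicRange / 20)           # linear amplitude relative to max
rolloff <- -9                                    # dB/octave, simplified rolloff
h <- 1:1000                                      # candidate harmonic numbers
gain_lin <- 10 ^ (rolloff * log2(h) / 20)        # linear strength of each harmonic
keep <- gain_lin >= threshold                    # TRUE = worth synthesizing
```

Harmonics for which keep is FALSE are more than 80 dB below the strongest one and could be skipped without an audible difference.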

Details

Returns one continuous, unfiltered, voiced syllable consisting of several sine waves.
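
The idea of a voiced syllable built from several sine waves can be sketched in base R as simple additive synthesis. This is an illustration under simplifying assumptions (fixed harmonic strengths, no jitter, shimmer, vibrato or subharmonics) and not the algorithm used by generateHarmonics; phase, amps and wave are illustrative names.

```r
# Sketch only: additive synthesis of a harmonic stack with a gliding f0
samplingRate <- 16000
n <- 8000                                      # 0.5 s of audio
f0 <- seq(400, 530, length.out = n)            # audio-rate pitch contour, Hz
phase <- 2 * pi * cumsum(f0) / samplingRate    # integrate frequency to get phase
amps <- c(.2, .2, 1, .2, .2)                   # strength of harmonics 1 to 5

wave <- numeric(n)
for (h in seq_along(amps)) {
  # skip harmonics that would exceed the Nyquist frequency
  if (max(f0) * h < samplingRate / 2) {
    wave <- wave + amps[h] * sin(h * phase)
  }
}
wave <- wave / max(abs(wave))                  # normalize to -1...+1
```

Integrating f0 with cumsum() rather than multiplying by time keeps the phase continuous while the pitch changes, which avoids clicks in the output.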

Examples

# NOT RUN {
library(soundgen)  # for spectrogram() and playme()

# Static rolloff: a vector giving the strength of each harmonic (0 to 1)
rolloffExact1 = c(.2, .2, 1, .2, .2)
s1 = soundgen:::generateHarmonics(pitch = seq(400, 530, length.out = 1500),
                       rolloffExact = rolloffExact1)
spectrogram(s1, 16000, ylim = c(0, 4))
# playme(s1, 16000)

# Dynamic rolloff: a matrix with one row per harmonic, one column per time point
rolloffExact2 = matrix(c(.2, .2, 1, .2, .2,
                         1, .5, .2, .1, .05), ncol = 2)
s2 = soundgen:::generateHarmonics(pitch = seq(400, 530, length.out = 1500),
                       rolloffExact = rolloffExact2)
spectrogram(s2, 16000, ylim = c(0, 4))
# playme(s2, 16000)
# }
