optimize_auto_detec: Optimize the detection of signals based on a-priori detections

Description

Optimize the detection of signals based on a-priori detections

Usage

optimize_auto_detec(X, Y, threshold = 10, power = 1, wl = 512, ssmooth = 0, 
hold.time = 0, mindur = NULL, maxdur = NULL, thinning = 1, parallel = 1, 
pb = TRUE, by.sound.file = FALSE, bp = NULL, path = NULL, previous.output = NULL)

Value

A data frame in which each row shows the result of a detection job with a particular combination of tuning parameters (which are included in the data frame). It also includes the following diagnostic metrics:

true.positives: number of detections that correspond to signals referenced in 'X'. Matching is defined as some degree of overlap in time. In a perfect detection routine it should be equal to the number of rows in 'X'.
false.positives: number of detections that don't match any of the signals referenced in 'X'. In a perfect detection routine it should be 0.
false.negatives: number of signals referenced in 'X' that were not detected (i.e. not matched by any detection). In a perfect detection routine it should be 0.
split.positives: number of signals referenced in 'X' that were overlapped by more than 1 detection (i.e. detections that were split). In a perfect detection routine it should be 0.
mean.duration.true.positives: mean duration of true positives (in s).
mean.duration.false.positives: mean duration of false positives (in s).
mean.duration.false.negatives: mean duration of false negatives (in s). Only included when time.diagnostics = TRUE.
proportional.duration.true.positives: ratio of total duration of true positives to the total duration of signals referenced in 'X'. In a perfect detection routine it should be 1.
sensitivity: Proportion of signals referenced in 'X' that were detected. In a perfect detection routine it should be 1.
specificity: Proportion of detections that correspond to signals referenced in 'X'. In a perfect detection routine it should be 1.
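
As a rough illustration of how the last two metrics relate to the counts listed above, the following base R sketch computes them from made-up counts. It assumes there are no split positives (so each true positive corresponds to exactly one reference signal). The variable names and values are hypothetical, and the package's internal calculation may differ.

```r
# Hypothetical counts for one parameter combination (not real results)
true_positives  <- 18  # detections overlapping a reference signal in 'X'
false_positives <- 4   # detections overlapping no reference signal
false_negatives <- 2   # reference signals with no overlapping detection

# Proportion of reference signals that were detected
# (assumes one detection per detected signal, i.e. no split positives)
sensitivity <- true_positives / (true_positives + false_negatives)

# Proportion of detections that correspond to reference signals
specificity <- true_positives / (true_positives + false_positives)

round(c(sensitivity = sensitivity, specificity = specificity), 2)
```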

Arguments

X: 'selection_table' object or a data frame with columns for sound file name (sound.files), selection number (selec), and start and end time of signal (start and end). It should contain the selections that will be used for detection optimization.
Y: Optional. An object of class 'autodetec.output' (generated by auto_detec) in which to optimize detections. Must refer to the same sound files as in 'X'. Default is NULL.
threshold: A numeric vector specifying the amplitude threshold for detecting signals (in %). Several values can be supplied for optimization.
power: A numeric vector indicating a power factor applied to the amplitude envelope. Increasing power will reduce low amplitude modulations and increase high amplitude modulations, in order to reduce background noise. Default is 1 (no change). Several values can be supplied for optimization.
wl: A numeric vector of length 1 specifying the window length used internally by ffilter for bandpass filtering (so only applied when 'bp' is supplied). Default is 512.
ssmooth: A numeric vector to smooth the amplitude envelope with a sum smooth function. Default is 0 (no smoothing). Several values can be supplied for optimization.
hold.time: Numeric vector. Specifies the time range at which selections will be merged (i.e. if 2 selections are separated by less than the specified hold.time they will be merged into a single selection). Default is 0. Several values can be supplied for optimization.
mindur: Numeric vector giving the shortest duration (in seconds) of the signals to be detected. It removes signals below that threshold. Several values can be supplied for optimization.
maxdur: Numeric vector giving the longest duration (in seconds) of the signals to be detected. It removes signals above that threshold. Several values can be supplied for optimization.
thinning: Numeric vector in the range 0 to 1 indicating the proportional reduction of the number of samples used to represent amplitude envelopes (i.e. the thinning of the envelopes). Amplitude envelopes usually have many more samples than are needed to accurately represent amplitude variation in time, which inflates the size of the output (usually very large R objects / files). Default is 1 (no thinning). Higher sampling rates may afford a larger size reduction (i.e. lower thinning values). Reduction is conducted by interpolation using approx. Note that thinning may decrease time precision: the stronger the thinning, the less precise the time detection. Several values can be supplied for optimization.
parallel: Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing).
pb: Logical argument to control progress bar and messages. Default is TRUE.
by.sound.file: Logical to control if diagnostics are calculated for each sound file independently (TRUE) or for all sound files combined (FALSE, default).
bp: Numeric vector of length 2 giving the lower and upper limits of a frequency bandpass filter (in kHz). Default is NULL.
path: Character string containing the directory path where the sound files are located. If NULL (default) then the current working directory is used. Only needed if 'Y' is not supplied.
previous.output: Data frame with the output of a previous run of this function. This will be used to include previous results in the new output and avoid recalculating detection performance for parameter combinations previously evaluated.
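
Several of the arguments above accept more than one value. A minimal base R sketch of how such values could be crossed into a set of parameter combinations is shown below. It assumes that every combination is evaluated (a full factorial grid), which this page does not state explicitly; the object name param_grid and the chosen values are only illustrative.

```r
# Candidate values for three tuning parameters (illustrative choices)
threshold <- c(5, 10, 15)
ssmooth   <- c(0, 300)
power     <- c(1, 2)

# One row per combination; this mirrors the "one row per detection job"
# layout described under Value, assuming all combinations are evaluated
param_grid <- expand.grid(
  threshold = threshold,
  ssmooth   = ssmooth,
  power     = power
)

nrow(param_grid)  # 3 * 2 * 2 combinations
head(param_grid)
```

Because the number of combinations grows multiplicatively, it may be worth starting with a few values per argument and refining around the best-performing ones.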

Author

Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).

Details

This function takes a selections data frame or 'selection_table' ('X') and the output of an auto_detec routine ('Y') and estimates the detection performance for different combinations of detection parameters. This is done by comparing the position in time of the detections to those of the reference selections in 'X'. The function returns several diagnostic metrics that allow the user to determine which parameter values provide a detection that most closely matches the selections in 'X'. Those parameters can later be used to perform a more efficient detection with auto_detec.
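
The comparison in time described above can be sketched in base R as follows. This is only an illustration of overlap-based matching for a single sound file, using made-up start and end times; the objects ref and det are hypothetical and the code is not taken from the package.

```r
# Hypothetical reference selections and detections (times in s)
ref <- data.frame(start = c(1.0, 3.0, 5.0), end = c(1.5, 3.5, 5.5))
det <- data.frame(start = c(0.9, 3.1, 3.3, 7.0), end = c(1.4, 3.2, 3.6, 7.3))

# overlap[i, j] is TRUE when detection i overlaps reference j in time
overlap <- outer(det$start, ref$end, "<") & outer(det$end, ref$start, ">")

true_positives  <- sum(rowSums(overlap) > 0)   # detections matching a reference
false_positives <- sum(rowSums(overlap) == 0)  # detections matching nothing
false_negatives <- sum(colSums(overlap) == 0)  # references never matched
split_positives <- sum(colSums(overlap) > 1)   # references hit by > 1 detection

c(true.positives = true_positives, false.positives = false_positives,
  false.negatives = false_negatives, split.positives = split_positives)
```

In this toy case the second reference is covered by two detections, so it counts as one split positive while contributing two true positives.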

References

Araya-Salas, M., & Smith-Vidaurre, G. (2017). warbleR: An R package to streamline analysis of animal acoustic signals. Methods in Ecology and Evolution, 8(2), 184-191.

Examples


{
# Save to temporary working directory
data(list = c("Phae.long1", "Phae.long2", "Phae.long3", "Phae.long4", "lbh_selec_table"))
writeWave(Phae.long1, file.path(tempdir(), "Phae.long1.wav"))
writeWave(Phae.long2, file.path(tempdir(), "Phae.long2.wav"))
writeWave(Phae.long3, file.path(tempdir(), "Phae.long3.wav"))
writeWave(Phae.long4, file.path(tempdir(), "Phae.long4.wav"))

# run auto_detec with thinning
ad <- auto_detec(output = "list", thinning = 1 / 10, ssmooth = 300, path = tempdir())

# evaluate detection performance for three amplitude thresholds
optimize_auto_detec(X = lbh_selec_table, Y = ad, threshold = c(5, 10, 15), path = tempdir())
}
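
Once the diagnostics are available, one possible way to choose among parameter combinations is to rank them by the lower of sensitivity and specificity. The sketch below uses a made-up data frame with only a subset of the columns described under Value; the values, the object names and the ranking criterion are assumptions for illustration, not a recommendation from the package.

```r
# Hypothetical output (made-up values, only a subset of the real columns)
res <- data.frame(
  threshold   = c(5, 10, 15),
  sensitivity = c(1.00, 0.95, 0.80),
  specificity = c(0.70, 0.95, 1.00)
)

# Rank combinations by the lower of the two metrics, so that a run that does
# well on only one of them is not favored (one possible criterion among many)
res$worst_metric <- pmin(res$sensitivity, res$specificity)
best <- res[order(res$worst_metric, decreasing = TRUE), ][1, ]

best$threshold
```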
