ncc_correlate: Noise Cross Correlation routine

Description

The function creates a cross correlation time series of two input data sets. The input data is cut into overlapping snippets and, optionally, further smaller sub snippets for averaging the results per time snippet. The data can be made subject to a series of optional preprocessing steps, i.e. be aggregated, deconvolved, filtered, cut by standard deviation, and sign-cut. the cross correlation function is calculated and returned for a defined lag time window. The output of the function is supposed to be used as input for the function ncc_process().

Usage

ncc_correlate(
  start,
  stop,
  ID,
  component,
  dir,
  window,
  overlap = 0,
  window_sub,
  lag,
  dt,
  deconvolve = FALSE,
  f,
  pick = FALSE,
  whiten = FALSE,
  sd,
  sign = FALSE,
  cpu,
  buffer = 0.05,
  eseis = TRUE,
  ...
)

Value

List with spectrogram matrix, time and frequency vectors.

Arguments

start: POSIXct value, start time for data set to be analysed. If a character string (or vector) is provided, the function will try to convert it to POSIXct.
stop: POSIXct value, stop time for calculation. If a character string (or vector) is provided, the function will try to convert it to POSIXct.
ID: Character vector of length two, IDs of the station pair to be processed. This can either be two separate stations (assuming the components of those two stations are identical) or the same station (assuming the components of that one stations differ).
component: Character vector of length two, seismic components to be used. If omitted, the function will use c("Z", "Z") by default.
dir: Character value, path to directory that contains the seismic files. See read_data for details and constraints on the data structure.
window: Numeric value, window size for the correlation calculation, in seconds.
overlap: Numeric value, fraction of window overlap. Set to zero by default, i.e. no overlap.
window_sub: Numeric value, size of the sub window used for averaging (stacking) the correlation function within a window. If omitted, no averaging will be performed.
lag: Numeric value, window size of the cross correlation function, in seconds.
dt: Numeric mandatory value, sampling interval to which the input data will be aggregated. Note that the aggregation dt has to be a multiple of the data sets' dt values. If unsure, set value to original sampling interval of input data. See details for further information.
deconvolve: Logical value, option to deconvolve the input data sets. If set to TRUE, further parameters need to be provided, see signal_deconvolve() for details.
f: Numeric value or vector of length two, lower and/or upper cutoff frequencies (Hz). If two values are provided, the function assumes a bandpass filter. See signal_filter for details.
pick: Logical value, option to remove signal snippets where erratic seismic events can be picked in any of the signals. The supported method is pick_stalta and requires the provision of all mandatory arguments (sta, lta, on, off, freeze). Attention, the sta and lta lengths must be provided in seconds, not as the number of samples!
whiten: Logical value, option to perform spectral whitening. Default is FALSE (no whitening).
sd: Numeric value, option to perform standard deviation based signal amplitude cutting. If omitted, no cutting will be performed. The value can be any natural number, describing a multiplier of the standard deviation magnitude of the data.
sign: Logical value, option to perform sign based cutting of the signal amplitude. Default is FALSE (no cutting).
cpu: Numeric value, fraction of CPUs to use. If omitted, only one CPU will be used.
buffer: Numeric value, size of the two sided buffer that will be added to the windowed time series to account for preprocessing artifacts. The buffer is defined as the fraction of the window size. By default it is 0.05.
eseis: Logical value, option to return data as eseis object, default is TRUE.
...: Further arguments passed to the functions signal_deconvolve, signal_filter

Author

Michael Dietze

Details

The sampling interval (dt must be defined). It is wise to set it to more than twice the filter's higher corner frequency (f[2]). Aggregation is recommended to improve computational efficiency, but is mandatory if data sets of different sampling intervals are to be analysed. In that case, it must be possible to aggregate the data sets to the provided aggregation sampling interval (dt), otherwise an error will arise. As an example, if the two data sets have sampling intervals of 1/200 and 1/500, the highest possible aggregated sampling interval is 1/100. See aux_commondt() for further information.

The function supports parallel processing. However, keep in mind that calculating the cross correlation functions for large data sets and large windows will draw serious amounts of memory. For example, a 24 h window of two seismic signals recorded at 200 Hz will easily use 15 GB of RAM. Combining this with parallel processing will multiply that memory size. Therefore, it is better think before going for too high ambitions, and check how the computer system statistics evolve with increasing windows and parallel operations.

Deconvolution is recommended if different station hardware and setup is used for the stations to analyse (i.e., different sensors, loggers or gain factors).

To account for biases due to brief events in the signals, the data sets can be truncated (cut) in their amplitude. This cutting can either be done based on the data sets' standard deviations (or a multiplicator of those standard deviations, see signal_cut() for further details), using the argument sd = 1, for one standard deviation. Alternatively, the data can also be cut by their sign (i.e., positive and negative values will be converted to 1 and -1, respectively).

Examples

Run this code


if (FALSE) {

## calculate correlogram
cc <- ncc_correlate(start = "2017-04-09 00:30:00", 
                     stop = "2017-04-09 01:30:00", 
                     ID = c("RUEG1", "RUEG2"), 
                     dt = 1/10,
                     component = c("Z", "Z"), 
                     dir = paste0(system.file("extdata", 
                                              package = "eseis"), "/"), 
                     window = 600, 
                     overlap = 0, 
                     lag = 20, 
                     deconvolve = TRUE, 
                     sensor = "TC120s",
                     logger = "Cube3extBOB",
                     gain = 1,
                     f = c(0.05, 0.1), 
                     sd = 1)

## plot output
plot_correlogram(cc)

}

Run the code above in your browser using DataLab