Description:

The function creates a cross correlation time series of two input data
sets. The input data is cut into overlapping snippets and, optionally,
into further smaller sub snippets for averaging the results per time
snippet. The data can be subjected to a series of optional preprocessing
steps, i.e. be aggregated, deconvolved, filtered, cut by standard
deviation, and sign-cut. The cross correlation function is calculated and
returned for a defined lag time window. The output of the function is
intended to be used as input for the function ncc_process().

Usage:
ncc_correlate(
  start,
  stop,
  ID,
  component,
  dir,
  window,
  overlap = 0,
  window_sub,
  lag,
  dt,
  deconvolve = FALSE,
  f,
  pick = FALSE,
  whiten = FALSE,
  sd,
  sign = FALSE,
  cpu,
  buffer = 0.05,
  eseis = TRUE,
  ...
)
Value:

List with correlogram matrix, time and lag vectors.

Arguments:
start: POSIXct value, start time for the data set to be analysed. If a
character string (or vector) is provided, the function will try to
convert it to POSIXct.

stop: POSIXct value, stop time for the calculation. If a character string
(or vector) is provided, the function will try to convert it to POSIXct.

ID: Character vector of length two, IDs of the station pair to be
processed. This can either be two separate stations (assuming the
components of those two stations are identical) or the same station
(assuming the components of that one station differ).

component: Character vector of length two, seismic components to be used.
If omitted, the function will use c("Z", "Z") by default.

dir: Character value, path to the directory that contains the seismic
files. See read_data for details and constraints on the data structure.

window: Numeric value, window size for the correlation calculation, in
seconds.

overlap: Numeric value, fraction of window overlap. Set to zero by
default, i.e. no overlap.

window_sub: Numeric value, size of the sub window used for averaging
(stacking) the correlation function within a window. If omitted, no
averaging will be performed.

lag: Numeric value, window size of the cross correlation function, in
seconds.

dt: Numeric value (mandatory), sampling interval to which the input data
will be aggregated. Note that the aggregation dt has to be a multiple of
the data sets' dt values. If unsure, set the value to the original
sampling interval of the input data. See details for further information.

deconvolve: Logical value, option to deconvolve the input data sets. If
set to TRUE, further parameters need to be provided, see
signal_deconvolve() for details.

f: Numeric value or vector of length two, lower and/or upper cutoff
frequencies (Hz). If two values are provided, the function assumes a
bandpass filter. See signal_filter for details.

pick: Logical value, option to remove signal snippets where erratic
seismic events can be picked in any of the signals. The supported method
is pick_stalta and requires the provision of all mandatory arguments
(sta, lta, on, off, freeze). Attention, the sta and lta lengths must be
provided in seconds, not as numbers of samples! See the call sketch after
this argument list.

whiten: Logical value, option to perform spectral whitening. Default is
FALSE (no whitening).

sd: Numeric value, option to perform standard deviation based signal
amplitude cutting. If omitted, no cutting will be performed. The value
can be any natural number, describing a multiplier of the standard
deviation magnitude of the data.

sign: Logical value, option to perform sign based cutting of the signal
amplitude. Default is FALSE (no cutting).

cpu: Numeric value, fraction of CPUs to use. If omitted, only one CPU
will be used.

buffer: Numeric value, size of the two-sided buffer that will be added to
the windowed time series to account for preprocessing artifacts. The
buffer is defined as a fraction of the window size. By default it is
0.05.

eseis: Logical value, option to return data as an eseis object, default
is TRUE.

...: Further arguments passed to the functions signal_deconvolve and
signal_filter.
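As an illustration of the pick option, here is a hypothetical call sketch
(not taken from the package documentation; it assumes the picker
arguments sta, lta, on, off and freeze are forwarded through the ...
argument, and the data directory is made up):

## hypothetical sketch: remove snippets with picked events; the STA and
## LTA window lengths are given in seconds, not samples
cc <- ncc_correlate(start = "2017-04-09 00:30:00",
                    stop = "2017-04-09 01:30:00",
                    ID = c("RUEG1", "RUEG2"),
                    component = c("Z", "Z"),
                    dir = "path/to/data/",
                    window = 600,
                    lag = 20,
                    dt = 1/10,
                    f = c(0.05, 0.1),
                    pick = TRUE,
                    sta = 0.5,     # STA length (s)
                    lta = 60,      # LTA length (s)
                    on = 3,        # trigger-on threshold
                    off = 1,       # trigger-off threshold
                    freeze = TRUE)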
Author:

Michael Dietze

Details:
The sampling interval (dt) must be defined. It is wise to choose it such
that the resulting sampling frequency (1/dt) is more than twice the
filter's upper corner frequency (f[2]). Aggregation is recommended to
improve computational efficiency, but it is mandatory if data sets of
different sampling intervals are to be analysed. In that case, it must be
possible to aggregate the data sets to the provided aggregation sampling
interval (dt), otherwise an error will arise. As an example, if the two
data sets have sampling intervals of 1/200 s and 1/500 s, the finest
possible common sampling interval is 1/100 s. See aux_commondt() for
further information.
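To illustrate the constraint, here is a minimal base R sketch
(independent of eseis, which handles this internally via aux_commondt())
that reproduces the example above:

## sketch: a common sampling frequency must divide both original
## sampling frequencies, so take their greatest common divisor
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)
f_common <- gcd(200, 500)   # 100 Hz
dt_common <- 1 / f_common   # 1/100 s; coarser multiples (1/50, 1/25, ...)
                            # are valid aggregation targets as well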
The function supports parallel processing. However, keep in mind that calculating the cross correlation functions for large data sets and large windows will consume serious amounts of memory. For example, a 24 h window of two seismic signals recorded at 200 Hz will easily use 15 GB of RAM. Combining this with parallel processing will multiply that memory demand. Therefore, it is better to think before setting too ambitious targets, and to check how the computer system statistics evolve with increasing window sizes and parallel operations.
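As a rough back-of-envelope illustration of that figure (assuming double
precision storage, 8 bytes per sample):

## sketch: raw size of one 24 h trace sampled at 200 Hz
n <- 24 * 3600 * 200    # 17,280,000 samples
n * 8 / 1024^3          # ~0.13 GB per raw trace; windowing, buffering and
                        # correlation intermediates multiply this manyfold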
Deconvolution is recommended if different hardware and setup is used at the stations to be analysed (i.e., different sensors, loggers or gain factors).
To account for biases due to brief events in the signals, the data sets
can be truncated (cut) in their amplitude. This cutting can either be
done based on the data sets' standard deviations (or a multiplier of
those standard deviations, see signal_cut() for further details), using
the argument sd = 1 for one standard deviation. Alternatively, the data
can also be cut by their sign (i.e., positive and negative values will be
converted to 1 and -1, respectively).
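A conceptual sketch of the two options for a plain numeric vector s (an
illustration only; the package applies the truncation internally via
signal_cut()):

## sketch: amplitude truncation at +/- 1 standard deviation (sd = 1)
s <- rnorm(1000)
s_cut <- pmax(pmin(s, sd(s)), -sd(s))

## sketch: sign-based cutting, reducing the signal to -1, 0 and 1
s_sign <- sign(s)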
Examples:

if (FALSE) {
## calculate correlogram
cc <- ncc_correlate(start = "2017-04-09 00:30:00",
stop = "2017-04-09 01:30:00",
ID = c("RUEG1", "RUEG2"),
dt = 1/10,
component = c("Z", "Z"),
dir = paste0(system.file("extdata",
package = "eseis"), "/"),
window = 600,
overlap = 0,
lag = 20,
deconvolve = TRUE,
sensor = "TC120s",
logger = "Cube3extBOB",
gain = 1,
f = c(0.05, 0.1),
sd = 1)
## plot output
plot_correlogram(cc)
}