Test on device-events using the mean-shift changepoint method originally described in Xu et al. (2015).
cp_mean(df, ...)

# S3 method for mds_ts
cp_mean(df, ts_event = c(Count = "nA"), analysis_of = NA, ...)
# S3 method for default
cp_mean(
df,
analysis_of = NA,
eval_period = NULL,
alpha = 0.05,
cp_max = 100,
min_seglen = 6,
epochs = NULL,
bootstrap_iter = 1000,
replace = T,
zero_rate = 1/3,
...
)
df: Required input data frame of class mds_ts or, for generic
usage, any data frame with the following columns:
time: Unique times of class Date
event: Either the event count or rate of class numeric
...: Further arguments passed onto cp_mean methods
ts_event: Required if df is of class mds_ts. Named string
indicating the variable corresponding to the event count or rate. Rate must
be calculated in a separate column in df as it is not calculated by
default. The name of the string is an English description of what was
analyzed.
Default: c("Count"="nA") corresponding to the event count column in
mds_ts objects. Name is generated from mds_ts metadata.
Example: c("Rate of Bone Filler Events in Canada"="rate")
analysis_of: Optional string indicating the English description of what
was analyzed. If specified, this will override the name of the
ts_event string parameter.
Default: NA indicates no English description for plain df
data frames, or the ts_event English description for df data frames
of class mds_ts.
Example: "Rate of bone cement leakage"
eval_period: Optional positive integer indicating the number of unique times,
counting in reverse chronological order, to assess. This will be used to
establish the process mean and moving range.
Default: NULL considers all times in df.
alpha: Alpha or Type-I error rate for detection of a changepoint, in the range (0, 1).
Default: 0.05 detects a changepoint at an alpha level of 0.05 or 5%.
cp_max: Maximum number of changepoints detectable. This supersedes the
theoretical max set by epochs.
Default: 100 detects up to a maximum of 100 changepoints.
min_seglen: Minimum required length of consecutive measurements without a
changepoint in order to test for an additional changepoint within.
Default: 6 requires a minimum of 6 consecutive measurements.
epochs: Maximum number of epochs allowed in the iterative search for
changepoints, where 2^epochs is the theoretical max changepoints
findable (see the illustrative sketch after this arguments list). Within each
epoch, all measurement segments with a minimum of min_seglen
measurements are tested for a changepoint until no additional changepoints
are found.
Default: NULL estimates max epochs from the number of observations or
measurements in df and min_seglen.
bootstrap_iter: Number of bootstrap iterations for constructing the null
distribution of means. The lowest recommended value is 1000. Increasing
iterations also increases p-value precision.
Default: 1000 uses 1000 bootstrap iterations.
replace: When sampling for the bootstrap, perform sampling with or
without replacement. Unless df contains many measurements, certainly
more than bootstrap_iter, it makes the most sense to set
this to TRUE.
Default: T constructs bootstrap samples with replacement.
zero_rate: Required maximum proportion of zero values among events in df
(constrained by eval_period) for this algorithm to run. Because the
mean-shift changepoint method does not perform well on time series with
many 0 values, a value >0 is recommended (see the rough illustration after
this arguments list).
Default: 1/3 requires no more than 1/3 zeros among events in
df in order to run.
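As a rough, hypothetical illustration of two of the parameters above (not the package's exact internal computations), the following sketch shows how 2^epochs bounds the number of findable changepoints and how zero_rate gates whether the test runs at all:

# Hypothetical illustration only; the internal formulas in cp_mean() may differ
n <- 100; min_seglen <- 6
floor(log2(n / min_seglen))     # about 4 epochs for 100 measurements, since
                                # 2^epochs is the theoretical max changepoints
events <- c(0, 3, 5, 0, 2, 4, 6, 0, 1)
mean(events == 0) <= 1/3        # TRUE: exactly 1/3 zeros, so the test would still run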
A named list of class mdsstat_test object, as follows:
test_name: Name of the test run
analysis_of: English description of what was analyzed
status: Named boolean of whether the test was run. The name contains the run status.
result: A standardized list of test run results: statistic
for the test statistic, lcl and ucl for the 95%
confidence bounds, p for the p-value, signal for the signal status, and
signal_threshold.
params: The test parameters
data: The data on which the test was run
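For orientation, a minimal sketch of inspecting such an object, assuming the mdsstat package is installed and the element names follow the structure described above:

# Assumes mdsstat is installed; element names follow the structure described above
library(mdsstat)
x <- data.frame(time = 1:25, event = as.integer(stats::rnorm(25, 100, 25)))
a1 <- cp_mean(x)
a1$status           # whether the test ran, with the run status as the name
a1$result$p         # bootstrap p-value
a1$result$signal    # whether a changepoint falls in the last min_seglen measurements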
mds_ts: Mean-shift changepoint on mds_ts data
default: Mean-shift changepoint on general data
Function cp_mean() is an implementation of the mean-shift changepoint
method originally proposed by Xu et al. (2015), based on testing the
mean-centered absolute cumulative sum against a bootstrap null
distribution. This algorithm defines a signal as any changepoint found within
the last/most recent n=min_seglen measurements of df.
The parameters in this implementation can be interpreted as
follows. Changepoints are detected at an alpha level based on
n=bootstrap_iter bootstrap iterations (with or without replacement
using replace) of the input time series
df. A minimum of n=min_seglen consecutive measurements without
a changepoint is required to test for an additional changepoint. Both
epochs and cp_max constrain the maximum possible number of
detectable changepoints as follows: within each epoch, each segment of
consecutive measurements at least n=min_seglen measurements long is
tested for a changepoint, until no additional changepoints are found.
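To make the mechanics concrete, below is a minimal, self-contained R sketch of a single mean-shift changepoint test on one segment, using a mean-centered cumulative sum and a bootstrap null distribution. The function name detect_shift and the use of the CUSUM range as the test statistic are illustrative choices, not cp_mean()'s exact internals; cp_mean() applies this kind of test iteratively to every segment of at least min_seglen measurements, up to cp_max changepoints.

# Illustrative sketch only, not the cp_mean() implementation
detect_shift <- function(x, alpha = 0.05, bootstrap_iter = 1000, replace = TRUE) {
  cusum <- cumsum(x - mean(x))               # mean-centered cumulative sum
  stat <- diff(range(cusum))                 # observed CUSUM range
  # Bootstrap null: resample the series and recompute the CUSUM range
  null_stats <- replicate(bootstrap_iter, {
    xb <- sample(x, length(x), replace = replace)
    diff(range(cumsum(xb - mean(xb))))
  })
  p <- mean(null_stats >= stat)              # proportion at least as extreme
  list(changepoint = if (p < alpha) which.max(abs(cusum)) else NA_integer_,
       statistic = stat,
       p = p)
}
set.seed(123)
x <- c(rnorm(15, 100, 5), rnorm(10, 130, 5)) # mean shifts upward after time 15
detect_shift(x)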
Xu, Zhiheng, et al. "Signal detection using change point analysis in postmarket surveillance." Pharmacoepidemiology and Drug Safety 24.6 (2015): 663-668.
# NOT RUN {
# Basic Example
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
a1 <- cp_mean(data)
# Example using an mds_ts object
a2 <- cp_mean(mds_ts[[3]])
# Example using a derived rate as the "event"
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
a3 <- cp_mean(data, c(Rate="rate"))
# }