Test on device-events using the mean-shift changepoint method originally described in Xu et al. (2015).
cp_mean(df, ...)

# S3 method for mds_ts
cp_mean(df, ts_event = c(Count = "nA"), analysis_of = NA, ...)
# S3 method for default
cp_mean(
df,
analysis_of = NA,
eval_period = NULL,
alpha = 0.05,
cp_max = 100,
min_seglen = 6,
epochs = NULL,
bootstrap_iter = 1000,
replace = T,
zero_rate = 1/3,
...
)
df: Required input data frame of class mds_ts or, for generic usage, any data frame with the following columns (see the sketch after this argument list):
time: Unique times of class Date
event: Either the event count or rate of class numeric
...: Further arguments passed onto cp_mean methods
ts_event: Required if df is of class mds_ts. Named string indicating the variable corresponding to the event count or rate. Rate must be calculated in a separate column in df, as it is not calculated by default. The name of the string is an English description of what was analyzed.
Default: c("Count"="nA") corresponding to the event count column in mds_ts objects. Name is generated from mds_ts metadata.
Example: c("Rate of Bone Filler Events in Canada"="rate")
analysis_of: Optional string indicating the English description of what was analyzed. If specified, this will override the name of the ts_event string parameter.
Default: NA indicates no English description for plain df data frames, or the ts_event English description for df data frames of class mds_ts.
Example: "Rate of bone cement leakage"
eval_period: Optional positive integer indicating the number of unique times, counting in reverse chronological order, to assess. This will be used to establish the process mean and moving range.
Default: NULL considers all times in df.
alpha: Alpha or Type-I error rate for detection of a changepoint, in the range (0, 1).
Default: 0.05 detects a changepoint at an alpha level of 0.05 or 5%.
cp_max: Maximum number of changepoints detectable. This supersedes the theoretical max set by epochs.
Default: 100 detects up to a maximum of 100 changepoints.
min_seglen: Minimum required length of consecutive measurements without a changepoint in order to test for an additional changepoint within.
Default: 6 requires a minimum of 6 consecutive measurements.
epochs: Maximum number of epochs allowed in the iterative search for changepoints, where 2^epochs is the theoretical max changepoints findable. Within each epoch, all measurement segments with a minimum of min_seglen measurements are tested for a changepoint until no additional changepoints are found.
Default: NULL estimates max epochs from the number of observations or measurements in df and min_seglen.
bootstrap_iter: Number of bootstrap iterations for constructing the null distribution of means. The lowest recommended value is 1000. Increasing iterations also increases p-value precision.
Default: 1000 uses 1000 bootstrap iterations.
replace: When sampling for the bootstrap, perform sampling with or without replacement. Unless your df contains many measurements, and definitely more than bootstrap_iter, it makes the most sense to set this to TRUE.
Default: T constructs bootstrap samples with replacement.
zero_rate: Required maximum proportion of events in df (constrained by eval_period) containing zeroes for this algorithm to run. Because mean-shift changepoint does not perform well on time series with many 0 values, a value >0 is recommended.
Default: 1/3 requires no more than 1/3 zeroes among events in df in order to run.
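As a sketch of how several of these arguments might be combined on a generic data frame, consider the following. The column construction and the parameter values are illustrative only and are not recommendations from this documentation.

# Illustrative sketch: a generic data frame with the documented columns
df <- data.frame(
  time  = seq(as.Date("2020-01-01"), by = "month", length.out = 36),
  event = as.integer(stats::rnorm(36, 100, 25))
)
# Illustrative call overriding several defaults
a <- cp_mean(
  df,
  analysis_of    = "Monthly event count",
  eval_period    = 24,     # assess only the 24 most recent times
  alpha          = 0.01,   # stricter Type-I error rate
  min_seglen     = 8,      # longer minimum changepoint-free segment
  bootstrap_iter = 5000,   # more iterations for finer p-value precision
  zero_rate      = 0.2     # allow at most 20% zeroes among event values
)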
A named list of class mdsstat_test
object, as follows:
Name of the test run
English description of what was analyzed
Named boolean of whether the test was run. The name contains the run status.
A standardized list of test run results: statistic for the test statistic, lcl and ucl for the 95% confidence bounds, p for the p-value, signal status, and signal_threshold.
The test parameters
The data on which the test was run
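For orientation, the returned object might be inspected as in the sketch below. The sub-element names statistic, lcl, ucl, p, signal, and signal_threshold come from the description above; the assumption that they sit under a top-level element named result is not stated here and should be verified with str().

# Sketch: inspecting a cp_mean() result
data <- data.frame(time = c(1:25), event = as.integer(stats::rnorm(25, 100, 25)))
a1 <- cp_mean(data)
str(a1)  # full structure of the returned mdsstat_test list
# Assuming the standardized results sit under an element named "result":
# a1$result$p         p-value
# a1$result$signal    signal status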
mds_ts: Mean-shift changepoint on mds_ts data
default: Mean-shift changepoint on general data
Function cp_mean() is an implementation of the mean-shift changepoint method originally proposed by Xu et al. (2015), based on testing the mean-centered absolute cumulative sum against a bootstrap null distribution. This algorithm defines a signal as any changepoint found within the last/most recent n=min_seglen measurements of df.
The parameters in this implementation can be interpreted as follows. Changepoints are detected at an alpha level based on n=bootstrap_iter bootstrap iterations (with or without replacement using replace) of the input time series df. A minimum of n=min_seglen consecutive measurements without a changepoint is required to test for an additional changepoint. Both epochs and cp_max constrain the maximum possible number of changepoints detectable as follows: within each epoch, each segment of consecutive measurements at least n=min_seglen measurements long is tested for a changepoint, until no additional changepoints are found.
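The underlying idea can be sketched in a few lines of R. The sketch below is a simplified, single-segment illustration of testing the mean-centered absolute cumulative sum against a bootstrap null distribution; it is not the package's implementation and ignores min_seglen, epochs, cp_max, and the zero_rate check.

# Simplified sketch of a single mean-shift changepoint test
cusum_stat <- function(x) max(abs(cumsum(x - mean(x))))
cp_bootstrap_sketch <- function(x, bootstrap_iter = 1000, replace = TRUE) {
  observed  <- cusum_stat(x)
  null_dist <- replicate(
    bootstrap_iter,
    cusum_stat(sample(x, length(x), replace = replace))
  )
  p  <- mean(null_dist >= observed)           # bootstrap p-value
  cp <- which.max(abs(cumsum(x - mean(x))))   # most likely changepoint index
  list(statistic = observed, p = p, changepoint = cp)
}
set.seed(1)
x <- c(stats::rnorm(15, 100, 5), stats::rnorm(10, 130, 5))  # mean shift after point 15
cp_bootstrap_sketch(x)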
Xu, Zhiheng, et al. "Signal detection using change point analysis in postmarket surveillance." Pharmacoepidemiology and Drug Safety 24.6 (2015): 663-668.
# Basic Example
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
a1 <- cp_mean(data)
# Example using an mds_ts object
a2 <- cp_mean(mds_ts[[3]])
# Example using a derived rate as the "event"
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
a3 <- cp_mean(data, c(Rate="rate"))