LRCUSUM.runlength: Run length computation of a CUSUM detector

Description

Compute run length for a count data or categorical CUSUM. The computations are based on a Markov representation of the likelihood ratio based CUSUM.

Usage

LRCUSUM.runlength(mu,mu0,mu1,h,dfun, n, g=5,outcomeFun=NULL,...)

Arguments

$k-1 \times T$ matrix with true proportions, i.e. equal to mu0 or mu1 if one wants to compute e.g. $ARL_0$ or $ARL_1$.

mu0

$k-1 \times T$ matrix with in-control proportions

mu1

$k-1 \times T$ matrix with out-of-control proportion

The threshold h which is used for the CUSUM.

dfun

The probability mass function or density used to compute the likelihood ratios of the CUSUM. In a negative binomial CUSUM this is dnbinom, in a binomial CUSUM dbinom and in a multinomial CUSUM dmultinom.

Vector of length $T$ containing the total number of experiments for each time point.

The number of levels to cut the state space into when performing the Markov chain approximation. Sometimes also denoted $M$. Note that the quality of the approximation depends very much on $g$. If $T$ greater than, say, 50 its necessary to i

outcomeFun

A hook function to compute all possible outcome states to compute the likelihood ratio for. If NULL then the default function outcomeFunStandard(k,n) is used. This function uses the Cartesian product of 0:n

...

Additional arguments to send to dfun.

Value

A list with five components
PAn array of $g+2 \times g+2$ transition matrices of the approximation Markov chain.
pmfProbability mass function (up to length $T$) of the run length variable.
cdfCumulative density function (up to length $T$) of the run length variable.
arlIf the model is time homogenous (i.e. if $T==1$) then the ARL is computed based on the stationary distribution of the Markov chain. See the eqns in the reference for details. Note: If the model is not time homogeneous then the function returns NA and the ARL has to be approximated manually from the output. One could use sum(1:length(pmf) * pmf), which is an approximation because of using a finite support for a sum which should be from 1 to infinity.

encoding

latin1

Details

Brook and Evans (1972) formulated an approximate approach based on Markov chains to determine the PMF of the run length of a time-constant CUSUM detector. They describe the dynamics of the CUSUM statistic by a Markov chain with a discretized state space of size $g+2$. This is adopted to the time varying case in H�hle{Hoehle} (2010) and implemented in R using the ...notation such that it works for a very large class of distributions.

References

H�hle, M. (2010), Changepoint detection in categorical time series, Book chapter in T. Kneib and G. Tutz (Eds.), Statistical Modelling and Regression Structures - Festschrift in Honour of Ludwig Fahrmeir, Springer, pp. 377-397. Preprint available as http://www.stat.uni-muenchen.de/~hoehle/pubs/hoehle2010-preprint.pdf

H�hle, M. and Mazick, A. (2009), Aberration detection in R illustrated by Danish mortality monitoring, Book chapter to appear in T. Kass-Hout and X. Zhang (Eds.) Biosurveillance: A Health Protection Priority, CRCPress. Preprint available as http://www.stat.uni-muenchen.de/~hoehle/pubs/hoehle_mazick2009-preprint.pdf Brook, D. and Evans, D. A. (1972), An approach to the probability distribution of Cusum run length, Biometrika, 59:3, pp. 539--549.

Examples

Run this code

######################################################
#Run length of a time constant negative binomial CUSUM
######################################################

#In-control and out of control parameters
mu0 <- 10
alpha <- 1/2
kappa <- 2

#Density for comparison in the negative binomial distribution
dY <- function(y,mu,log=FALSE, alpha, ...) {
  dnbinom(y, mu=mu, size=1/alpha, log=log)
}

#In this case "n" is the maximum value to investigate the LLR for
#It is assumed that beyond n the LLR is too unlikely to be worth
#computing.
LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=5,
  dfun = dY, n=rep(100,length(mu0)), alpha=alpha)

h.grid <- seq(3,6,by=0.3)
arls <- sapply(h.grid, function(h) {
  LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=h,
  dfun = dY, n=rep(100,length(mu0)), alpha=alpha,g=20)$arl
})
plot(h.grid, arls,type="l",xlab="threshold h",ylab=expression(ARL[0]))


if (surveillance.options("allExamples"))
{
  ######################################################
  #Run length of a time varying negative binomial CUSUM
  ######################################################

  mu0 <- matrix(5*sin(2*pi/52 * 1:200) + 10,ncol=1)
  
  rl <- LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=2,
    dfun = dY, n=rep(100,length(mu0)), alpha=alpha,g=20)

  plot(1:length(mu0),rl$pmf,type="l",xlab="t",ylab="PMF")
  plot(1:length(mu0),rl$cdf,type="l",xlab="t",ylab="CDF")
}


########################################################
# Further examples contain the binomial, beta-binomial
# and multinomial CUSUMs. Hopefully, these will be added
# in the future.
########################################################

Run the code above in your browser using DataLab