cellWise (version 2.5.3)

cellHandler: cellHandler algorithm

Description

This function flags cellwise outliers in X and imputes them, given robust estimates of the center mu and the scatter matrix Sigma. When these are unknown, as is typically the case, one can instead use the function DDC, which requires only the data matrix X. Alternatively, mu and Sigma can be estimated robustly from X with the function DI.
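
As a rough illustration of the second route, the sketch below estimates the center and scatter with DI and passes them to cellHandler. The simulated data, the planted outlying cells, and the object names are hypothetical. The component names `center` and `cov` are assumed from the DI help page and should be checked there. The code only runs when the cellWise package is installed.

```r
# Sketch only: everything inside the guard needs the cellWise package.
if (requireNamespace("cellWise", quietly = TRUE)) {
  library(cellWise)
  set.seed(1)

  # Hypothetical data: 100 rows, 4 correlated columns.
  n <- 100; d <- 4
  Sigma0 <- diag(d) * 0.5 + 0.5
  X <- matrix(rnorm(n * d), n, d) %*% chol(Sigma0)

  # Plant two outlying cells (illustration only).
  X[1, 2] <- 10
  X[5, 4] <- -8

  # Robust estimates from the data itself.
  # Assumption: DI returns components named `center` and `cov`.
  est <- DI(X)

  # Flag and impute cells using those estimates.
  out <- cellHandler(X, est$center, est$cov)
  head(out$indcells)
}
```

DI already reports flagged cells for the data it was fitted on, so a separate cellHandler call is mainly of interest when mu and Sigma come from one data set and are applied to another.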

Usage

cellHandler(X, mu, Sigma, quant = 0.99)

Value

A list with components:

  • Ximp
    The imputed data matrix.

  • indcells
    Indices of the cells which were flagged in the analysis.

  • indNAs
    Indices of the NAs in the data.

  • Zres
    Matrix with standardized cellwise residuals of the flagged cells. Contains zeroes in the unflagged cells.

  • Zres_denom
    Denominator of the standardized cellwise residuals.

  • cellPaths
    Matrix with the same dimensions as X, in which each row contains the path of least angle regression through the cells of that row, i.e. the order of the coordinates in the path (1 = first, 2 = second, ...).
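
The index components can be turned into row and column positions with base R. The sketch below uses a hypothetical `indcells` vector rather than real cellHandler output, and assumes the indices are column-major linear indices into X, as is usual for R matrices.

```r
# Hypothetical 2 x 3 data matrix and flagged-cell indices
# (illustration only, not actual cellHandler output).
X <- rbind(c(0.5, 1.0, 5.0), c(-3.0, 0.0, 1.0))
indcells <- c(2, 5)  # assumed column-major linear indices

# Convert linear indices to (row, column) pairs.
flagged <- arrayInd(indcells, .dim = dim(X))
colnames(flagged) <- c("row", "col")
flagged

# Logical mask with the same shape as X; TRUE marks a flagged cell.
W <- matrix(FALSE, nrow = nrow(X), ncol = ncol(X))
W[indcells] <- TRUE
W
```

The same conversion applies to `indNAs` under the same assumption.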

Arguments

X

X is the input data, and must be an \(n\) by \(d\) matrix or a data frame.

mu

An estimate of the center of the data.

Sigma

An estimate of the covariance matrix of the data.

quant

Cutoff used in the detection of cellwise outliers. Defaults to 0.99.
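
This page does not say how quant is turned into a threshold. A common convention in cellwise outlier detection, assumed here and not confirmed by this page, is to read it as a probability for the chi-square distribution with one degree of freedom. Under that assumption, the default corresponds to the following threshold on an absolute standardized residual:

```r
# Assumption: quant is a probability for a chi-square(1) quantile.
quant <- 0.99

# Threshold on the squared standardized residual.
cutoff_sq <- qchisq(quant, df = 1)

# Equivalent threshold on the absolute standardized residual.
cutoff <- sqrt(cutoff_sq)
cutoff

# The same value written as a two-sided normal quantile.
qnorm(1 - (1 - quant) / 2)
```

Under this reading, raising quant toward 1 flags fewer cells and lowering it flags more.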

Author

J. Raymaekers and P.J. Rousseeuw

References

J. Raymaekers and P.J. Rousseeuw (2020). Handling cellwise outliers by sparse regression and robust covariance. Journal of Data Science, Statistics, and Visualisation. doi:10.52933/jdssv.v1i3.18 (open access)

See Also

DI

Examples

library(cellWise)

mu <- rep(0, 3) # center
Sigma <- diag(3) * 0.1 + 0.9 # scatter: unit variances, correlations 0.9
X <- rbind(c(0.5, 1.0, 5.0), c(-3.0, 0.0, 1.0))
n <- nrow(X); d <- ncol(X)
out <- cellHandler(X, mu, Sigma)
Xres <- X - out$Ximp # unstandardized residual
mean(abs(as.vector(Xres - out$Zres * out$Zres_denom))) # 0
W <- matrix(0, nrow = n, ncol = d) # weight matrix
W[out$Zres != 0] <- 1 # 1 indicates cells that were flagged
# For more examples, we refer to the vignette:
if (FALSE) {
  vignette("DI_examples")
}
