
snipEM (version 1.0.1)

stEM: Snipping and trimming for location and scatter estimation with casewise and cellwise outliers

Description

Computes an estimator optimizing the Gaussian likelihood over a snipping and trimming set.

Usage

stEM(X, V, tol = 1e-4, maxiters = 500, maxiters.S = 1000, print.it = FALSE)

Arguments

X

Data matrix, with observations in rows and variables in columns.

V

Binary matrix of the same size as X. Zeros mark the initially snipped entries; rows consisting entirely of zeros mark the initially trimmed observations.
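As an illustration of the layout described for V, the following base R sketch builds a small indicator matrix by hand. The toy values and the helper variable names (`is_trimmed`, `n_trimmed`, `n_snipped`) are made up for this example and are not part of snipEM.

```r
# Toy data: 6 observations on 3 variables (values are arbitrary).
X <- matrix(c( 0.1, -0.3,  0.2, 9.0,  0.0,  0.4,
               0.5,  0.2, -0.1, 9.5,  0.3, 40.0,
              -0.2,  0.1,  0.0, 8.7, -0.4,  0.2), nrow = 6, ncol = 3)

# Start from "keep everything", then mark what should be discarded.
V <- matrix(1, nrow(X), ncol(X))
V[4, ]  <- 0    # row 4 looks like a casewise outlier: trim the whole row
V[6, 2] <- 0    # cell (6, 2) looks like a cellwise outlier: snip it only

# Count what V encodes (illustrative bookkeeping, not a snipEM function).
is_trimmed <- rowSums(V) == 0
n_trimmed  <- sum(is_trimmed)
n_snipped  <- sum(V[!is_trimmed, , drop = FALSE] == 0)
```

Since the number of trimmed rows and snipped entries stays fixed during estimation, these two counts determine how much of the data is discarded.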

tol

Tolerance for convergence. Default is 1e-4.

maxiters

Maximum number of iterations for the SM algorithm. Default is 500.

maxiters.S

Maximum number of iterations of the inner greedy snipping algorithm. Default is 1000.

print.it

Logical; if TRUE, partial results are printed. Default is FALSE.

Value

A list with the following elements:

mu Estimated location.
S Estimated scatter matrix.
V Final (optimal) V matrix.
lik Gaussian log-likelihood at convergence.
iter Number of outer iterations before convergence.
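To make the role of V in the likelihood concrete, here is a minimal base R sketch of a Gaussian log-likelihood evaluated on the retained entries only. It assumes that snipped entries are treated as missing (each row contributes the marginal density of its retained coordinates) and that trimmed rows contribute nothing; `loglik_observed` is a hypothetical helper, not a snipEM function, and the package's `lik` may differ in constants or details.

```r
# Hypothetical helper (not part of snipEM): Gaussian log-likelihood using
# only the entries with V == 1; rows of zeros are skipped entirely.
loglik_observed <- function(X, V, mu, S) {
  total <- 0
  for (i in seq_len(nrow(X))) {
    obs <- which(V[i, ] == 1)
    if (length(obs) == 0) next                 # trimmed row
    d      <- X[i, obs] - mu[obs]              # centered retained entries
    Si     <- S[obs, obs, drop = FALSE]        # marginal scatter
    logdet <- as.numeric(determinant(Si, logarithm = TRUE)$modulus)
    quad   <- sum(d * solve(Si, d))            # squared Mahalanobis distance
    total  <- total - 0.5 * (length(obs) * log(2 * pi) + logdet + quad)
  }
  total
}

# Small usage example with one snipped cell.
X <- matrix(c(0.5, -1.0, 2.0, 0.3), 2, 2)
V <- matrix(1, 2, 2)
V[1, 2] <- 0
ll <- loglik_observed(X, V, mu = c(0, 0), S = diag(2))
```

With an identity scatter matrix the result reduces to a sum of univariate standard normal log-densities over the retained cells, which gives a quick sanity check.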

Details

This function combines the snipEM estimator of Farcomeni (2014) with trimming. Optimization over the trimming set is performed via the usual concentration steps (Rousseeuw and Van Driessen, 1999). It therefore provides a robust estimate of location and scatter in the presence of entry-wise and case-wise outliers. The number of snipped entries and trimmed rows is kept fixed throughout. V must contain at least one row of zeros (otherwise use snipEM).
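The concentration step mentioned above can be sketched in base R as follows. This is a simplified illustration of the general idea from Rousseeuw and Van Driessen (1999) for casewise trimming only; it ignores snipping and is not the code used inside stEM. The function name `c_step` and the simulated data are assumptions made for this example.

```r
# One concentration step (hypothetical helper, not the snipEM internals):
# keep the h rows closest to the current fit in Mahalanobis distance,
# then re-estimate location and scatter from those rows.
c_step <- function(X, mu, S, h) {
  d2   <- mahalanobis(X, center = mu, cov = S)
  keep <- order(d2)[seq_len(h)]
  Xk   <- X[keep, , drop = FALSE]
  m    <- colMeans(Xk)
  list(keep = sort(keep),
       mu   = m,
       S    = crossprod(sweep(Xk, 2, m)) / h)   # ML scatter estimate
}

set.seed(1)
X <- matrix(rnorm(50 * 2), 50, 2)
X[1:3, ] <- 50                 # three casewise outliers
h <- nrow(X) - 3               # number of retained rows is fixed

# Start from the non-robust estimates and iterate until the kept set is stable.
fit <- list(mu = colMeans(X), S = cov(X), keep = NULL)
for (it in 1:20) {
  new_fit   <- c_step(X, fit$mu, fit$S, h)
  converged <- identical(new_fit$keep, fit$keep)
  fit <- new_fit
  if (converged) break
}
```

Each step cannot increase the determinant of the scatter estimate, which is why iterating to a stable retained set is a reasonable stopping rule. In this toy run the three contaminated rows should end up outside `fit$keep`.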

References

Farcomeni, A. (2014) Snipping for robust k-means clustering under component-wise contamination, Statistics and Computing, 24, 909-917.

Farcomeni, A. (2014) Robust constrained clustering in presence of entry-wise outliers, Technometrics, 56, 102-111.

Rousseeuw, P. J. and Van Driessen, K. (1999) A fast algorithm for the minimum covariance determinant estimator, Technometrics, 41, 212-223.

See Also

sclust, snipEM, sumlog, ldmvnorm

Examples

# NOT RUN {
set.seed(1234)
X=matrix(rnorm(100*5),100,5)  # 100 x 5 matrix needs exactly 500 draws
X[1:5,]=50
X[6,1]=150

# initial V
V <- matrix(1, 100, 5)
V[1:5,]=0
Vtmp <- V[-c(1:5),]

# identify cells to be snipped
Vtmp[!is.na(match(as.vector(X[-c(1:5),]),boxplot(as.vector(X[-c(1:5),]),plot=FALSE)$out))] <- 0
V[-c(1:5),] <- Vtmp

resSTEM <- stEM(X, V)

# }
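The initial V from the example can also be built somewhat more readably with `boxplot.stats()` and `%in%`, which apply the same boxplot rule as `boxplot(..., plot = FALSE)$out`. The sketch below uses base R only and stops before the call to stEM; the variable name `trim` is introduced here for illustration.

```r
set.seed(1234)
X <- matrix(rnorm(100 * 5), 100, 5)
X[1:5, ] <- 50      # casewise outliers
X[6, 1]  <- 150     # cellwise outlier

trim <- 1:5         # rows to be trimmed initially
V <- matrix(1, nrow(X), ncol(X))
V[trim, ] <- 0

# Boxplot rule on the pooled values of the non-trimmed rows.
out <- boxplot.stats(as.vector(X[-trim, ]))$out

# Snip every non-trimmed cell whose value was flagged by the rule.
V[-trim, ][X[-trim, ] %in% out] <- 0
```

The resulting V has five rows of zeros, so it satisfies the requirement that at least one row be trimmed, and it can be passed to `stEM(X, V)` as in the example.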
