Computes an estimator optimizing the Gaussian likelihood over a snipping set. The function snipEM.initialV can be used to perform some iterations to initialize V.
snipEM(X, V, tol = 1e-04, maxiters = 500, maxiters.S = 1000, print.it = FALSE)
snipEM.initialV(X, V, mu0, S0, maxiters.S = 100, greedy = TRUE)
X | Data.
V | Binary matrix of the same size as X. Zeros correspond to initial snipped entries.
tol | Tolerance for convergence. Default is 1e-4.
maxiters | Maximum number of iterations for the SM algorithm. Default is 500.
maxiters.S | Maximum number of iterations of the inner greedy snipping algorithm. Default is 1000 in snipEM and 100 in snipEM.initialV.
print.it | Logical; if TRUE, partial results are printed. Default is FALSE.
mu0 | Initial estimate for the mean vector that is used in the initialization stage.
S0 | Initial estimate for the covariance matrix that is used in the initialization stage.
greedy | Logical; if TRUE, perform the greedy snipping algorithm in search of the binary matrix that gives the largest likelihood value throughout maxiters.S iterations. If FALSE, stop as soon as the snipping algorithm finds a binary matrix that gives a larger likelihood value than the initial one. Default is TRUE.
A list with the following elements:
mu | Estimated location.
S | Estimated scatter matrix.
V | Final (optimal) V matrix.
lik | Gaussian log-likelihood at convergence.
iter | Number of outer iterations before convergence.
This function computes the sclust estimator of Farcomeni (2014) with \(k=1\). It therefore provides a robust estimate of location and scatter in the presence of entry-wise outliers. It is based on a snip-maximize (SM) algorithm. At the S step, the likelihood is optimized over the set of snipped entries; at the M step, the location and scatter estimates are updated. The S step is based on a greedy algorithm, unlike the one proposed in Farcomeni (2014, 2014a). The number of snipped entries, sum(1-V), is kept fixed throughout.
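The objective that the S and M steps work on can be sketched in base R. The helper below is hypothetical (snipped_loglik is not part of snipEM), and it assumes that snipped entries are simply left out of each row, so that a row contributes the Gaussian log-density of its non-snipped sub-vector. It is meant as an illustration of the idea, not as the package's internal code.

```r
# Hypothetical helper, not part of snipEM: Gaussian log-likelihood of the
# non-snipped entries (V == 1), assuming snipped entries are dropped row by row.
snipped_loglik <- function(X, V, mu, S) {
  total <- 0
  for (i in seq_len(nrow(X))) {
    keep <- which(V[i, ] == 1)
    if (length(keep) == 0) next          # a fully snipped row contributes nothing
    d  <- X[i, keep] - mu[keep]          # centered non-snipped sub-vector
    Sk <- S[keep, keep, drop = FALSE]    # matching block of the scatter matrix
    logdet <- as.numeric(determinant(Sk, logarithm = TRUE)$modulus)
    total <- total - 0.5 * (length(keep) * log(2 * pi) + logdet +
                            sum(d * solve(Sk, d)))
  }
  total
}

set.seed(1)
X <- matrix(rnorm(20), 4, 5)
V <- matrix(1, 4, 5)
V[1, 2] <- 0                             # snip a single entry
ll <- snipped_loglik(X, V, mu = rep(0, 5), S = diag(5))
```

Under this assumption, an S step would compare such values across candidate V matrices with the same sum(1-V), and an M step would update mu and S for the current V.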
Results depend on good initialization of the V matrix. A boxplot rule (see examples) usually works well. The function snipEM.initialV can be used to improve the initial choice through some iterations updating only V from initial (robust) estimates mu0 and S0. In the example, the EMVE is used to obtain mu0 and S0.
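As a rough illustration of a boxplot rule for building an initial V, the sketch below flags entries column by column using base R's boxplot.stats. The helper name boxplot_V and the column-wise choice are assumptions made here for illustration; the package example applies the rule to the pooled entries instead.

```r
# Hypothetical helper, not part of snipEM: column-wise boxplot rule.
# Entries beyond the boxplot whiskers of their column get V = 0 (snipped).
boxplot_V <- function(X) {
  V <- matrix(1, nrow(X), ncol(X))
  for (j in seq_len(ncol(X))) {
    w <- boxplot.stats(X[, j])$stats     # w[1], w[5]: lower and upper whisker ends
    V[X[, j] < w[1] | X[, j] > w[5], j] <- 0
  }
  V
}

set.seed(2)
X <- matrix(rnorm(500), 100, 5)
X[3, 2] <- 50                            # planted entry-wise outlier
V0 <- boxplot_V(X)
V0[3, 2]                                 # the planted entry is flagged for snipping
```

A matrix built this way could then be passed as the V argument of snipEM, or refined first with snipEM.initialV given robust mu0 and S0.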
Farcomeni, A. (2014) Snipping for robust k-means clustering under component-wise contamination, Statistics and Computing, 24, 909-917
Farcomeni, A. (2014) Robust constrained clustering in presence of entry-wise outliers, Technometrics, 56, 102-111
# NOT RUN {
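library(snipEM)

n <- 100
p <- 5
# simulated data: n observations on p standard normal variables
Xc <- matrix(rnorm(n * p), n, p)
# initial V via a boxplot rule: 0 marks entries flagged as outliers
V <- matrix(1, n, p)
V[!is.na(match(as.vector(Xc), boxplot(as.vector(Xc), plot = FALSE)$out))] <- 0
# copy of the data with the initially snipped entries set to NA
Xna <- Xc
Xna[which(V == 0)] <- NA
resSEM <- snipEM(Xc, V)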
# }