GSE (version 4.2-1)

HuberPairwise: Quadrant Covariance and Huberized Pairwise Scatter

Description

Computes the Quadrant Covariance (QC) or Huberized Pairwise Scatter as described in Alqallaf et al. (2002).

Usage

HuberPairwise( x, psi=c("huber","sign"), c0=1.345, computePmd=TRUE)

Value

An S4 object of class HuberPairwise-class which is a subclass of the virtual class CovRobMiss-class. The output S4 object contains the following slots:

mu: Estimated location. Can be accessed via getLocation.
S: Estimated scatter matrix. Can be accessed via getScatter.
pmd: Squared partial Mahalanobis distances. Can be accessed via getDist.
pmd.adj: Adjusted squared partial Mahalanobis distances. Can be accessed via getDistAdj.
pu: Dimension of the observed entries for each case. Can be accessed via getDim.
R: Estimated correlation matrix. Not meant to be accessed.
call: Object of class "language". Not meant to be accessed.
x: Input data matrix. Not meant to be accessed.
p: Column dimension of input data matrix. Not meant to be accessed.
estimator: Character string of the name of the estimator used. Not meant to be accessed.
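
The accessors listed above can be exercised as in the following sketch. The simulated data, the seed, and the object name `fit` are illustrative assumptions and do not come from the package documentation; the block is skipped when GSE is not installed.

```r
# Sketch only: synthetic data, guarded so it does nothing without GSE.
if (requireNamespace("GSE", quietly = TRUE)) {
  library(GSE)

  set.seed(1)                              # arbitrary seed, for reproducibility
  x <- matrix(rnorm(200 * 4), ncol = 4)    # 200 cases, 4 variables
  x[1:10, 1]  <- NA                        # a few missing entries;
  x[11:20, 2] <- NA                        # no column is entirely missing

  fit <- HuberPairwise(x, psi = "huber", c0 = 1.345, computePmd = TRUE)

  getLocation(fit)       # slot mu
  getScatter(fit)        # slot S
  head(getDist(fit))     # slot pmd
  head(getDistAdj(fit))  # slot pmd.adj
  head(getDim(fit))      # slot pu
}
```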

Arguments

x

a matrix or data frame. May contain missing values, but cannot contain columns with completely missing entries.

psi

loss function used in computing the pairwise scatter, either "huber" (the default) or "sign". Setting psi="sign" yields QC.

c0

tuning constant for the Huber function. Default is c0=1.345. Setting c0=0 would yield QC. This parameter is ignored if psi="sign".

computePmd

logical indicating whether to compute partial Mahalanobis distances (pmd) and adjusted pmd. Default is TRUE.
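
To make the roles of psi and c0 concrete, here is a small base-R sketch of the two loss functions. The helper name `huber_psi` and the vector `z` are hypothetical and are not part of GSE. The last line shows one way to read the remark that c0=0 corresponds to QC: after rescaling by c0, the clipped values approach sign(z) as c0 shrinks, and a correlation does not change under such rescaling.

```r
# Huber's psi: clip values to the interval [-c0, c0].
# huber_psi is a hypothetical helper, not a GSE function.
huber_psi <- function(x, c0 = 1.345) pmin(pmax(x, -c0), c0)

z <- c(-3, -1, 0, 0.5, 2)       # pretend these are already standardized

huber_psi(z)                    # -1.345 -1.000  0.000  0.500  1.345
sign(z)                         # -1 -1  0  1  1

# With a tiny c0, the clipped values divided by c0 equal sign(z)
huber_psi(z, c0 = 1e-8) / 1e-8  # -1 -1  0  1  1
```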

Author

Andy Leung andy.leung@stat.ubc.ca

Details

As described in Alqallaf et al. (2002), this estimator requires a robust scale estimate and a location M-estimate, which are used to standardize the data before a loss function is applied to bound the influence of outliers. To save computation time, this function currently uses the median as the location estimate and the MADN (normalized MAD) as the scale estimate. With psi="sign" the loss function is the sign function, which gives QC. With psi="huber", the default in the Usage signature, it is Huber's function \(\psi_c(x) = \min(\max(-c, x), c)\), \(c > 0\), with default \(c = 1.345\). The function does not adjust for intrinsic bias as described in Alqallaf et al. (2002). Missing values are replaced by the corresponding column's median.
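
The steps described above can be sketched in base R as follows. This is a minimal illustration, not the GSE source code. The function name `pairwise_scatter_sketch` is hypothetical, and the final assembly (Pearson correlation of the transformed data, rescaled by the MADN of each column) is an assumption about how the pieces fit together. Like the function documented here, the sketch applies no intrinsic-bias correction.

```r
# Illustrative sketch only; NOT the GSE implementation.
pairwise_scatter_sketch <- function(x, psi = c("huber", "sign"), c0 = 1.345) {
  psi <- match.arg(psi)
  x <- as.matrix(x)

  med  <- apply(x, 2, median, na.rm = TRUE)  # robust location
  madn <- apply(x, 2, mad, na.rm = TRUE)     # stats::mad() scales by 1.4826 by default

  # Replace missing values by the corresponding column's median
  for (j in seq_len(ncol(x))) x[is.na(x[, j]), j] <- med[j]

  # Standardize each column, then apply the chosen loss function
  z <- sweep(sweep(x, 2, med, "-"), 2, madn, "/")
  y <- if (psi == "sign") sign(z) else pmin(pmax(z, -c0), c0)

  # Assumption: correlation of the transformed data, rescaled by MADN
  R <- cor(y)
  S <- R * outer(madn, madn)
  list(mu = med, S = S, R = R)
}

set.seed(42)                     # arbitrary seed, synthetic data
x <- matrix(rnorm(300), ncol = 3)
x[1, ] <- c(50, -50, 50)         # one gross outlier
x[2:6, 1] <- NA                  # a few missing values
fit <- pairwise_scatter_sketch(x, psi = "huber")
fit$S
```

Because every standardized value is clipped to [-c0, c0] before the correlation is computed, the single outlying row cannot dominate the estimate the way it would in a classical covariance matrix.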

References

Alqallaf, F. A., Konis, K. P., Martin, R. D., and Zamar, R. H. (2002). Scalable Robust Covariance and Correlation Estimates for Data Mining. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton.