HuberPairwise: Quadrant Covariance and Huberized Pairwise Scatter

Description

Computes the Quadrant Covariance (QC) or Huberized Pairwise Scatter as described in Alqallaf et al. (2002).

Usage

HuberPairwise( x, psi=c("huber","sign"), c0=1.345, computePmd=TRUE)

Value

An S4 object of class HuberPairwise-class which is a subclass of the virtual class CovRobMiss-class. The output S4 object contains the following slots:

`mu`	Estimated location. Can be accessed via `getLocation`.
`S`	Estimated scatter matrix. Can be accessed via `getScatter`.
`pmd`	Squared partial Mahalanobis distances. Can be accessed via `getDist`.
`pmd.adj`	Adjusted squared partial Mahalanobis distances. Can be accessed via `getDistAdj`.
`pu`	Dimension of the observed entries for each case. Can be accessed via `getDim`.
`R`	Estimated correlation matrix. Not meant to be accessed.
`call`	Object of class `"language"`. Not meant to be accessed.
`x`	Input data matrix. Not meant to be accessed.
`p`	Column dimension of input data matrix. Not meant to be accessed.
`estimator`	Character string of the name of the estimator used. Not meant to be accessed.

Arguments

x: a matrix or data frame. May contain missing values, but cannot contain columns with completely missing entries.
psi: loss function to be used in computing pairwise scatter. Either "huber" (the default) or "sign". Setting psi="sign" yields QC.
c0: tuning constant for the Huber function. c0=0 would yield QC. Default is c0=1.345. This parameter is ignored if psi="sign".
computePmd: logical indicating whether to compute partial Mahalanobis distances (pmd) and adjusted pmd. Default is TRUE.
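
The arguments above can be exercised with a short call such as the following. This is only a sketch: it assumes that `HuberPairwise` and the accessors `getLocation`, `getScatter` and `getDist` are provided by the GSE package (the package name is not stated on this page), and the simulated data are purely illustrative.

```r
# Simulated data with a few outliers and missing cells (illustrative only).
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)
x[1:5, 3] <- 50        # gross outliers in column 3
x[6:15, 1] <- NA       # missing cells; no row or column is fully missing
x[16:25, 2] <- NA

# Assumption: the estimator lives in the GSE package.
have_gse <- requireNamespace("GSE", quietly = TRUE)
if (have_gse) {
  library(GSE)
  # Huberized pairwise scatter with the default tuning constant
  fit_huber <- HuberPairwise(x, psi = "huber", c0 = 1.345)
  # Quadrant Covariance; c0 plays no role here
  fit_qc <- HuberPairwise(x, psi = "sign")

  print(getLocation(fit_huber))   # slot mu
  print(getScatter(fit_huber))    # slot S
  print(head(getDist(fit_huber))) # slot pmd, available since computePmd = TRUE
}
```

Setting computePmd=FALSE may be worth considering when only the location and scatter estimates are needed.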

Author

Andy Leung andy.leung@stat.ubc.ca

Details

As described in Alqallaf et al. (2002), this estimator requires a robust scale estimate and a location M-estimate, which are used to transform the data through a loss function so that the transformed data are free of outliers. To save computation time, this function currently uses the MADN (normalized MAD) as the robust scale estimate and the median as the robust location estimate. As shown under Usage, the default loss function psi is the Huber function \(\psi_c(x) = \min(\max(-c, x), c)\), \(c > 0\), with default \(c = 1.345\); setting psi="sign" replaces it with the sign function and yields QC. The function does not adjust for the intrinsic bias described in Alqallaf et al. (2002). Missing values are replaced by the corresponding column's median.
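
The steps in this section (median and MADN, median imputation, transformation through psi, pairwise scatter) can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: `huber_psi` and `pairwise_sketch` are hypothetical names, and the use of the Pearson correlation of the transformed columns, rescaled by the MADNs, is an assumption about how the pairwise scatter is assembled.

```r
# Huber psi: clip values to the interval [-c0, c0].
huber_psi <- function(z, c0 = 1.345) pmin(pmax(-c0, z), c0)

# Hypothetical helper illustrating the pipeline described above.
pairwise_sketch <- function(x, psi = c("huber", "sign"), c0 = 1.345) {
  psi <- match.arg(psi)
  x <- as.matrix(x)

  # Robust location (median) and scale (MADN) per column.
  # mad() uses the constant 1.4826 by default, i.e. the normalized MAD.
  med  <- apply(x, 2, median, na.rm = TRUE)
  madn <- apply(x, 2, mad, na.rm = TRUE)

  # Replace missing values by the corresponding column's median.
  for (j in seq_len(ncol(x))) x[is.na(x[, j]), j] <- med[j]

  # Standardize, then bound each value with the chosen psi.
  z <- sweep(sweep(x, 2, med, "-"), 2, madn, "/")
  y <- if (psi == "sign") sign(z) else huber_psi(z, c0)

  # Assumption: correlation of the transformed data, rescaled by MADNs.
  R <- cor(y)
  S <- R * tcrossprod(madn)
  list(mu = med, S = S, R = R)
}

# Example on simulated data with one gross outlier and a missing cell.
set.seed(42)
demo <- matrix(rnorm(300), ncol = 3)
demo[1, 1] <- 100
demo[2, 2] <- NA
sketch_fit <- pairwise_sketch(demo, psi = "huber")
```

Because each value is clipped before the correlation is computed, the single outlier in `demo` has only bounded influence on `sketch_fit$S`. Like the function documented here, the sketch makes no correction for intrinsic bias.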

References

Alqallaf, F. A., Konis, K. P., Martin, R. D., and Zamar, R. H. (2002). Scalable Robust Covariance and Correlation Estimates for Data Mining. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton.