REPPlab (version 0.9.6)

EPPlabOutlier: Function to Find Outliers for an epplab Object

Description

Function to decide whether observations are considered outliers in specific projection directions of an epplab object.

Usage

EPPlabOutlier(x, which = 1:ncol(x$PPdir), k = 3, location = mean, scale = sd)

Value

A list with class 'epplabOutlier' containing the following components:

outlier

A matrix with only zeros and ones. A value of 1 classifies the observation as an outlier in this projection direction.

k

The factor k used.

location

The name of the location estimator used.

scale

The name of the scale estimator used.

PPindex

The name of the PPindex used.

PPalg

The name of the PPalg used.

Arguments

x

An object of class epplab.

which

The directions in which outliers should be searched. The default is to search all directions.

k

Numeric factor that determines when an observation is considered an outlier. Default is 3. See details.

location

A function which gives the univariate location as an output. The default is mean.

scale

A function which gives the univariate scale as an output. The default is sd.

Author

Klaus Nordhausen

Details

Let \(location_j\) denote the location of the data projected onto the jth projection direction and, analogously, \(scale_j\) its scale. An observation with projected value \(x\) is then an outlier in the jth projection direction if \(|x - location_j| \geq k \cdot scale_j\).

Robust location and scale measures, such as median and mad, are usually preferable for this purpose, because the defaults mean and sd are themselves influenced by the outliers.
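The rule above can be sketched in base R for a single direction. This is only an illustration under stated assumptions, not the package's implementation: the helper `flag_outliers` and the vector `proj` are hypothetical names, and `proj` stands in for the data projected onto one direction.

```r
# Illustrative sketch of the rule |x - location_j| >= k * scale_j
# for ONE projection direction, using base R only.
# 'flag_outliers' and 'proj' are hypothetical, not part of REPPlab.

# Hypothetical projected scores: 50 evenly spaced values plus one extreme value.
proj <- c(seq(-2, 2, length.out = 50), 9)

flag_outliers <- function(z, k = 3, location = mean, scale = sd) {
  # 1 marks an outlier, 0 a regular observation, mirroring the
  # zero/one coding described for the 'outlier' component.
  as.integer(abs(z - location(z)) >= k * scale(z))
}

# Non-robust choice (the documented defaults: mean and sd)
flag_mean_sd <- flag_outliers(proj)

# Robust choice (median and mad)
flag_median_mad <- flag_outliers(proj, location = median, scale = mad)

# Positions of the flagged values under each choice
which(flag_mean_sd == 1)
which(flag_median_mad == 1)
```

Note that `mad` in R multiplies the raw median absolute deviation by 1.4826 by default, so that it is comparable to `sd` for normally distributed data. Any function that maps a numeric vector to a single number can be passed in the same way.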

References

Ruiz-Gazen, A., Larabi Marie-Sainte, S. and Berro, A. (2010), Detecting multivariate outliers using projection pursuit with particle swarm optimization, COMPSTAT2010, pp. 89-98.

See Also

EPPlab

Examples

library(REPPlab)

# create data with 3 outliers
n <- 300
p <- 10
X <- matrix(rnorm(n * p), ncol = p)
X[1, 1] <- 9
X[2, 4] <- 7
X[3, 6] <- 8
# give the data row names; obs.1, obs.2 and obs.3 are the outliers
rownames(X) <- paste("obs", 1:n, sep = ".")

# run projection pursuit, then flag outliers using robust location and scale
PP <- EPPlab(X, PPalg = "PSO", PPindex = "KurtosisMax", n.simu = 20, maxiter = 20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
OUT
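As a rough sketch of how a zero/one outlier matrix like the documented `outlier` component could be summarised, the following uses a small hand-made matrix rather than a real result. It assumes rows correspond to observations and columns to projection directions; that orientation, the matrix `out` and the column names `Run1` to `Run3` are assumptions made for illustration only.

```r
# Hypothetical stand-in for a zero/one outlier matrix
# (assumed layout: rows = observations, columns = projection directions).
out <- matrix(0, nrow = 5, ncol = 3,
              dimnames = list(paste("obs", 1:5, sep = "."),
                              paste0("Run", 1:3)))
out["obs.1", c(1, 3)] <- 1   # flagged in two directions
out["obs.4", 2] <- 1         # flagged in one direction

# Number of directions in which each observation is flagged
n_flagged <- rowSums(out)

# Names of observations flagged in at least one direction
flagged <- names(n_flagged)[n_flagged > 0]
flagged
```

Under the same assumption about the layout, the same two steps could be applied to `OUT$outlier` from the example above.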