PCOutlierDetection: Principal Component Outlier Detection (Intersection of all the methods applied on PCs)

Description

Takes a dataset and finds its outliers in principal component space using a combination of different methods.

Usage

PCOutlierDetection(x, k = 0.05 * nrow(x), cutoff = 0.95,
  Method = "euclidean", rnames = FALSE, depth = FALSE,
  dense = FALSE, distance = FALSE, dispersion = FALSE,
  infocut = 0.9)

Arguments

x

Dataset for which outliers are to be found

k

Number of nearest neighbours to be used for outlier detection using bootstrapping, default value is 0.05*nrow(x)

cutoff

Percentile threshold used for distance, default value is 0.95

Method

Distance method, default is "euclidean"

rnames

Logical value indicating whether the dataset has rownames, default value is FALSE

depth

Logical value indicating whether the depth-based method should be used, default is FALSE

dense

Logical value indicating whether the density-based method should be used, default is FALSE

distance

Logical value indicating whether distance-based methods should be used, default is FALSE

dispersion

Logical value indicating whether dispersion-based methods should be used, default is FALSE

infocut

Proportion of total variation used to decide the number of principal components retained in the analysis, default is 0.9
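The role of infocut can be illustrated with a minimal base R sketch. It assumes that components are kept until their cumulative share of variance reaches infocut, and that the data are centred and scaled before the decomposition; neither assumption is confirmed by this page, and the names pca, cum_var, n_pc and scores are illustrative only.

```r
# Sketch only: pick the number of principal components from a variance threshold.
# This is NOT the package's internal code; prcomp() with scaling is an assumption.
x <- iris[, -5]
infocut <- 0.9  # same default as in Usage

pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component, then the running total
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative share reaches infocut
n_pc <- which(cum_var >= infocut)[1]

# Scores on the retained components; outlier detection would operate on these
scores <- pca$x[, seq_len(n_pc), drop = FALSE]
```

A larger infocut keeps more components, so the outlier methods see more of the original variation at the cost of a higher-dimensional space.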

Value

Outlier Observations: A matrix of outlier observations

Location of Outlier: Vector of row indices (serial numbers) of the outliers

Details

PCOutlierDetection finds outlier observations in the principal component space using different methods. An observation is labelled an outlier only if every method considered flags it (the intersection of all the methods). For bivariate data, it also shows a scatterplot of the data with the outliers labelled.
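The intersection rule can be sketched in base R as follows. The two detectors used here (Euclidean and Mahalanobis distance from the centre of the scores, each thresholded at the cutoff percentile) are stand-ins chosen for illustration, and fixing the number of components at two is also an assumption; the methods the package actually applies may differ.

```r
# Sketch only: flag outliers with two stand-in methods and keep the intersection.
# Neither detector is claimed to match the package's internal methods.
x <- iris[, -5]
cutoff <- 0.95  # same default as in Usage

# Scores on the first two principal components (two is an assumption here)
scores <- prcomp(x, center = TRUE, scale. = TRUE)$x[, 1:2]
centre <- colMeans(scores)

# Stand-in method A: Euclidean distance from the centre of the scores
d_euc <- sqrt(rowSums(sweep(scores, 2, centre)^2))
flag_a <- unname(which(d_euc > quantile(d_euc, cutoff)))

# Stand-in method B: Mahalanobis distance, which accounts for dispersion
d_mah <- mahalanobis(scores, centre, cov(scores))
flag_b <- unname(which(d_mah > quantile(d_mah, cutoff)))

# An observation counts as an outlier only if both methods flag it
outlier_loc <- intersect(flag_a, flag_b)
outlier_obs <- x[outlier_loc, , drop = FALSE]
```

Because the intersection keeps only observations flagged by every method, enabling more methods can only shrink the reported set, never grow it.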

Examples

# NOT RUN {
PCOutlierDetection(iris[,-5])
# }
