FPDC: Factor probabilistic distance clustering

Description

An implementation of FPDC, a probabilistic factor clustering algorithm that combines a linear transformation of the variables with a clustering that optimizes the PD-clustering criterion.

Usage

FPDC(data = NULL, k = 2, nf = 2, nu = 2)

Value

A list of class FPDclustering with the following components:

label: A vector of integers indicating the cluster membership for each unit
centers: A matrix of cluster centers
probability: A matrix giving the probability that each point belongs to each cluster
JDF: The value of the Joint distance function
iter: The number of iterations
explained: The explained variability
data: The data set
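
To make the label, probability, and JDF components more concrete, here is a minimal base R sketch of how such quantities relate to one another. It is not the FPDclustering implementation: it assumes the usual PD-clustering rule, in which the probability of a point belonging to a cluster is inversely proportional to its distance from that cluster's center, and it uses hypothetical fixed centers and toy data with no factorial transformation at all.

```r
# Illustrative sketch only, not the FPDclustering internals.
# Assumption: p_ik is proportional to 1 / d_ik, so p_ik * d_ik is the
# same for every cluster k of a given point i.
set.seed(1)

# Toy data: 6 points in 2 dimensions and 2 hypothetical, fixed centers.
x <- rbind(matrix(rnorm(6, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(6, mean = 4, sd = 0.5), ncol = 2))
centers <- rbind(c(0, 0), c(4, 4))

# n x k matrix of Euclidean distances from each point to each center.
# (A point lying exactly on a center would give a zero distance and
# needs special handling; that case is ignored here.)
d <- sapply(seq_len(nrow(centers)), function(j) {
  sqrt(rowSums(sweep(x, 2, centers[j, ])^2))
})

# Membership probabilities: inverse distances normalized within each row.
inv_d <- 1 / d
probability <- inv_d / rowSums(inv_d)

# Hard labels: the most probable cluster for each unit.
label <- max.col(probability, ties.method = "first")

# Joint distance function under the same assumption.
JDF <- sum(d * probability^2)
```

Each row of probability sums to 1, and label picks the column with the largest probability. The objects returned by FPDC have the same layout, but their values come from distances computed after the factorial transformation.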

Arguments

data: A matrix or data frame such that rows correspond to observations and columns correspond to variables.
k: A numerical parameter giving the number of clusters (default 2)
nf: A numerical parameter giving the number of factors for variables (default 2)
nu: A numerical parameter giving the number of factors for units (default 2)

Author

Cristina Tortora and Paul D. McNicholas

References

Tortora, C., M. Gettler Summa, M. Marino, and F. Palumbo. Factor probabilistic distance clustering (FPDC): a new clustering method for high dimensional data sets. Advances in Data Analysis and Classification, 10(4), 441-464, 2016. doi:10.1007/s11634-015-0219-5.

Tortora, C., M. Gettler Summa, and F. Palumbo. Factor PD-clustering. In Lausen et al., editors, Algorithms from and for Nature and Life, Studies in Classification, Data Analysis, and Knowledge Organization, 115-123, 2013. doi:10.1007/978-3-319-00035-0_11.

Tortora, C. Non-hierarchical clustering methods on factorial subspaces, 2012.

Examples

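if (FALSE) {
# Asymmetric data set clustering example (with shape 3).
library(FPDclustering)
data('asymmetric3')
# Drop the first column, which is used as the reference grouping below
x <- asymmetric3[, -1]

# Clustering: 4 clusters, 3 factors for variables, 3 factors for units
fpdas3 <- FPDC(x, k = 4, nf = 3, nu = 3)

# Results: cross-tabulate the reference grouping against the estimated labels
table(asymmetric3[, 1], fpdas3$label)
Silh(fpdas3$probability)
summary(fpdas3)
plot(fpdas3)
}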

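if (FALSE) {
# Asymmetric data set clustering example (with shape 20).
data('asymmetric20')
# Drop the first column, which is used as the reference grouping below
x <- asymmetric20[, -1]

# Clustering: 4 clusters, 3 factors for variables, 3 factors for units
fpdas20 <- FPDC(x, k = 4, nf = 3, nu = 3)

# Results: cross-tabulate the reference grouping against the estimated labels
table(asymmetric20[, 1], fpdas20$label)
Silh(fpdas20$probability)
summary(fpdas20)
plot(fpdas20)
}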

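if (FALSE) {
# Clustering example with outliers.
data('outliers')
# Drop the first column, which is used as the reference grouping below
x <- outliers[, -1]

# Clustering: 4 clusters, 5 factors for variables, 4 factors for units
fpdout <- FPDC(x, k = 4, nf = 5, nu = 4)

# Results: cross-tabulate the reference grouping against the estimated labels
table(outliers[, 1], fpdout$label)
Silh(fpdout$probability)
summary(fpdout)
plot(fpdout)
}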
