traj (version 2.2.1)

Step3Clusters: Classify the Longitudinal Data Based on the Selected Measures.

Description

Classifies the trajectories by applying the k-medoids or k-means algorithm to the measures selected by Step2Selection.

Usage

Step3Clusters(
  trajSelection,
  algorithm = "k-medoids",
  metric = "euclidean",
  nstart = 200,
  iter.max = 100,
  nclusters = NULL,
  criterion = "Calinski-Harabasz",
  K.max = min(ceiling(sqrt(nrow(trajSelection$selection))), 10),
  B = 500
)

# S3 method for trajClusters
print(x, ...)

# S3 method for trajClusters
summary(object, ...)

Value

An object of class trajClusters; a list containing the result of the clustering, as well as a curated form of the arguments.

Arguments

trajSelection

object of class trajSelection as returned by Step2Selection.

algorithm

either "k-medoids" or "k-means". Determines the clustering algorithm to use. Defaults to "k-medoids".

metric

to be passed to the metric argument of pam if "k-medoids" is the chosen algorithm. Defaults to "euclidean".

nstart

to be passed to the nstart argument of kmeans if "k-means" is the chosen algorithm. Defaults to 200.

iter.max

to be passed to the iter.max argument of kmeans if "k-means" is the chosen algorithm. Defaults to 100.

nclusters

either NULL or the desired number of clusters. If NULL, the number of clusters is determined using the criterion chosen in criterion. Defaults to NULL.

criterion

criterion to determine the optimal number of clusters if nclusters is NULL. Either "GAP" or "Calinski-Harabasz". Defaults to "Calinski-Harabasz".

K.max

maximum number of clusters to be considered if nclusters is set to NULL. Defaults to min(ceiling(sqrt(n)), 10), where n is the number of rows of trajSelection$selection.

B

to be passed to the B argument of clusGap if "GAP" is the chosen criterion. Defaults to 500.

x

object of class trajClusters.

...

further arguments passed to or from other methods.

object

object of class trajClusters.
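
The following is a minimal sketch of how the arguments above map onto the underlying clustering functions, assuming that pam comes from the cluster package and kmeans from the stats package. The matrix x is a hypothetical stand-in for the selected measures, not output of Step2Selection, and any preprocessing that Step3Clusters may apply before clustering is not reproduced here.

```r
library(cluster)  # provides pam(); kmeans() comes from the stats package

set.seed(1)
# Hypothetical stand-in for the selected measures: 40 trajectories, 3 measures.
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 3),
           matrix(rnorm(60, mean = 4), ncol = 3))

# Default upper bound on the number of clusters, as written in Usage.
K.max <- min(ceiling(sqrt(nrow(x))), 10)

# algorithm = "k-medoids": metric is forwarded to pam().
fit.pam <- pam(x, k = 2, metric = "euclidean")

# algorithm = "k-means": nstart and iter.max are forwarded to kmeans().
fit.km <- kmeans(x, centers = 2, nstart = 200, iter.max = 100)

# Cross-tabulate the two partitions to see how far they agree.
print(table(pam = fit.pam$clustering, kmeans = fit.km$cluster))
```

With 40 rows, the default K.max evaluates to min(ceiling(sqrt(40)), 10), that is, 7.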

Details

If "GAP" is the chosen criterion for determining the optimal number of clusters, the Gap statistic method described by Tibshirani et al. (2001) is applied through the clusGap function.
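
As an illustration of the Gap statistic approach, here is a minimal sketch that calls clusGap from the cluster package directly on synthetic data. The wrapper pam.fun, the data x, the reduced value of B and the selection rule passed to maxSE are all assumptions made for this example; the rule that Step3Clusters applies to the clusGap output is not stated on this page.

```r
library(cluster)  # provides pam(), clusGap() and maxSE()

set.seed(1)
# Hypothetical stand-in for the selected measures: 40 trajectories, 3 measures.
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 3),
           matrix(rnorm(60, mean = 4), ncol = 3))

# clusGap() expects a function(x, k) returning a list with a "cluster" component.
pam.fun <- function(x, k) list(cluster = pam(x, k, cluster.only = TRUE))

# B is the number of bootstrap reference samples; kept small here for speed.
gap <- clusGap(x, FUNcluster = pam.fun, K.max = 7, B = 50, verbose = FALSE)

# One possible selection rule (an assumption): the rule of Tibshirani et al. (2001).
k.hat <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"], method = "Tibs2001SEmax")
print(k.hat)
```

Larger values of B give a more stable estimate of the reference distribution at the cost of a longer run time.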

If "Calinski-Harabasz" is the chosen criterion, the Calinski-Harabasz index is computed for each candidate number of clusters from 2 to K.max, and the optimal number of clusters is the one that maximizes the index.
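
The Calinski-Harabasz selection can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's own code: it uses kmeans on a synthetic matrix x, the helper ch.index is hypothetical, and the index is taken in its usual form, (between-cluster sum of squares / (k - 1)) divided by (within-cluster sum of squares / (n - k)).

```r
set.seed(1)
# Hypothetical stand-in for the selected measures: 40 trajectories, 3 measures.
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 3),
           matrix(rnorm(60, mean = 4), ncol = 3))

# Hypothetical helper: Calinski-Harabasz index of a k-means partition into k clusters.
ch.index <- function(x, k) {
  n <- nrow(x)
  fit <- kmeans(x, centers = k, nstart = 20, iter.max = 100)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

# Candidate numbers of clusters, from 2 to the default K.max.
K.max <- min(ceiling(sqrt(nrow(x))), 10)
ks <- 2:K.max

# Evaluate the index for every candidate and keep the maximizer.
ch <- vapply(ks, function(k) ch.index(x, k), numeric(1))
k.opt <- ks[which.max(ch)]
print(data.frame(k = ks, CH = ch))
print(k.opt)
```

The index is undefined for a single cluster because of the division by k - 1, which is why the search starts at 2.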

References

Tibshirani, R., Walther, G. and Hastie, T. (2001). Estimating the number of data clusters via the Gap statistic. Journal of the Royal Statistical Society B, 63, 411–423.

Tibshirani, R., Walther, G. and Hastie, T. (2000). Estimating the number of clusters in a dataset via the Gap statistic. Technical Report. Stanford.

See Also

Step2Selection

Examples

if (FALSE) {
data("trajdata")
# Remove the Group column
trajdata.noGrp <- trajdata[, -which(colnames(trajdata) == "Group")]

# Compute the 18 measures, then let Step2Selection choose among them
m <- Step1Measures(trajdata.noGrp, ID = TRUE, measures = 1:18)
s <- Step2Selection(m)

# Inspect the loadings
s$RC$loadings

# Select measures 10, 12, 8 and 4 manually
s2 <- Step2Selection(m, select = c(10, 12, 8, 4))

# Partition the trajectories into 3, 4 and 5 clusters
c3.part <- Step3Clusters(s2, nclusters = 3)$partition
c4.part <- Step3Clusters(s2, nclusters = 4)$partition
c5.part <- Step3Clusters(s2, nclusters = 5)$partition

}