dynamicTreeCut (version 1.63-1)

cutreeDynamic: Adaptive Branch Pruning of Hierarchical Clustering Dendrograms

Description

This wrapper provides a common access point for two methods of adaptive branch pruning of hierarchical clustering dendrograms.

Usage

cutreeDynamic(
      dendro, cutHeight = NULL, minClusterSize = 20,

      # Basic tree cut options
      method = "hybrid", distM = NULL,
      deepSplit = (ifelse(method=="hybrid", 1, FALSE)),

      # Advanced options
      maxCoreScatter = NULL, minGap = NULL,
      maxAbsCoreScatter = NULL, minAbsGap = NULL,

      minSplitHeight = NULL, minAbsSplitHeight = NULL,

      # External (user-supplied) measure of branch split
      externalBranchSplitFnc = NULL, minExternalSplit = NULL,
      externalSplitOptions = list(),
      externalSplitFncNeedsDistance = NULL,
      assumeSimpleExternalSpecification = TRUE,

      # PAM stage options
      pamStage = TRUE, pamRespectsDendro = TRUE,
      useMedoids = FALSE, maxDistToLabel = NULL,
      maxPamDist = cutHeight,
      respectSmallClusters = TRUE,

      # Various options
      verbose = 2, indent = 0)

Arguments

dendro
A hierarchical clustering dendrogram such as one returned by hclust.
cutHeight
Maximum joining heights that will be considered. For method=="tree" it defaults to 0.99. For method=="hybrid" it defaults to 99% of the range between the 5th percentile and the maximum of the joining heights on the dendrogram.
minClusterSize
Minimum cluster size.
method
Chooses the method to use. Recognized values are "hybrid" and "tree".
distM
Only used for method "hybrid". The distance matrix used as input to hclust. If not given and method == "hybrid", the function will issue a warning and default to method = "tree".
deepSplit
For method "hybrid", can be either logical or integer in the range 0 to 4. For method "tree", must be logical. In both cases, provides a rough control over sensitivity to cluster splitting. The higher the value (or if TRUE), the more and smal
maxCoreScatter
Only used for method "hybrid". Maximum scatter of the core for a branch to be a cluster, given as the fraction of cutHeight relative to the 5th percentile of joining heights. See Details.
minGap
Only used for method "hybrid". Minimum cluster gap given as the fraction of the difference between cutHeight and the 5th percentile of joining heights.
maxAbsCoreScatter
Only used for method "hybrid". Maximum scatter of the core for a branch to be a cluster given as absolute heights. If given, overrides maxCoreScatter.
minAbsGap
Only used for method "hybrid". Minimum cluster gap given as absolute height difference. If given, overrides minGap.
minSplitHeight
Minimum split height given as the fraction of the difference between cutHeight and the 5th percentile of joining heights. Branches merging below this height will automatically be merged. Defaults to zero but is used only if minAbsSplitH
minAbsSplitHeight
Minimum split height given as an absolute height. Branches merging below this height will automatically be merged. If not given (default), will be determined from minSplitHeight above.
externalBranchSplitFnc
Optional function to evaluate split (dissimilarity) between two branches. Either a single function or a list in which each component is a function (see assumeSimpleExternalSpecification below for how to specify a single function). Each functi
minExternalSplit
Thresholds to decide whether two branches should be merged. It should be a numeric vector of the same length as the number of functions in externalBranchSplitFnc above. Only used for method "hybrid".
externalSplitOptions
Further arguments to function externalBranchSplitFnc. If only one external function is specified in externalBranchSplitFnc above, externalSplitOptions can be a named list of arguments or a list with one component th
externalSplitFncNeedsDistance
Optional specification of whether the external branch split functions need the distance matrix as one of their arguments. Either NULL or a logical vector with one element per branch split function that specifies whether the corresponding bra
assumeSimpleExternalSpecification
Logical: when minExternalSplit above is a scalar (has length 1), should the function assume a simple specification of externalBranchSplitFnc and externalSplitOptions? If TRUE, externalBranchSplitFn
pamStage
Only used for method "hybrid". If TRUE, the second (PAM-like) stage will be performed.
pamRespectsDendro
Logical, only used for method "hybrid". If TRUE, the PAM stage will respect the dendrogram in the sense that objects and small clusters will only be assigned to clusters that belong to the same branch that the objects or small clusters being
useMedoids
Only used for method "hybrid" and only if labelUnlabeled==TRUE. If TRUE, the second stage will use object-to-medoid distance; if FALSE, it will use average object-to-cluster distance. The default (FALSE) is recommended.
maxDistToLabel
Deprecated, use maxPamDist instead. Only used for method "hybrid" and only if labelUnlabeled==TRUE. Maximum object distance to closest cluster that will result in the object assigned to that cluster.
maxPamDist
Only used for method "hybrid" and only if labelUnlabeled==TRUE. Maximum object distance to closest cluster that will result in the object assigned to that cluster. Defaults to cutHeight.
respectSmallClusters
Only used for method "hybrid" and only if labelUnlabeled==TRUE. If TRUE, branches that failed to be clusters in stage 1 only because of insufficient size will be assigned together in stage 2. If FALSE, all objects will be assigned individuall
verbose
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity.
indent
Controls indentation of printed messages (see verbose above). Each unit adds two spaces before printed messages; useful when several functions' output is to be nested.
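
The effect of deepSplit under method "hybrid" can be explored empirically. The following is a minimal sketch, assuming the dynamicTreeCut package is installed; the random toy data, the average-linkage dendrogram and minClusterSize = 5 are illustrative assumptions, not recommendations. It counts the clusters detected at each integer deepSplit value from 0 to 4.

```r
library(dynamicTreeCut)

set.seed(42)
# Toy data (assumption): 100 objects with 4 random features
x <- matrix(rnorm(400), ncol = 4)
d <- dist(x)
dendro <- hclust(d, method = "average")

# Run the hybrid method once per deepSplit value and count non-zero labels
nClusters <- sapply(0:4, function(ds) {
  labels <- cutreeDynamic(
    dendro, distM = as.matrix(d),
    method = "hybrid", deepSplit = ds,
    minClusterSize = 5, verbose = 0
  )
  # Label 0 marks unassigned objects, so it is excluded from the count
  length(setdiff(unique(labels), 0))
})
names(nClusters) <- paste0("deepSplit=", 0:4)
nClusters
```

The counts depend entirely on the data, so no particular output is implied here.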

Value

  • A vector of numerical labels giving the assignment of objects to modules. Unassigned objects are labeled 0, the largest module has label 1, the next largest 2, and so on.
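
As an illustration of the returned labels, here is a minimal sketch, assuming the dynamicTreeCut package is installed. The three-group toy data, the average-linkage dendrogram and minClusterSize = 10 are assumptions chosen only to keep the example small.

```r
library(dynamicTreeCut)

set.seed(1)
# Toy data (assumption): three separated groups of 30 points in 2 dimensions
x <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 6), ncol = 2),
  matrix(rnorm(60, mean = 12), ncol = 2)
)
d <- dist(x)
dendro <- hclust(d, method = "average")

# The hybrid method needs the distance matrix that was used to build the dendrogram
labels <- cutreeDynamic(
  dendro, distM = as.matrix(d),
  method = "hybrid", minClusterSize = 10,
  verbose = 0
)

# Cluster sizes; label 0 (if present) collects the unassigned objects
table(labels)
unassigned <- which(labels == 0)
```

The labels are in the same order as the rows of x, so they can be attached to the data directly, for example with cbind(x, labels).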

Details

This is a wrapper for two related but different methods for cluster detection in hierarchical clustering dendrograms.

In order to make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
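
The conversion from relative to absolute thresholds can be reproduced with base R alone. This is a sketch of the arithmetic described above, not the package's internal code: the toy data and the values 0.7 and 0.2 are hypothetical, the cutHeight line follows the "hybrid" default as worded under Arguments, and quantile() with its default settings may differ slightly from how the package computes the percentile.

```r
set.seed(1)
# Toy dendrogram (assumption): 100 random points, average linkage
x <- matrix(rnorm(200), ncol = 2)
dendro <- hclust(dist(x), method = "average")

# refHeight: 5th percentile of the merging heights
refHeight <- quantile(dendro$height, 0.05, names = FALSE)

# cutHeight: 99% of the range between refHeight and the maximum merging height
cutHeight <- 0.99 * (max(dendro$height) - refHeight) + refHeight

# Hypothetical relative shape parameters, for illustration only
maxCoreScatter <- 0.7
minGap <- 0.2

# Absolute thresholds, following the formulas in the text
maxAbsCoreScatter <- maxCoreScatter * (cutHeight - refHeight) + refHeight
minAbsGap <- minGap * (cutHeight - refHeight)

c(refHeight = refHeight, cutHeight = cutHeight,
  maxAbsCoreScatter = maxAbsCoreScatter, minAbsGap = minAbsGap)
```

Because both thresholds scale with cutHeight - refHeight, the same relative values give comparable behavior on dendrograms whose merging heights lie on different scales.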

References

Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting

See Also

hclust, cutreeHybrid, cutreeDynamicTree.