Hierarchical Clustering using methylation data
The function clusters samples using the hclust function and
various distance metrics derived from percent methylation per base or per
region for each sample.
clusterSamples(.Object, dist="correlation", method="ward",
sd.filter=TRUE, sd.threshold=0.5,
filterByQuantile=TRUE, plot=TRUE, chunk.size)
# S4 method for methylBase
clusterSamples(.Object, dist, method, sd.filter,
sd.threshold, filterByQuantile, plot)
# S4 method for methylBaseDB
clusterSamples(.Object, dist = "correlation",
method = "ward", sd.filter = TRUE, sd.threshold = 0.5,
filterByQuantile = TRUE, plot = TRUE, chunk.size = 1e+06)
.Object: a methylBase or methylBaseDB object.
dist: the distance measure to be used. This must be one of
"correlation", "euclidean", "maximum",
"manhattan", "canberra", "binary" or "minkowski".
Any unambiguous abbreviation can be given. (default: "correlation")
method: the agglomeration method to be used. This should be
(an unambiguous abbreviation of) one of "ward", "single",
"complete", "average", "mcquitty", "median"
or "centroid". (default: "ward")
sd.filter: if TRUE, the bases/regions with low variation will be
discarded prior to clustering. (default: TRUE)
sd.threshold: a numeric value. If filterByQuantile is TRUE,
features whose standard deviation is less than the quantile denoted by
sd.threshold will be removed.
If filterByQuantile is FALSE, then features whose
standard deviation is less than the value of sd.threshold
will be removed. (default: 0.5)
filterByQuantile: a logical value determining whether sd.threshold is to
be interpreted as a quantile of all standard deviation values from
bases/regions (the default), or as an absolute value.
plot: a logical value indicating whether to plot the hierarchical clustering. (default: TRUE)
chunk.size: number of rows to be taken as a chunk when processing methylBaseDB objects. (default: 1e6)
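To make the interplay of sd.filter, sd.threshold, filterByQuantile, dist and method more concrete, the following base R sketch walks through the same general steps on a simulated percent-methylation matrix. It is an illustration under assumptions, not methylKit's internal code: the matrix `pm`, the helper `filter_by_sd`, and the choice of 1 minus the Pearson correlation as the "correlation" distance are hypothetical and exist only for this example.

```r
set.seed(1)

# Simulated percent methylation: 200 bases (rows) x 4 samples (columns).
# Samples within a group share a per-base baseline; all values are made up.
base1 <- runif(200, 0, 100)
base2 <- runif(200, 0, 100)
noise <- function() rnorm(200, sd = 2)
pm <- cbind(test1 = base1 + noise(), test2 = base1 + noise(),
            ctrl1 = base2 + noise(), ctrl2 = base2 + noise())
pm <- pmin(pmax(pm, 0), 100)  # keep values inside the 0-100 range

# Hypothetical helper: drop rows whose standard deviation is below the cutoff.
# The cutoff is either a quantile of all row SDs or an absolute value.
filter_by_sd <- function(mat, sd.threshold = 0.5, filterByQuantile = TRUE) {
  sds <- apply(mat, 1, sd)
  cutoff <- if (filterByQuantile) quantile(sds, sd.threshold) else sd.threshold
  mat[sds >= cutoff, , drop = FALSE]
}

kept <- filter_by_sd(pm, sd.threshold = 0.5, filterByQuantile = TRUE)

# Assumed form of the "correlation" distance: 1 - Pearson correlation
# between samples (cor() correlates the columns of its input).
d <- as.dist(1 - cor(kept))

# Base R (>= 3.1.0) calls the original Ward method "ward.D".
hc <- hclust(d, method = "ward.D")
groups <- cutree(hc, k = 2)
```

With filterByQuantile = TRUE and sd.threshold = 0.5, roughly the less variable half of the rows is discarded before the distances are computed. Setting filterByQuantile = FALSE would instead compare each row's standard deviation directly against the value 0.5.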
a tree object of a hierarchical cluster analysis using a set
of dissimilarities for the n objects being clustered.
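Assuming the returned tree is a standard hclust object, as the description suggests, it can be inspected and post-processed with the usual base R tools. The sketch below builds such a tree from a small, made-up dissimilarity matrix (the sample names and values are hypothetical) and shows a few common follow-up steps.

```r
# Hypothetical dissimilarities between four samples (values are made up).
samples <- c("test1", "test2", "ctrl1", "ctrl2")
m <- matrix(c(0.00, 0.02, 0.90, 0.95,
              0.02, 0.00, 0.92, 0.91,
              0.90, 0.92, 0.00, 0.03,
              0.95, 0.91, 0.03, 0.00),
            nrow = 4, dimnames = list(samples, samples))

# Stand-in for the tree returned by a clustering call.
hc <- hclust(as.dist(m), method = "average")

hc$labels                    # sample names in input order
hc$height                    # height at which each merge happened
groups <- cutree(hc, k = 2)  # assign each sample to one of two clusters
dend <- as.dendrogram(hc)    # convert for dendrogram-based plotting
```

Calling `plot(hc)` on such an object draws the dendrogram, which is useful when the tree was computed with plot=FALSE and needs to be drawn later.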
The parameter chunk.size is only used when working with
methylBaseDB objects,
as they are read in chunk by chunk to enable processing of large
objects that are stored as flat-file databases.
By default, chunk.size is set to 1M rows, which should work for
most systems. If you encounter memory problems, decrease
chunk.size; if you have a large amount of memory available, you can
increase it.
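The general idea behind chunk-wise processing can be sketched in base R as follows. This is a generic illustration, not the way methylKit reads its database files: the temporary tab-separated file, the helper `row_sd_chunked`, and the tiny chunk size are all assumptions made for the example.

```r
set.seed(2)

# Write a small made-up table to a temporary flat file
# (a stand-in for a much larger on-disk data set).
pm <- matrix(round(runif(50 * 3, 0, 100), 2), ncol = 3)
path <- tempfile(fileext = ".txt")
write.table(pm, path, sep = "\t", row.names = FALSE, col.names = FALSE)

# Hypothetical helper: compute per-row standard deviations while holding
# at most chunk.size rows in memory at any time.
row_sd_chunked <- function(path, chunk.size) {
  con <- file(path, open = "r")
  on.exit(close(con))
  out <- numeric(0)
  repeat {
    lines <- readLines(con, n = chunk.size)  # next chunk of rows
    if (length(lines) == 0) break            # end of file reached
    chunk <- as.matrix(read.table(text = lines, sep = "\t"))
    out <- c(out, unname(apply(chunk, 1, sd)))
  }
  out
}

sds <- row_sd_chunked(path, chunk.size = 20)
```

A smaller chunk size lowers peak memory use at the cost of more read operations, while a larger one does the opposite. That is the trade-off the chunk.size argument exposes.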
data(methylKit)
clusterSamples(methylBase.obj, dist="correlation", method="ward", plot=TRUE)