Function to cluster peaks by spectral similarity. A representative spectrum is selected for each peak in the provided peak table and used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix to classify peaks by their spectral similarity.
cluster_spectra(
peak_table,
chrom_list,
peak_no = c(5, 100),
alpha = 0.95,
nboot = 1000,
plot_dend = TRUE,
plot_spectra = TRUE,
verbose = TRUE,
save = TRUE,
parallel = TRUE,
max.only = FALSE,
output = c("clusters", "pvclust", "both"),
...
)
Returns clusters and/or a pvclust object, according to the value of the output argument.
If output = "clusters", returns a list of S4 cluster objects.
If output = "pvclust", returns a pvclust object.
If output = "both", returns a nested list containing [[1]] the pvclust object and [[2]] the list of S4 cluster objects.
The cluster objects consist of the following components:
peaks: a character vector containing the names of all peaks contained in the given cluster.
pval: a numeric vector of length 1 containing the bootstrap p-value (au) for the given cluster.
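The sketch below illustrates how a list of S4 objects with peaks and pval slots can be inspected and filtered. It does not use the package itself: the class demo_cluster, the peak names, and the p-values are hypothetical stand-ins, and the filtering rule (p-value at least alpha, size within peak_no) is an assumption about how such thresholds might be applied, not the package's confirmed logic.

```r
library(methods)

# Hypothetical stand-in for the cluster class (illustration only).
setClass("demo_cluster", representation(peaks = "character", pval = "numeric"))

# Made-up clusters; real ones would come from cluster_spectra().
clusters <- list(
  new("demo_cluster", peaks = c("V10", "V11", "V12", "V13", "V14"), pval = 0.98),
  new("demo_cluster", peaks = c("V20", "V21"), pval = 0.99),
  new("demo_cluster", peaks = c("V30", "V31", "V32", "V33", "V34", "V35"), pval = 0.91)
)

# Slots of an S4 object are read with `@`.
sizes <- vapply(clusters, function(cl) length(cl@peaks), integer(1))
pvals <- vapply(clusters, function(cl) cl@pval, numeric(1))

# Assumed filtering rule: confidence threshold plus a size range.
alpha <- 0.95
peak_no <- c(5, 100)
keep <- pvals >= alpha & sizes >= peak_no[1] & sizes <= peak_no[2]
selected <- clusters[keep]
```

With these made-up values, only the first cluster passes both the confidence and the size criteria.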
peak_table: Peak table from get_peaktable.
chrom_list: A list of chromatograms in matrix form (timepoints x wavelengths).
peak_no: Minimum and maximum thresholds for the number of peaks a cluster may have.
alpha: Confidence threshold for inclusion of cluster.
nboot: Number of bootstrap replicates for pvclust.
plot_dend: Logical. If TRUE, plots dendrogram with bootstrap values.
plot_spectra: Logical. If TRUE, plots overlapping spectra for each cluster.
verbose: Logical. If TRUE, prints progress report to console.
save: Logical. If TRUE, saves pvclust object to current directory.
parallel: Logical. If TRUE, use parallel processing for pvclust.
max.only: Logical. If TRUE, returns only highest level for nested dendrograms.
output: What to return. Either "clusters" to return list of clusters, "pvclust" to return pvclust object, or "both" to return both items.
...: Additional arguments to pvclust.
Ethan Bass
A representative spectrum is selected for each peak in the provided peak table and used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. It is suggested to attach representative spectra to the peak_table using attach_ref_spectra. Otherwise, representative spectra are obtained from the chromatogram with the highest absorbance at lambda max.
Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix, as implemented in pvclust. Finally, bootstrap values can be used to select clusters that exceed a certain confidence threshold as defined by alpha. Clusters can also be filtered by the minimum and maximum size of the cluster using the argument peak_no. If max.only is TRUE, only the largest cluster in a nested dendrogram of clusters meeting the confidence threshold will be returned.
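As a rough illustration of the correlation-based clustering step, the sketch below builds a Pearson-correlation distance between synthetic spectra and clusters them with base R. It is not the package's implementation: the spectra are simulated, the object names are hypothetical, average linkage is an arbitrary choice, and the bootstrap resampling that pvclust adds on top is omitted.

```r
set.seed(1)
wavelengths <- seq(200, 400, by = 2)

# Two hypothetical Gaussian-shaped spectral profiles.
profile_a <- exp(-((wavelengths - 260) / 15)^2)
profile_b <- exp(-((wavelengths - 330) / 20)^2)

# Six synthetic "peaks" (one spectrum per column): three noisy, rescaled
# copies of each profile.
noisy_copy <- function(profile) {
  profile * runif(1, 0.5, 2) + rnorm(length(profile), sd = 0.01)
}
spectra <- cbind(
  sapply(1:3, function(i) noisy_copy(profile_a)),
  sapply(1:3, function(i) noisy_copy(profile_b))
)
colnames(spectra) <- paste0("P", 1:6)

# Pearson correlation between spectra, converted to a distance (1 - r).
d <- as.dist(1 - cor(spectra, method = "pearson"))

# Plain hierarchical clustering; no bootstrap support values here.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
```

Because Pearson correlation ignores scaling, spectra with the same shape but different intensities end up in the same group, which is the property that makes it a reasonable similarity measure for spectra.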
R. Suzuki & H. Shimodaira. 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12):1540-1542. doi:10.1093/bioinformatics/btl117.
data(pk_tab)
data(Sa_warp)
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE, save = FALSE, alpha = 0.97)