hierarchical_term_clustering: Hierarchical Clustering of Enriched Terms

Description

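Performs hierarchical clustering of enriched terms based on their pairwise kappa statistics and partitions them into the optimal number of clusters.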

Usage

hierarchical_term_clustering(
  kappa_mat,
  enrichment_res,
  use_description = FALSE,
  clu_method = "average",
  plot_hmap = FALSE,
  plot_dend = TRUE
)

Arguments

kappa_mat

matrix of kappa statistics (output of create_kappa_matrix)

enrichment_res

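data frame of pathfindR enrichment results. Required columns are "Term_Description" (if use_description = TRUE) or "ID" (if use_description = FALSE), "Down_regulated", and "Up_regulated". If use_active_snw_genes = TRUE, "non_Signif_Snw_Genes" must also be provided. Note that use_active_snw_genes is not an argument of this function; it appears to refer to the option used when creating kappa_mat (see create_kappa_matrix).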

use_description

Boolean argument to indicate whether term descriptions (in the "Term_Description" column) should be used. (default = FALSE)

clu_method

the agglomeration method to be used (default = "average", see hclust)

plot_hmap

boolean to indicate whether to plot the kappa statistics clustering heatmap or not (default = FALSE)

plot_dend

boolean to indicate whether to plot the clustering dendrogram partitioned into the optimal number of clusters (default = TRUE)

Value

a vector of cluster assignments, one for each enriched term in the enrichment results.

Details

The function first performs hierarchical clustering of the enriched terms in `enrichment_res` using the kappa statistics (defining the distance as `1 - kappa_statistic`) and the agglomeration method given by `clu_method`. Next, the clustering dendrogram is cut into k = 2, 3, ..., n - 1 clusters (where n is the number of terms). The optimal number of clusters is the k value that yields the highest average silhouette width.
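
The procedure described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the kappa matrix `kappa_toy` and its values are made up, and the average silhouette width is computed with `silhouette()` from the cluster package, which may differ from what pathfindR uses internally.

```r
library(cluster)  # assumption: silhouette() from 'cluster' is used here for illustration

# Toy symmetric kappa matrix for 6 hypothetical terms (values are made up)
terms <- paste0("term", 1:6)
kappa_toy <- matrix(0.05, nrow = 6, ncol = 6, dimnames = list(terms, terms))
kappa_toy[1:3, 1:3] <- 0.8
kappa_toy[4:6, 4:6] <- 0.7
diag(kappa_toy) <- 1

# Distance is defined as 1 - kappa, followed by agglomerative clustering
dist_obj <- as.dist(1 - kappa_toy)
clu <- hclust(dist_obj, method = "average")

# Cut the dendrogram into k = 2, ..., n - 1 clusters and
# record the average silhouette width for each k
n <- nrow(kappa_toy)
k_values <- 2:(n - 1)
avg_sil <- sapply(k_values, function(k) {
  sil <- silhouette(cutree(clu, k = k), dist_obj)
  mean(sil[, "sil_width"])
})

# Keep the k with the highest average silhouette width
k_opt <- k_values[which.max(avg_sil)]
clusters <- cutree(clu, k = k_opt)
```

Here `clusters` is a named integer vector with one cluster label per term, analogous to the value this function returns. With real data, `kappa_toy` would be replaced by the output of create_kappa_matrix.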

Examples

# NOT RUN {
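# kappa_mat is the output of create_kappa_matrix; enrichment_res is a
# data frame of pathfindR enrichment results
hierarchical_term_clustering(kappa_mat, enrichment_res)
# use a different agglomeration method (passed on to hclust)
hierarchical_term_clustering(kappa_mat, enrichment_res, clu_method = "complete")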
# }
