Determine Optimal Clusters for Gene Expression or Pseudotime Data
getClusters(obj = NULL, ...)
A ggplot object visualizing the Elbow plot, where:
The x-axis represents the number of clusters tested.
The y-axis represents the within-cluster sum of squares (WSS) obtained at each cluster number.
The optimal cluster number can be identified visually at the "elbow point," beyond which adding more clusters yields only a small further reduction in WSS.
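The kind of plot described above can be sketched as follows. This is only an illustration under stated assumptions, not the function's actual implementation: it assumes k-means (stats::kmeans) as the clustering step, uses simulated data in place of a real expression matrix, and the names mat, ks, wss, and elbow_df are made up for the example.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for an expression matrix: 60 rows (genes) x 8 columns,
# built from three groups of rows with different mean levels.
mat <- rbind(
  matrix(rnorm(20 * 8, mean = 0), ncol = 8),
  matrix(rnorm(20 * 8, mean = 4), ncol = 8),
  matrix(rnorm(20 * 8, mean = 8), ncol = 8)
)

# Candidate numbers of clusters to test.
ks <- 1:10

# Total within-cluster sum of squares for each candidate k.
# Assumption: k-means is used; the documented function may differ.
wss <- vapply(ks, function(k) {
  kmeans(mat, centers = k, nstart = 10)$tot.withinss
}, numeric(1))

elbow_df <- data.frame(k = ks, wss = wss)

# Elbow plot: number of clusters on the x-axis, WSS on the y-axis.
p <- ggplot(elbow_df, aes(x = k, y = wss)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = ks) +
  labs(x = "Number of clusters", y = "Within-cluster sum of squares (WSS)")
p
```

With data like this, the curve is expected to fall steeply over the first few values of k and then flatten; the bend between the two regimes is the elbow.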
A data object representing the gene expression data or pseudotime data:
If the input is a cell_data_set object (e.g., from Monocle3), the function preprocesses the data using pre_pseudotime_matrix.
If the input is a numeric matrix or a data.frame, it directly uses this data.
Default is NULL.
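The input handling described for obj could look roughly like the sketch below. The helper prepare_input is hypothetical and is not part of the documented API; it also assumes that pre_pseudotime_matrix accepts the cell_data_set as its first argument, which this page does not confirm.

```r
# Hypothetical helper illustrating how the two input types might be handled.
prepare_input <- function(obj = NULL, ...) {
  if (is.null(obj)) {
    stop("`obj` must be supplied.")
  }
  if (inherits(obj, "cell_data_set")) {
    # Assumption: pre_pseudotime_matrix() comes from the same package and
    # takes the cell_data_set first; extra arguments are forwarded via `...`.
    return(pre_pseudotime_matrix(obj, ...))
  }
  if (is.data.frame(obj)) {
    # A data.frame is converted so that later steps see a plain matrix.
    obj <- as.matrix(obj)
  }
  if (is.matrix(obj) && is.numeric(obj)) {
    return(obj)
  }
  stop("`obj` must be a cell_data_set, a numeric matrix, or a data.frame.")
}

# A small data.frame is used directly, without preprocessing.
df <- data.frame(s1 = c(1.2, 3.4, 5.6), s2 = c(2.1, 4.3, 6.5))
m <- prepare_input(df)
dim(m)
```

Checking the class with inherits() keeps the sketch free of a hard dependency on Monocle3 when only matrices or data frames are supplied.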
Additional arguments passed to the preprocessing function pre_pseudotime_matrix (e.g., assays, normalize, etc.).
JunZhang
The getClusters function helps determine the optimal number of clusters for a given data object.
It supports multiple input types, including gene expression matrices and objects such as cell_data_set.
The function implements the Elbow method: it evaluates the within-cluster sum of squares (WSS) across a range of cluster numbers and visualizes the results.
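For reference, WSS for a partition into k clusters is conventionally defined as below. The symbols are introduced here for illustration and do not appear in the original documentation: C_j denotes the j-th cluster, x_i an observation assigned to it, and mu_j the centroid of C_j.

```latex
\mathrm{WSS}(k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2
```

For a fixed clustering procedure, WSS generally decreases as k grows, so the Elbow method looks for the point where further increases in k stop paying off rather than for a minimum.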