Cluster Data Based on Different Methods
clusterData(
  obj = NULL,
  scaleData = TRUE,
  cluster.method = c("mfuzz", "TCseq", "kmeans", "wgcna"),
  TCseq_params_list = list(),
  object = NULL,
  min.std = 0,
  cluster.num = NULL,
  subcluster = NULL,
  seed = 5201314,
  ...
)
A list containing the following clustering results:
wide.res: A wide-format data frame with clusters and normalized expression levels.
long.res: A long-format data frame for visualizations, containing cluster information, normalized values, cluster names, and memberships.
cluster.list: A list where each element contains genes belonging to a specific cluster.
type: The clustering method used ("mfuzz", "TCseq", "kmeans", or "wgcna").
geneMode: Currently set to "none" (reserved for future use).
geneType: Currently set to "none" (reserved for future use).
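The relationship between the wide and long result formats can be illustrated with a small base R sketch. This is not the package's internal code: the toy data frame and the column names (gene, cluster, sample, norm_value) are assumptions chosen for illustration, and the real wide.res and long.res may use different names and carry extra columns such as memberships.

```r
# Hypothetical wide-format table: one row per gene, one column per sample.
wide <- data.frame(gene = c("g1", "g2"),
                   cluster = c(1, 2),
                   t1 = c(-1, 1), t2 = c(0, 0), t3 = c(1, -1))
sample_cols <- c("t1", "t2", "t3")

# Long format: one row per gene/sample pair, convenient for plotting.
long <- data.frame(
  gene = rep(wide$gene, times = length(sample_cols)),
  cluster = rep(wide$cluster, times = length(sample_cols)),
  sample = rep(sample_cols, each = nrow(wide)),
  norm_value = unlist(wide[sample_cols], use.names = FALSE)
)

# Per-cluster gene lists, similar in spirit to cluster.list.
cluster_list <- split(wide$gene, wide$cluster)
```

Each gene appears once in the wide table and once per sample in the long table, which is why the long form suits line plots of expression over samples.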
obj: An input object of one of two types: a cell_data_set object for trajectory analysis, or a matrix or data.frame containing expression data.
scaleData: Logical. Whether to scale the data (e.g., z-score normalization).
cluster.method: Character. Clustering method to use; one of "mfuzz", "TCseq", "kmeans", or "wgcna".
TCseq_params_list: A list of additional parameters passed to the TCseq::timeclust function.
object: A pre-calculated object required when using "wgcna" as the clustering method.
min.std: Numeric. Minimum standard deviation for filtering expression data.
cluster.num: Integer. The number of clusters to identify.
subcluster: A numeric vector of specific cluster IDs to include in the results. If NULL, all clusters are included.
seed: An integer seed for reproducibility in clustering operations.
...: Additional arguments passed to internal functions such as pre_pseudotime_matrix.
If the WGCNA method is selected, the object parameter must contain a pre-calculated WGCNA network object. This is typically obtained using the WGCNA package functions.
Use the subcluster parameter to focus on specific clusters. Cluster IDs not included in the subcluster vector will be excluded from the final results.
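The effect of subcluster can be pictured as a simple row filter on cluster IDs. The sketch below uses a made-up data frame and plain base R subsetting; it is an assumption about the behavior described above, not the function's actual implementation.

```r
# Hypothetical clustering result: one row per gene with a cluster label.
wide <- data.frame(gene = paste0("gene", 1:6),
                   cluster = c(1, 2, 3, 1, 2, 3))

subcluster <- c(1, 3)  # cluster IDs to keep; NULL would keep everything

# Keep only rows whose cluster ID is listed in subcluster.
kept <- if (is.null(subcluster)) {
  wide
} else {
  wide[wide$cluster %in% subcluster, ]
}
```

Here the genes assigned to cluster 2 are dropped, and the remaining rows keep their original cluster IDs.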
JunZhang
This function performs clustering on input data using one of four methods: mfuzz, TCseq, kmeans, or wgcna. The clustering results include metadata, normalized data, and cluster memberships.
Depending on the selected cluster.method, different clustering algorithms are used:
"mfuzz": Applies the Mfuzz soft clustering method, suitable for identifying overlapping clusters.
"TCseq": Uses TCseq clustering for time-series expression data, with support for additional parameters.
"kmeans": Employs standard k-means clustering via base R's stats::kmeans.
"wgcna": Leverages pre-calculated WGCNA (Weighted Gene Co-expression Network Analysis) networks.
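To make the "kmeans" option concrete, here is a minimal base R sketch of k-means clustering on z-scored expression profiles. The toy data, the nstart value, and the way gene lists are collected are assumptions for illustration; the function itself may call stats::kmeans with different settings.

```r
set.seed(5201314)  # fixed seed so the toy data and clustering are reproducible

# Toy data: two groups of hypothetical genes with opposite trends over 5 time points.
up   <- t(replicate(10, 1:5 + rnorm(5, sd = 0.1)))
down <- t(replicate(10, 5:1 + rnorm(5, sd = 0.1)))
expr <- rbind(up, down)
rownames(expr) <- paste0("gene", seq_len(nrow(expr)))

# z-score each gene across time points (scale() works column-wise, hence t()).
scaled <- t(scale(t(expr)))

# Standard k-means with 2 clusters; nstart = 10 tries several random starts.
km <- stats::kmeans(scaled, centers = 2, nstart = 10)

# Gene names grouped by assigned cluster.
cluster_list <- split(names(km$cluster), km$cluster)
```

Because k-means starts from random centers, fixing the seed before the call is what makes the cluster assignment repeatable between runs.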
The function is designed to be flexible, allowing preprocessing (e.g., filtering by min.std), scaling the data (scaleData = TRUE), and generating results compatible with data visualization pipelines.
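The preprocessing steps mentioned above can be sketched in base R as follows. This is an illustration under assumptions: the threshold is treated as exclusive (genes are kept when their standard deviation is strictly greater than min_std), and scaling is done as a per-gene z-score; the function's internals may differ.

```r
# Toy expression matrix: 6 hypothetical genes x 4 samples.
set.seed(1)
expr <- matrix(rnorm(24, mean = 5), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))
expr["gene3", ] <- 5  # a constant gene with zero standard deviation

# Filtering: keep genes whose row-wise standard deviation exceeds the threshold.
min_std <- 0
row_sd <- apply(expr, 1, sd)
filtered <- expr[row_sd > min_std, , drop = FALSE]

# Scaling: z-score each gene across samples (scale() works column-wise,
# so transpose before and after).
scaled <- t(scale(t(filtered)))
```

Filtering before scaling matters here: a constant gene has zero standard deviation, so z-scoring it would divide by zero and produce NaN values.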
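library(ClusterGVis)  # package providing clusterData() and the exps dataset
data("exps")

# kmeans: partition the expression data into 8 clusters
ck <- clusterData(obj = exps,
                  cluster.method = "kmeans",
                  cluster.num = 8)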