CIDER
CIDER is a meta-clustering workflow designed to handle scRNA-seq data that span multiple samples or conditions. Often, these datasets are confounded by batch effects or other variables. Many existing batch-removal methods assume near-identical cell population compositions across samples. CIDER, in contrast, leverages inter-group similarity measures to guide clustering without requiring such strict assumptions.
- Genome Biology (2021) publication: CIDER article
- Original prototype: Hu et al., Cancer Cell 2020
Highlights
- Clustering: Overcome confounders in scRNA-seq data (e.g., batch effects) without requiring identical cell-type composition.
- Evaluation metric: Assess whether integrated data from methods like Seurat-CCA, Harmony, or Scanorama preserve meaningful biological structure—no prior cell-type labels required.
Installation
You can install CIDER from GitHub with:
# install.packages("devtools")
devtools::install_github('zhiyuan-hu-lab/CIDER')
Quick Start: Using CIDER as an Evaluation Metric
If you have already integrated your scRNA-seq data (e.g., using Seurat-CCA, Harmony, or Scanorama) and want to evaluate how well the biological populations align post-integration, you can use CIDER as follows.
- Before running the CIDER evaluation functions, make sure that you have a Seurat object (e.g. seu.integrated) with corrected PCs in seu.integrated@reductions$pca@cell.embeddings.
- Seurat-CCA automatically puts the corrected PCs there.
- If other methods are used, the corrected PCs can be added using
seu.integrated@reductions$pca@cell.embeddings <- corrected.PCs
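When the corrected PCs come from another method, it may be worth checking them before the assignment above. The following is a minimal base-R sketch, not part of CIDER or Seurat: check_corrected_pcs is a hypothetical helper, and it assumes the embedding matrix has one row per cell, with cell names as row names. It reorders the rows to match the object's cell order and adds PC_1, PC_2, ... column names if none are present.

```r
# Hypothetical helper (not part of CIDER or Seurat): sanity-check a corrected
# embedding matrix before storing it in the Seurat object.
check_corrected_pcs <- function(corrected.PCs, cell.names) {
  stopifnot(is.matrix(corrected.PCs), is.numeric(corrected.PCs))
  if (is.null(rownames(corrected.PCs))) {
    stop("corrected.PCs needs cell names as row names")
  }
  # Every cell in the object must have a corrected embedding
  missing.cells <- setdiff(cell.names, rownames(corrected.PCs))
  if (length(missing.cells) > 0) {
    stop(length(missing.cells), " cell(s) have no corrected embedding")
  }
  # Reorder rows to the object's cell order and drop any extra cells
  out <- corrected.PCs[cell.names, , drop = FALSE]
  # Assumption: components are labeled PC_1, PC_2, ... when unnamed
  if (is.null(colnames(out))) {
    colnames(out) <- paste0("PC_", seq_len(ncol(out)))
  }
  out
}

# Toy example: three cells, two components, rows in a shuffled order
cells <- c("cell_a", "cell_b", "cell_c")
emb <- matrix(1:6 / 10, nrow = 3,
              dimnames = list(c("cell_c", "cell_a", "cell_b"), NULL))
aligned <- check_corrected_pcs(emb, cells)

# With a real object, the checked matrix could then be assigned as above:
# seu.integrated@reductions$pca@cell.embeddings <-
#   check_corrected_pcs(corrected.PCs, colnames(seu.integrated))
```

Failing early on missing or misordered cells is a design choice here; a silent mismatch between embeddings and cells would otherwise carry through to the evaluation scores.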
- Run HDBSCAN clustering (optional) and compute the IDER score:
library(CIDER)
seu.integrated <- hdbscan.seurat(seu.integrated)
ider <- getIDEr(seu.integrated, verbose = FALSE)
seu.integrated <- estimateProb(seu.integrated, ider)
- Visualize evaluation scores on t-SNE or UMAP:
The evaluation scores (IDER-based similarity and empirical p values) can be visualized with the scatterPlot function.
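library(cowplot)  # provides plot_grid()
# Colour cells by IDER-based similarity and by empirical p value
p1 <- scatterPlot(seu.integrated, "tsne", colour.by = "similarity")
p2 <- scatterPlot(seu.integrated, "tsne", colour.by = "pvalue")
plot_grid(p1, p2, ncol = 2)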
For a more detailed walkthrough, see the tutorial on evaluation.