CIDER

CIDER is a meta-clustering workflow designed to handle scRNA-seq data that span multiple samples or conditions. Often, these datasets are confounded by batch effects or other variables. Many existing batch-removal methods assume near-identical cell population compositions across samples. CIDER, in contrast, leverages inter-group similarity measures to guide clustering without requiring such strict assumptions.
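
To make the idea of similarity-guided clustering concrete, here is a minimal base-R sketch. It is not CIDER's implementation: the group names and similarity values are made up, and average-linkage hierarchical clustering with a cut height of 0.5 is an assumption chosen purely for illustration. It only shows how a matrix of inter-group similarities can decide which batch-specific groups are merged.

```r
# Toy inter-group similarity matrix (made-up values, not CIDER output).
# Each group is an initial cluster found within a single batch.
groups <- c("A_batch1", "A_batch2", "B_batch1", "B_batch2")
sim <- matrix(
  c(1.0, 0.9, 0.1, 0.2,
    0.9, 1.0, 0.2, 0.1,
    0.1, 0.2, 1.0, 0.8,
    0.2, 0.1, 0.8, 1.0),
  nrow = 4, byrow = TRUE, dimnames = list(groups, groups)
)

# Turn similarity into a distance and cluster the groups hierarchically.
hc <- hclust(as.dist(1 - sim), method = "average")

# Cut the tree at an illustrative height; groups below it are merged.
merged <- cutree(hc, h = 0.5)
print(merged)
```

In this toy example the two "A" groups end up in one merged cluster and the two "B" groups in another, even though each pair comes from different batches.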

Highlights

  • Clustering: Overcome confounders in scRNA-seq data (e.g., batch effects) without requiring identical cell-type composition.
  • Evaluation metric: Assess whether integrated data from methods like Seurat-CCA, Harmony, or Scanorama preserve meaningful biological structure—no prior cell-type labels required.

Installation

You can install CIDER from GitHub with:

# install.packages("devtools")
devtools::install_github('zhiyuan-hu-lab/CIDER')

Quick Start: Using CIDER as an Evaluation Metric

If you have already integrated your scRNA-seq data (e.g., using Seurat-CCA, Harmony, or Scanorama) and want to evaluate how well the biological populations align post-integration, you can use CIDER as follows.

  1. Before running the CIDER evaluation functions, make sure that you have a Seurat object (e.g. seu.integrated) with the corrected PCs stored in seu.integrated@reductions$pca@cell.embeddings.
     • Seurat-CCA automatically puts the corrected PCs there.
     • If another method is used, the corrected PCs can be added using
seu.integrated@reductions$pca@cell.embeddings <- corrected.PCs
  2. Run hdbscan clustering (optional) and compute the IDER score:
library(CIDER)
seu.integrated <- hdbscan.seurat(seu.integrated)
ider <- getIDEr(seu.integrated, verbose = FALSE)
seu.integrated <- estimateProb(seu.integrated, ider)
  3. Visualize evaluation scores on t-SNE or UMAP:

The evaluation scores (IDER-based similarity and empirical p values) can be visualized with the scatterPlot function.

library(cowplot) # provides plot_grid()
p1 <- scatterPlot(seu.integrated, "tsne", colour.by = "similarity")
p2 <- scatterPlot(seu.integrated, "tsne", colour.by = "pvalue")
plot_grid(p1, p2, ncol = 2)
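
When the corrected PCs come from a method other than Seurat-CCA, it can help to check the matrix before assigning it into the Seurat object as described in step 1. The sketch below uses base R only. The helper check_corrected_pcs is hypothetical (it is not part of CIDER or Seurat), the toy matrix stands in for real Harmony or Scanorama output, and the "PC_" column naming is an assumption based on Seurat's usual convention.

```r
# Hypothetical helper (not part of CIDER or Seurat): check that a corrected
# embedding matrix covers the cells of the object it will be attached to.
check_corrected_pcs <- function(corrected.PCs, cell.names) {
  stopifnot(is.matrix(corrected.PCs), is.numeric(corrected.PCs))
  if (is.null(rownames(corrected.PCs))) {
    stop("corrected.PCs needs cell names as row names")
  }
  missing.cells <- setdiff(cell.names, rownames(corrected.PCs))
  if (length(missing.cells) > 0) {
    stop(length(missing.cells), " cells have no corrected embedding")
  }
  # Reorder rows to the object's cell order and use Seurat-style column names
  out <- corrected.PCs[cell.names, , drop = FALSE]
  colnames(out) <- paste0("PC_", seq_len(ncol(out)))
  out
}

# Toy data standing in for real integration output (rows in shuffled order)
set.seed(1)
cells <- paste0("cell", 1:5)
toy.PCs <- matrix(rnorm(15), nrow = 5, dimnames = list(rev(cells), NULL))
checked.PCs <- check_corrected_pcs(toy.PCs, cells)

# With a real object, the assignment from step 1 would then become:
# seu.integrated@reductions$pca@cell.embeddings <-
#   check_corrected_pcs(corrected.PCs, colnames(seu.integrated))
```

Reordering by cell name matters because a matrix whose rows are in a different order than the object's cells would attach each embedding to the wrong cell without raising an error.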

For a more detailed walkthrough, see the detailed tutorial on evaluation.

Using CIDER for Clustering Tasks

Install

install.packages('CIDER')

Version

0.99.4

License

MIT + file LICENSE

Maintainer

Zhiyuan Hu

Last Published

February 7th, 2025

Functions in CIDER (0.99.4)

  • hdbscan.seurat: Initial Clustering for Evaluating Integration
  • gatherInitialClusters: Gather Initial Cluster Names
  • getDistMat: Calculate the Similarity Matrix
  • getIDEr: Compute IDER-Based Similarity
  • initialClustering: Initial Clustering
  • getGroupFit: Calculate IDER-Based Similarity Between Two Groups
  • plotNetwork: Plot Network Graph
  • scatterPlot: Scatterplot by a selected feature
  • mergeInitialClusters: Merge Initial Clusters
  • pancreas_meta: Pancreas Metadata
  • calculateDistMatOneModel: Calculate Distance Matrix Using a Single Model
  • downsampling: Downsampling Cells
  • plotDistMat: Plot Similarity Matrix with pheatmap
  • plotHeatmap: Plot Heatmap for the IDER-Based Similarity Matrix
  • estimateProb: Estimate the Empirical Probability of Whether Two Set of Cells from Distinct Batches Belong to the Same Population
  • finalClustering: Final Clustering Step for Meta-Clustering