quantile_norm: Quantile align (normalize) factor loadings

Description

Builds a shared factor neighborhood graph to jointly cluster cells across datasets, then quantile-normalizes the factor loadings of corresponding clusters.

Usage

quantile_norm(object, ...)
# S3 method for list
quantile_norm(
  object,
  quantiles = 50,
  ref_dataset = NULL,
  min_cells = 20,
  knn_k = 20,
  dims.use = NULL,
  do.center = FALSE,
  max_sample = 1000,
  eps = 0.9,
  refine.knn = TRUE,
  rand.seed = 1,
  ...
)
# S3 method for liger
quantile_norm(
  object,
  quantiles = 50,
  ref_dataset = NULL,
  min_cells = 20,
  knn_k = 20,
  dims.use = NULL,
  do.center = FALSE,
  max_sample = 1000,
  eps = 0.9,
  refine.knn = TRUE,
  rand.seed = 1,
  ...
)

Value

liger object with the 'H.norm' and 'clusters' slots set.

Arguments

object: liger object. Run optimizeALS before calling this function.
...: Arguments passed to other methods.
quantiles: Number of quantiles to use for quantile normalization (default 50).
ref_dataset: Name of dataset to use as a "reference" for normalization. By default, the dataset with the largest number of cells is used.
min_cells: Minimum number of cells required to consider a cluster shared across datasets (default 20).
knn_k: Number of nearest neighbors for within-dataset knn graph (default 20).
dims.use: Indices of factors to use for shared nearest factor determination (default 1:ncol(H[[1]])).
do.center: Whether to center the data when scaling factors; useful for less sparse modalities such as methylation data (default FALSE).
max_sample: Maximum number of cells used for quantile normalization of each cluster and factor (default 1000).
eps: The error bound of the nearest neighbor search (default 0.9). Lower values give more accurate nearest neighbor graphs but take much longer to compute.
refine.knn: Whether to increase robustness of cluster assignments using the KNN graph (default TRUE).
rand.seed: Random seed to allow reproducible results (default 1).

Details

The first step, building the shared factor neighborhood graph, is performed in SNF(), and produces a graph representation where edge weights between cells (across all datasets) correspond to their similarity in the shared factor neighborhood space. An important parameter here is knn_k, the number of neighbors used to build the shared factor space.
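To make the role of knn_k more concrete, the following is a minimal base R sketch of a k-nearest-neighbor lookup on a matrix of factor loadings. It is not the package's implementation: the toy matrix `H` and the helper `knn_indices` are hypothetical, and the search here is exact and brute-force, whereas the package uses an approximate search whose error bound is set by eps.

```r
# Hypothetical toy data: 6 cells (rows) x 2 factors (columns),
# forming two well-separated groups of 3 cells each.
H <- matrix(c(0.0, 0.1,
              0.1, 0.0,
              0.2, 0.1,
              5.0, 5.1,
              5.1, 5.0,
              5.2, 5.1), ncol = 2, byrow = TRUE)

# Exact, brute-force k nearest neighbors (illustrative helper, not part of
# the package). Returns an n x k matrix of neighbor row indices.
knn_indices <- function(H, k) {
  d <- as.matrix(dist(H))   # pairwise Euclidean distances between cells
  diag(d) <- Inf            # a cell is not counted as its own neighbor
  nn <- lapply(seq_len(nrow(d)), function(i) order(d[i, ])[seq_len(k)])
  matrix(unlist(nn), ncol = k, byrow = TRUE)
}

nn <- knn_indices(H, k = 2)
```

In this toy example each cell's two nearest neighbors fall inside its own group. A larger k pulls in more distant cells, which is why the choice of knn_k affects how neighborhoods are defined.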

Next we perform quantile alignment for each dataset, factor, and cluster (by stretching/compressing datasets' quantiles to better match those of the reference dataset). These aligned factor loadings are combined into a single matrix and returned as H.norm.
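The stretching and compressing of quantiles described above can be sketched in base R for a single factor within a single cluster. This is an illustration under stated assumptions, not the package's code: the helper `quantile_align`, the evenly spaced probability grid, and the simulated loadings are all hypothetical choices made for the example.

```r
# Map the values in x onto the distribution of ref by matching quantiles.
# Illustrative helper only; the probability grid is an assumption of this sketch.
quantile_align <- function(x, ref, quantiles = 50) {
  probs <- seq(0, 1, length.out = quantiles)
  qx <- quantile(x, probs, names = FALSE)      # quantiles of the dataset to align
  qref <- quantile(ref, probs, names = FALSE)  # quantiles of the reference dataset
  # Piecewise-linear map sending each quantile of x to the matching quantile of ref.
  approx(qx, qref, xout = x, rule = 2)$y
}

set.seed(1)
ref <- rnorm(500, mean = 0, sd = 1)  # simulated reference loadings
x <- rnorm(300, mean = 3, sd = 2)    # simulated loadings on a different scale
aligned <- quantile_align(x, ref)
```

After alignment, `aligned` spans the same range as `ref` while preserving the ordering of the cells in `x`. Repeating this for every dataset, factor, and cluster is the idea behind the combined matrix described above.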

Examples

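# Assumes `ctrl` and `stim` are gene-by-cell count matrices already in the workspace.
ligerex <- createLiger(list(ctrl = ctrl, stim = stim))
# Preprocess: normalize, select variable genes, and scale without centering.
ligerex <- normalize(ligerex)
ligerex <- selectGenes(ligerex)
ligerex <- scaleNotCenter(ligerex)
# Factorize the datasets; max.iters = 1 is presumably chosen to keep the example fast.
ligerex <- optimizeALS(ligerex, k = 5, max.iters = 1)
# Jointly cluster cells and quantile-normalize the factor loadings.
ligerex <- quantile_norm(ligerex)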