optimizeALS: Perform iNMF on scaled datasets

Description

Perform integrative non-negative matrix factorization to return factorized H, W, and V matrices. It optimizes the iNMF objective function using block coordinate descent (alternating non-negative least squares), where the number of factors is set by k. TODO: include objective function equation here in documentation (using deqn)

For each dataset, this factorization produces an H matrix (cells by k), a V matrix (k by genes), and a shared W matrix (k by genes). The H matrices represent the cell factor loadings. W is held consistent among all datasets, as it represents the shared components of the metagenes across datasets. The V matrices represent the dataset-specific components of the metagenes.

Usage

optimizeALS(object, ...)
# S3 method for list
optimizeALS(
  object,
  k,
  lambda = 5,
  thresh = 1e-06,
  max.iters = 30,
  nrep = 1,
  H.init = NULL,
  W.init = NULL,
  V.init = NULL,
  use.unshared = FALSE,
  rand.seed = 1,
  print.obj = FALSE,
  verbose = TRUE,
  ...
)
# S3 method for liger
optimizeALS(
  object,
  k,
  lambda = 5,
  thresh = 1e-06,
  max.iters = 30,
  nrep = 1,
  H.init = NULL,
  W.init = NULL,
  V.init = NULL,
  use.unshared = FALSE,
  rand.seed = 1,
  print.obj = FALSE,
  verbose = TRUE,
  ...
)

Value

liger object with H, W, and V slots set.

Arguments

object: liger object. Should normalize, select genes, and scale before calling.
...: Arguments passed to other methods
k: Inner dimension of factorization (number of factors). Run suggestK to determine appropriate value; a general rule of thumb is that a higher k will be needed for datasets with more sub-structure.
lambda: Regularization parameter. Larger values penalize dataset-specific effects more strongly (ie. alignment should increase as lambda increases). Run suggestLambda to determine most appropriate value for balancing dataset alignment and agreement (default 5.0).
thresh: Convergence threshold. Convergence occurs when |obj0-obj|/(mean(obj0,obj)) < thresh. (default 1e-6)
max.iters: Maximum number of block coordinate descent iterations to perform (default 30).
nrep: Number of restarts to perform (iNMF objective function is non-convex, so taking the best objective from multiple successive initializations is recommended). For easier reproducibility, this increments the random seed by 1 for each consecutive restart, so future factorizations of the same dataset can be run with one rep if necessary. (default 1)
H.init: Initial values to use for H matrices. (default NULL)
W.init: Initial values to use for W matrix (default NULL)
V.init: Initial values to use for V matrices (default NULL)
use.unshared: Whether to run UANLS method to integrate datasets with previously identified unshared variable genes. Have to run selectGenes with unshared = TRUE and scaleNotCenter it. (default FALSE).
rand.seed: Random seed to allow reproducible results (default 1).
print.obj: Print objective function values after convergence (default FALSE).
verbose: Print progress bar/messages (TRUE by default)

Examples

Run this code

ligerex <- createLiger(list(ctrl = ctrl, stim = stim))
ligerex <- normalize(ligerex)
ligerex <- selectGenes(ligerex)
ligerex <- scaleNotCenter(ligerex)
# Minimum specification for fast example pass
ligerex <- optimizeALS(ligerex, k = 5, max.iters = 1)

Run the code above in your browser using DataLab