Performs integrative non-negative matrix factorization (iNMF) and returns the factorized H, W, and V matrices. The iNMF objective function is optimized by block coordinate descent (alternating non-negative least squares), with the number of factors set by k.
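For reference, the iNMF objective is commonly written in the form below. This is a sketch under stated assumptions rather than text taken from this page: E_i is an assumed symbol for the scaled input matrix of dataset i (cells by genes), d is the number of datasets, and lambda is the regularization argument of this function. H_i, W, and V_i are the factor matrices returned by the function.

```latex
\[
\min_{H_i \ge 0,\; V_i \ge 0,\; W \ge 0}
\sum_{i=1}^{d} \left\lVert E_i - H_i \left( W + V_i \right) \right\rVert_F^2
+ \lambda \sum_{i=1}^{d} \left\lVert H_i V_i \right\rVert_F^2
\]
```

The first sum is the reconstruction error of each dataset. The second sum penalizes the dataset-specific contribution, which is why larger lambda values push the factorization toward the shared W.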
For each dataset, this factorization produces an H matrix (cells by k), a V matrix (k by genes), and a shared W matrix (k by genes). The H matrices represent the cell factor loadings. W is held consistent among all datasets, as it represents the shared components of the metagenes across datasets. The V matrices represent the dataset-specific components of the metagenes.
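To make the matrix shapes concrete, here is a minimal base R sketch that evaluates an iNMF-style objective on toy data. It assumes the commonly cited form of the objective (squared Frobenius reconstruction error of each dataset against H(W + V), plus lambda times the squared Frobenius norm of HV). The function `inmf_objective` and the toy matrices are hypothetical and are not part of the package.

```r
# Hypothetical helper, not part of the package: evaluates an iNMF-style
# objective for lists of per-dataset matrices.
#   E[[i]]: cells_i x genes (input data)
#   H[[i]]: cells_i x k     (cell factor loadings)
#   V[[i]]: k x genes       (dataset-specific metagene components)
#   W:      k x genes       (shared metagene components)
inmf_objective <- function(E, H, W, V, lambda) {
  fit <- 0
  penalty <- 0
  for (i in seq_along(E)) {
    fit <- fit + sum((E[[i]] - H[[i]] %*% (W + V[[i]]))^2)
    penalty <- penalty + sum((H[[i]] %*% V[[i]])^2)
  }
  fit + lambda * penalty
}

set.seed(1)
k <- 2
genes <- 4
cells <- c(3, 5)  # two toy datasets with different numbers of cells

W <- matrix(runif(k * genes), k, genes)
H <- lapply(cells, function(n) matrix(runif(n * k), n, k))
V <- lapply(cells, function(n) matrix(runif(k * genes), k, genes))

# Build the data exactly from the factors, so only the penalty term remains
E <- lapply(seq_along(cells), function(i) H[[i]] %*% (W + V[[i]]))

obj <- inmf_objective(E, H, W, V, lambda = 5)
```

Because the toy data are generated from the factors themselves, the reconstruction term is zero and `obj` equals lambda times the penalty term.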
optimizeALS(object, ...)

# S3 method for list
optimizeALS(
object,
k,
lambda = 5,
thresh = 1e-06,
max.iters = 30,
nrep = 1,
H.init = NULL,
W.init = NULL,
V.init = NULL,
use.unshared = FALSE,
rand.seed = 1,
print.obj = FALSE,
verbose = TRUE,
...
)
# S3 method for liger
optimizeALS(
object,
k,
lambda = 5,
thresh = 1e-06,
max.iters = 30,
nrep = 1,
H.init = NULL,
W.init = NULL,
V.init = NULL,
use.unshared = FALSE,
rand.seed = 1,
print.obj = FALSE,
verbose = TRUE,
...
)
Returns a liger object with H, W, and V slots set.
object: liger object. Normalize, select genes, and scale the data before calling.
...: Arguments passed to other methods.
k: Inner dimension of factorization (number of factors). Run suggestK to determine an appropriate value; as a rule of thumb, datasets with more sub-structure need a higher k.
lambda: Regularization parameter. Larger values penalize dataset-specific effects more strongly (i.e., alignment should increase as lambda increases). Run suggestLambda to determine the most appropriate value for balancing dataset alignment and agreement (default 5.0).
thresh: Convergence threshold. Convergence occurs when |obj0-obj|/(mean(obj0,obj)) < thresh (default 1e-6).
max.iters: Maximum number of block coordinate descent iterations to perform (default 30).
nrep: Number of restarts to perform. The iNMF objective function is non-convex, so taking the best objective from multiple successive initializations is recommended. For easier reproducibility, the random seed is incremented by 1 for each consecutive restart, so future factorizations of the same dataset can be run with one rep if necessary (default 1).
H.init: Initial values to use for the H matrices (default NULL).
W.init: Initial values to use for the W matrix (default NULL).
V.init: Initial values to use for the V matrices (default NULL).
use.unshared: Whether to run the UANLS method to integrate datasets with previously identified unshared variable genes. Requires running selectGenes with unshared = TRUE, followed by scaleNotCenter, beforehand (default FALSE).
rand.seed: Random seed to allow reproducible results (default 1).
print.obj: Print objective function values after convergence (default FALSE).
verbose: Print progress bar/messages (default TRUE).
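The convergence rule and the restart seeding described in the arguments above can be sketched in base R as follows. This is an illustration only: `has_converged` and `restart_seeds` are hypothetical helpers, not package functions. The sketch reads the denominator of the convergence formula as the arithmetic mean of the two objective values, and it assumes the first restart uses rand.seed itself.

```r
# Hypothetical helper: relative change in the objective between two
# consecutive iterations, compared against thresh.
# Assumption: the denominator is the arithmetic mean of obj0 and obj.
has_converged <- function(obj0, obj, thresh = 1e-6) {
  delta <- abs(obj0 - obj) / mean(c(obj0, obj))
  delta < thresh
}

# Hypothetical helper: one seed per restart, incremented by 1 each time.
# Assumption: the first restart uses rand.seed unchanged.
restart_seeds <- function(rand.seed = 1, nrep = 1) {
  rand.seed + seq_len(nrep) - 1
}

big_step <- has_converged(100, 99)          # relative change is about 1e-2
tiny_step <- has_converged(100, 100 - 1e-5) # relative change is about 1e-7
seeds <- restart_seeds(rand.seed = 1, nrep = 3)
```

Under these assumptions, if the third of three restarts gives the best objective, rerunning with rand.seed = 3 and nrep = 1 would start from the same seed, which is the reproducibility shortcut the nrep description refers to.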
ligerex <- createLiger(list(ctrl = ctrl, stim = stim))
ligerex <- normalize(ligerex)
ligerex <- selectGenes(ligerex)
ligerex <- scaleNotCenter(ligerex)
# Minimum specification for fast example pass
ligerex <- optimizeALS(ligerex, k = 5, max.iters = 1)