Use this function to select an appropriate value of k for factorizing a particular dataset. It plots the median (across cells in all datasets) K-L divergence from uniform of the cell factor loadings as a function of k. The divergence should increase as k increases but is expected to level off above a sufficiently high number of factors. This is because cells should have factor loadings that are not uniformly distributed once an appropriate number of factors is reached.
Depending on the number of cores used, this process can take 10-20 minutes.
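To make the plotted quantity concrete, here is a minimal sketch of the K-L divergence from uniform for a matrix of cell factor loadings. It is an illustration, not the package's internal code: the function name kl_from_uniform and the input H (cells by factors, non-negative, each row with a positive sum) are assumptions, and base-2 logarithms are assumed so that the maximum possible value is log2(k).

```r
# Hypothetical helper: K-L divergence of each cell's normalized factor
# loadings from the uniform distribution over k factors.
# H: numeric matrix, one row per cell, one column per factor (assumed
#    non-negative with a positive sum in every row).
kl_from_uniform <- function(H) {
  k <- ncol(H)
  P <- H / rowSums(H)                        # each row becomes a distribution
  # KL(p || uniform) = log2(k) + sum(p * log2(p)), treating 0 * log2(0) as 0
  terms <- ifelse(P > 0, P * log2(P), 0)
  log2(k) + rowSums(terms)
}

H <- rbind(c(1, 1, 1, 1),   # uniform loadings: divergence is 0
           c(0, 0, 3, 0))   # single-factor loadings: divergence is log2(4)
kl_from_uniform(H)
```

A cell loaded equally on all factors scores 0, while a cell loaded on a single factor reaches the upper bound log2(k).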
suggestK(
object,
k.test = seq(5, 50, 5),
lambda = 5,
thresh = 1e-04,
max.iters = 100,
num.cores = 1,
rand.seed = 1,
gen.new = FALSE,
nrep = 1,
plot.log2 = TRUE,
return.data = FALSE,
return.raw = FALSE,
verbose = TRUE
)
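A ggplot object by default. If return.data is TRUE, the results are returned instead, either as a data frame or, with return.raw = TRUE, as a list of matrices. Also plots K-L divergence vs. k.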
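As a rough sketch of how raw results could be summarized, the snippet below assumes a list with one matrix per replicate, each of size length(k.test) by n_cells, as described for return.raw. The helper summarise_raw and the toy input are hypothetical, and taking the median across cells followed by the mean across replicates is one plausible summary, not necessarily the one used internally.

```r
# Hypothetical helper: collapse raw K-L divergences to one value per k.
# raw:    list of matrices (rows = values of k, columns = cells), one per replicate
# k_test: the k values that the rows correspond to
summarise_raw <- function(raw, k_test) {
  # Median across cells for every k, computed separately for each replicate
  med <- sapply(raw, function(m) apply(m, 1, median))
  med <- matrix(med, nrow = length(k_test))  # keep matrix shape when nrep == 1
  data.frame(k = k_test, median_kl = rowMeans(med))
}

# Toy input standing in for real output: 2 values of k, 3 cells, 2 replicates
raw <- list(
  matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE),
  matrix(c(3, 4, 5, 6, 7, 8), nrow = 2, byrow = TRUE)
)
summarise_raw(raw, k_test = c(5, 10))
```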
object: liger object. Normalize, select genes, and scale before calling.
k.test: Set of factor numbers to test (default seq(5, 50, 5)).
lambda: Lambda to use for all factorizations (default 5).
thresh: Convergence threshold. Convergence occurs when |obj0-obj|/(mean(obj0,obj)) < thresh (default 1e-4).
max.iters: Maximum number of block coordinate descent iterations to perform (default 100).
num.cores: Number of cores to use for optimizing factorizations in parallel (default 1).
rand.seed: Random seed for reproducibility (default 1).
gen.new: If TRUE, do not use optimizeNewK in factorizations; this results in slower factorizations (default FALSE).
nrep: Number of restarts to perform at each k value tested (increase to produce a smoother curve if results are unclear) (default 1).
plot.log2: Plot log2 curve for reference on the K-L plot (log2 is the upper bound and can sometimes help in identifying the "elbow" of the plot) (default TRUE).
return.data: Whether to return a list of data matrices (raw) or a data frame (processed) instead of the ggplot object (default FALSE).
return.raw: If return.data is TRUE, whether to return raw data (in the format described below) or the data frame used to produce the ggplot object. Raw data is a list of matrices of K-L divergences (length(k.test) by n_cells). The length of the list corresponds to nrep (default FALSE).
verbose: Print progress bar/messages (default TRUE).
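The convergence criterion stated for thresh can be written as a small check. This is only a sketch of the formula as given; the function name has_converged is hypothetical, and obj0 and obj are assumed to be the objective values from the previous and current iteration.

```r
# Hypothetical helper: relative-change convergence test,
# |obj0 - obj| / mean(obj0, obj) < thresh
has_converged <- function(obj0, obj, thresh = 1e-4) {
  # c() is needed so that both values are averaged; mean(obj0, obj)
  # would pass obj as the trim argument instead
  abs(obj0 - obj) / mean(c(obj0, obj)) < thresh
}

has_converged(100, 100.001)  # relative change of about 1e-5
has_converged(100, 110)      # relative change of about 0.095
```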
# \donttest{
ligerex <- createLiger(list(ctrl = ctrl, stim = stim))
ligerex <- normalize(ligerex)
ligerex <- selectGenes(ligerex)
ligerex <- scaleNotCenter(ligerex)
suggestK(ligerex, k.test = c(5,6), max.iters = 1)
# }
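Once a curve of median K-L divergence against k is available, the point where it levels off can be located by eye or with a simple heuristic. The sketch below is one such heuristic and is not part of the package: the function suggest_elbow, its frac parameter, and the example values are all hypothetical.

```r
# Hypothetical heuristic: return the first k after which the gain in
# median K-L divergence falls below a fraction of the curve's total range.
suggest_elbow <- function(k, kl, frac = 0.05) {
  gains <- diff(kl)                                # change from each k to the next
  small <- which(gains < frac * (max(kl) - min(kl)))
  if (length(small) == 0) return(k[length(k)])     # curve never levels off
  k[small[1]]
}

# Made-up curve that flattens after k = 15
k_vals  <- c(5, 10, 15, 20, 25)
kl_vals <- c(1.0, 1.8, 2.2, 2.25, 2.26)
suggest_elbow(k_vals, kl_vals)
```

A larger frac picks an earlier, more conservative k; increasing nrep smooths the curve and makes any such heuristic more stable.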