fit_models: Fits a model for each gene in a cell_data_set object.

Description

This function fits a generalized linear model for each gene in a cell_data_set. Formulae can be provided to account for additional covariates (e.g. day collected, genotype of cells, or media conditions).
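As a rough illustration of per-gene fitting, the sketch below fits one quasi-Poisson GLM per row of a small simulated count matrix using base R's glm(). It does not show monocle3's internals. The names cell_meta, counts, and expression are hypothetical, and prepending a response name to a one-sided formula string is a convention assumed for this sketch.

```r
# Minimal base-R sketch (not monocle3 internals): one quasi-Poisson GLM per gene.
set.seed(1)

n_cells <- 60

# Hypothetical per-cell covariates, standing in for the cell metadata
cell_meta <- data.frame(
  day      = rep(c(0, 1, 2), each = 20),
  genotype = factor(rep(c("wt", "mut"), times = 30), levels = c("wt", "mut"))
)

# Hypothetical genes-by-cells count matrix
counts <- rbind(
  gene_a = rpois(n_cells, lambda = exp(0.5 + 0.8 * cell_meta$day)),  # depends on day
  gene_b = rpois(n_cells, lambda = 5)                                # no covariate effect
)

# One-sided formula string; this sketch prepends a response name (an assumption)
model_formula_str <- "~ day + genotype"
full_formula <- as.formula(paste("expression", model_formula_str))

# Fit the same model to every gene, one row of the matrix at a time
fits <- lapply(rownames(counts), function(gene) {
  df <- cbind(cell_meta, expression = counts[gene, ])
  glm(full_formula, data = df, family = quasipoisson())
})
names(fits) <- rownames(counts)

# Coefficient table (estimate, standard error, t value, p value) for one gene
coef(summary(fits$gene_a))
```

Each covariate named in the formula string gets its own coefficient per gene, so the same formula can be used to compare effects across genes.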

Usage

fit_models(
  cds,
  model_formula_str,
  expression_family = "quasipoisson",
  reduction_method = "UMAP",
  cores = 1,
  clean_model = TRUE,
  verbose = FALSE,
  ...
)

Arguments

cds

The cell_data_set upon which to perform this operation.

model_formula_str

A formula string specifying the model to fit for the genes.

expression_family

Specifies the family function used for expression responses. Can be one of "quasipoisson", "negbinomial", "poisson", "binomial", "gaussian", "zipoisson", or "zinegbinomial". Default is "quasipoisson".
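To illustrate how the choice of family can matter, the following base-R sketch compares "poisson" and "quasipoisson" fits on simulated overdispersed counts. The data are hypothetical and stats::glm() is used here only for illustration. The point estimates coincide, while the quasi-Poisson standard errors are scaled by the square root of the estimated dispersion.

```r
# Base-R sketch: Poisson vs quasi-Poisson on overdispersed counts (simulated data).
set.seed(2)

x <- rep(c(0, 1), each = 50)
# Negative binomial draws give more variance than a Poisson model assumes
y <- rnbinom(100, mu = exp(1 + 0.5 * x), size = 2)

fit_pois  <- glm(y ~ x, family = poisson())
fit_quasi <- glm(y ~ x, family = quasipoisson())

# The coefficient estimates are the same under both families
same_coef <- isTRUE(all.equal(coef(fit_pois), coef(fit_quasi)))

# The quasi-Poisson fit estimates a dispersion parameter from the data
dispersion <- summary(fit_quasi)$dispersion
se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]

# Quasi-Poisson standard errors equal the Poisson ones times sqrt(dispersion)
scaled_se <- isTRUE(all.equal(se_quasi, se_pois * sqrt(dispersion)))
```

When the dispersion is above 1, the quasi-Poisson p-values are more conservative than the Poisson ones for the same data.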

reduction_method

Which method to use with clusters() and partitions(). Default is "UMAP".

cores

The number of processor cores to use during fitting. Default is 1.

clean_model

Logical indicating whether to clean each fitted model object, removing stored data that is not needed downstream in order to reduce memory use. Default is TRUE.

verbose

Logical indicating whether to emit progress messages. Default is FALSE.

...

Additional arguments passed to model fitting functions.

Value

A tibble with one row per gene and the following columns:

id character vector from rowData(cds)$id
gene_short_names character vector from rowData(cds)$gene_short_names
num_cells_expressed int vector from rowData(cds)$num_cells_expressed
gene_id character vector from row.names(rowData(cds))
model GLM model list returned by speedglm
model_summary model summary list returned by summary(model)
status character vector of model fitting status: OK when the model converged, otherwise FAIL
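The sketch below mimics the shape of such a result in base R: one row per gene, list columns holding the model and its summary, and a status column that can be used to drop failed fits. It uses stats::glm() rather than speedglm and a plain data.frame rather than a tibble. The helper fit_one and the simulated data are hypothetical, and the OK/FAIL logic shown is an assumption for illustration rather than monocle3's code.

```r
# Hypothetical helper: fit one GLM and record whether fitting succeeded.
fit_one <- function(expression, cell_meta, formula_str) {
  df <- cbind(cell_meta, expression = expression)
  tryCatch({
    fit <- glm(as.formula(paste("expression", formula_str)),
               data = df, family = quasipoisson())
    list(model = fit, model_summary = summary(fit),
         status = if (fit$converged) "OK" else "FAIL")
  }, error = function(e) {
    # Any fitting error is recorded as a failed model
    list(model = NA, model_summary = NA, status = "FAIL")
  })
}

set.seed(3)
cell_meta <- data.frame(day = rep(c(0, 1, 2), each = 10))
counts <- rbind(
  gene_a = rpois(30, lambda = exp(0.3 + 0.6 * cell_meta$day)),
  gene_b = rep(-1, 30)  # invalid negative counts, so glm() raises an error
)

results <- lapply(seq_len(nrow(counts)), function(i) {
  fit_one(counts[i, ], cell_meta, "~ day")
})

# Assemble a one-row-per-gene table with list columns for the fitted objects
fit_table <- data.frame(gene_id = rownames(counts))
fit_table$model         <- lapply(results, `[[`, "model")
fit_table$model_summary <- lapply(results, `[[`, "model_summary")
fit_table$status        <- vapply(results, `[[`, character(1), "status")

# Keep only the genes whose model fit succeeded
ok_fits <- fit_table[fit_table$status == "OK", ]
```

Filtering on status before reading coefficients avoids trying to extract estimates from models that never fit.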