FitKineticsGeneNtr: Fit a kinetic model using the degradation rate transformed NTR posterior distribution.

Description

Fit the standard mass action kinetics model of gene expression by maximum a posteriori on a model based on the NTR posterior. The fit takes only the NTRs into account and is completely independent on normalization, but it cannot be performed without assuming steady state. The parameters are fit per Condition.

Usage

FitKineticsGeneNtr(
  data,
  gene,
  slot = DefaultSlot(data),
  time = Design$dur.4sU,
  CI.size = 0.95,
  transformed.NTR.MAP = TRUE,
  exact.ci = FALSE,
  total.fun = median
)

Value

A named list containing the model fit:

data: a data frame containing the observed value used for fitting
Synthesis: the synthesis rate (in U/h, where U is the unit of the slot)
Degradation: the degradation rate (in 1/h)
Half-life: the RNA half-life (in h, always equal to log(2)/degradation-rate
conf.lower: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life
conf.upper: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life
f0: The abundance at time 0 (in U)
logLik: the log likelihood of the model
rmse: the total root mean square error
total: the total sum of all new and old RNA values used for fitting
type: always "ntr"

If Condition(data) is not NULL, the return value is a named list (named according to the levels of Condition(data)), each element containing such a structure.

Arguments

data: A grandR object
gene: The gene for which to fit the model
slot: The data slot to take expression values from
time: The column in the column annotation table representing the labeling duration
CI.size: A number between 0 and 1 representing the size of the credible interval
transformed.NTR.MAP: Use the transformed NTR MAP estimator instead of the MAP of the transformed posterior
exact.ci: compute exact credible intervals (see details)
total.fun: use this function to summarize the expression values (only relevant for computing the synthesis rate s)

Details

The start of labeling for all samples should be the same experimental time point. The fit gets more precise with multiple samples from multiple labeling durations.

The standard mass action kinetics model of gene expression arises from the following differential equation:

$$df/dt = s - d f(t)$$

This model assumes constant synthesis and degradation rates. Further assuming steady state allows to derive the function transforming from the NTR to the degradation rate d as $d(ntr)=-1/t log(1-ntr)$. Furthermore, if the ntr is (approximately) beta distributed, it is possible to derive the distribution of the transformed random variable for the degradation rate (see Juerges et al., Bioinformatics 2018).

This function primarily finds d by maximizing the degradation rate posterior distribution. For that, data does not have to be normalized, but this only works under steady-state conditions. The synthesis rate is then computed (under the assumption of steady state) as $s=f0 \cdot d$

The maximum-a-posteriori estimator is biased. Bias can be removed by a correction factor (which is done by default).

By default the chi-squared approximation of the log-posterior function is used to compute credible intervals. If exact.ci is used, the posterior is integrated numerically.

Examples

Run this code

sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                  design=c("Condition",Design$dur.4sU,Design$Replicate))
sars <- Normalize(sars)
sars <- subset(sars,columns=Condition=="Mock")
FitKineticsGeneNtr(sars,"SRSF6")