Learn R Programming

BASiCS (version 0.7.30)

BASiCS_MCMC: BASiCS MCMC sampler

Description

MCMC sampler to perform Bayesian inference for single-cell mRNA sequencing datasets using the model described in Vallejos et al (2015).

Usage

BASiCS_MCMC(Data, N, Thin, Burn, ...)

Arguments

Data
N

Total number of iterations for the MCMC sampler. Use N>=max(4,Thin), N being a multiple of Thin.

Thin

Thining period for the MCMC sampler. Use Thin>=2.

Burn

Burn-in period for the MCMC sampler. Use Burn>=1, Burn<N, Burn being a multiple of Thin.

...

Optional parameters.

PriorParam

List of 7 elements, containing the hyper-parameter values required for the adopted prior (see Vallejos et al, 2015). All elements must be positive real numbers.

s2.mu

Scale hyper-parameter for the log-Normal(0,s2.mu) prior that is shared by all gene-specific expression rate parameters \(\mu[i]\). Default: s2.mu = 0.5.

a.delta

Only used when `PriorDelta == 'gamma'`. Shape hyper-parameter for the Gamma(a.delta,b.delta) prior that is shared by all gene-specific biological cell-to-cell heterogeneity hyper-parameters \(\delta[i]\). Default: a.delta = 1.

b.delta

Only used when `PriorDelta == 'gamma'`. Rate hyper-parameter for the Gamma(a.delta,b.delta) prior that is shared by all gene-specific biological cell-to-cell heterogeneity hyper-parameters \(\delta[i]\). Default: b.delta = 1.

p.phi

Dirichlet hyper-parameter for the joint of all (scaled by n) cell-specific mRNA content normalising constants \(\phi[j] / n\). Default: p.phi = rep(1, n).

a.s

Shape hyper-parameter for the Gamma(a.s,b.s) prior that is shared by all cell-specific capture efficiency normalising constants \(s[j]\). Default: a.s = 1.

b.s

Rate hyper-parameter for the Gamma(a.s,b.s) prior that is shared by all cell-specific capture efficiency normalising constants \(s[j]\). Default: b.s = 1.

a.theta

Shape hyper-parameter for the Gamma(a.theta,b.theta) prior for technical noise hyper-parameter \(\theta\). Default: a.theta = 1.

b.theta

Rate hyper-parameter for the Gamma(a.theta,b.theta) prior for technical noise hyper-parameter \(\theta\). Default: b.theta = 1.

s2.delta

Only used when `PriorDelta == 'log-normal'`. Scale hyper-parameter for the log-Normal(0,s2.delta) prior that is shared by all gene-specific expression rate parameters \(\delta[i]\). Default: s2.delta = 0.5.

AR

Optimal acceptance rate for adaptive Metropolis Hastings updates. It must be a positive number between 0 and 1. Default (and recommended): ar = 0.44

.

StopAdapt

Iteration at which adaptive proposals are not longer adapted. Use StopAdapt>=1. Default: StopAdapt = Burn.

StoreChains

If StoreChains = T, the slots of the generated BASiCS_Chain object are stored in separate .txt files. Each row of the output file containing an interation (RunName argument used to index file names). Default: StoreChains = F.

StoreAdapt

If StoreAdapt = T, trajectory of adaptive proposal variances (in log-scale) for each parameter are stored in separate .txt files. Each row of the output file containing an interation (RunName argument used to index file names). Default: StoreAdapt = F.

StoreDir

Directory where output files are stored. Only required if StoreChains = TRUE and/or StoreAdapt = TRUE). Default: StoreDir = getwd().

RunName

String used to index `.txt` files storing chains and/or adaptive proposal variances.

PrintProgress

If PrintProgress = FALSE, console-based progress report is suppressed.

ls.phi0

Starting value for the adaptive concentration parameter of the Metropolis proposals for phi.

PriorDelta

Specifies the prior used for delta. Possible values are 'gamma' (Gamma(a.theta,b.theta) prior) and 'log-normal' (log-Normal(0,s2.delta) prior) .

Start

In general, we do not advise to specify this argument. Default options have been tuned to facilitate convergence. It can be used to set user defined starting points for the MCMC algorithm. If used, it must be a list containing the following elements: mu0, delta0, phi0, s0, nu0, theta0, ls.mu0, ls.delta0, ls.phi0, ls.nu0, ls.theta0

Value

An object of class BASiCS_Chain-class.

References

Vallejos, Marioni and Richardson (2015). Bayesian Analysis of Single-Cell Sequencing data. PloS Computational Biology.

Vallejos, Richardson and Marioni (2016). Beyond comparisons of means: understanding changes in gene expression at the single cell level. Genome Biology.

Examples

Run this code
# NOT RUN {
# Built-in simulated dataset
Data = makeExampleBASiCS_Data()

# Only a short run of the MCMC algorithm for illustration purposes
# Longer runs migth be required to reach convergence
MCMC_Output <- BASiCS_MCMC(Data, N = 10000, Thin = 10, Burn = 5000, PrintProgress = FALSE)
head(displayChainBASiCS(MCMC_Output, Param = "mu"))
head(displayChainBASiCS(MCMC_Output, Param = "delta"))
head(displayChainBASiCS(MCMC_Output, Param = "phi"))
head(displayChainBASiCS(MCMC_Output, Param = "s"))
head(displayChainBASiCS(MCMC_Output, Param = "nu"))
head(displayChainBASiCS(MCMC_Output, Param = "theta"))

# Traceplots
plot(MCMC_Output, Param = "mu", Gene = 1)
plot(MCMC_Output, Param = "delta", Gene = 1)
plot(MCMC_Output, Param = "phi", Cell = 1)
plot(MCMC_Output, Param = "s", Cell = 1)
plot(MCMC_Output, Param = "nu", Cell = 1)
plot(MCMC_Output, Param = "theta", Batch = 1)

# Calculating posterior medians and 95% HPD intervals
MCMC_Summary <- Summary(MCMC_Output)
head(displaySummaryBASiCS(MCMC_Summary, Param = "mu"))
head(displaySummaryBASiCS(MCMC_Summary, Param = "delta"))
head(displaySummaryBASiCS(MCMC_Summary, Param = "phi"))
head(displaySummaryBASiCS(MCMC_Summary, Param = "s"))
head(displaySummaryBASiCS(MCMC_Summary, Param = "nu"))
head(displaySummaryBASiCS(MCMC_Summary, Param = "theta"))

# Graphical display of posterior medians and 95% HPD intervals
plot(MCMC_Summary, Param = "mu", main = "All genes")
plot(MCMC_Summary, Param = "mu", Genes = 1:10, main = "First 10 genes")
plot(MCMC_Summary, Param = "delta", main = "All genes")
plot(MCMC_Summary, Param = "delta", Genes = c(2,5,10,50,100), main = "5 customized genes")
plot(MCMC_Summary, Param = "phi", main = "All cells")
plot(MCMC_Summary, Param = "phi", Cells = 1:5, main = "First 5 cells")
plot(MCMC_Summary, Param = "s", main = "All cells")
plot(MCMC_Summary, Param = "s", Cells = 1:5, main = "First 5 cells")
plot(MCMC_Summary, Param = "nu", main = "All cells")
plot(MCMC_Summary, Param = "nu", Cells = 1:5, main = "First 5 cells")
plot(MCMC_Summary, Param = "theta")

# To constrast posterior medians of cell-specific parameters
plot(MCMC_Summary, Param = "phi", Param2 = "s")
plot(MCMC_Summary, Param = "phi", Param2 = "nu")
plot(MCMC_Summary, Param = "s", Param2 = "nu")

# To constrast posterior medians of gene-specific parameters
plot(MCMC_Summary, Param = "mu", Param2 = "delta", log = "x")

# Highly and lowly variable genes detection
#DetectHVG <- BASiCS_DetectHVG(Data, MCMC_Output, VarThreshold = 0.70, Plot = TRUE)
#DetectLVG <- BASiCS_DetectLVG(Data, MCMC_Output, VarThreshold = 0.40, Plot = TRUE)

#plot(MCMC_Summary, Param = "mu", Param2 = "delta", log = "x", col = 8)
#points(DetectHVG$Table[DetectHVG$Table$HVG==1,2], DetectHVG$Table[DetectHVG$Table$HVG==1,3],
#       pch = 16, col = "red", cex = 1)
#points(DetectLVG$Table[DetectLVG$Table$LVG==1,2], DetectLVG$Table[DetectLVG$Table$LVG==1,3],
#       pch = 16, col = "blue", cex = 1)

# If variance thresholds are not fixed
#BASiCS_VarThresholdSearchHVG(Data, MCMC_Output, VarThresholdsGrid = seq(0.70,0.75,by=0.01))
#BASiCS_VarThresholdSearchLVG(Data, MCMC_Output, VarThresholdsGrid = seq(0.40,0.45,by=0.01))

# }

Run the code above in your browser using DataLab