Learn R Programming

qtlnet (version 1.5.4)

bic.qtlnet: Pre-compute BIC values for qtlnet sampling.

Description

Pre-compute BIC values for qtlnet sampling to speed up MCMC sampling.

Usage

bic.qtlnet(cross, pheno.col, threshold, addcov = NULL, intcov = NULL,
  max.parents = 3, parents, verbose = TRUE, …)
bic.join(cross, pheno.col, …, max.parents = 3)
data(Pscdbp.bic)

Arguments

cross

Object of class cross. See read.cross.

pheno.col

Phenotype identifiers from cross object. May be numeric, logical or character.

threshold

Scalar or list of thresholds, one per each node.

addcov

Additive covariates for each phenotype (NULL if not used). If entered as scalar or vector (same format as pheno.col), then the same addcov is used for all phenotypes. Altenatively, may be a list of additive covariate identifiers.

intcov

Interactive covariates, entered in the same manner as addcov.

max.parents

Maximum number of parents per node. This reduces the complexity of graphs and shortens run time. Probably best to consider values of 3-5.

parents

List containing all possible parents up to max.parents in size. May be a subset

verbose

Print iteration and number of models fit.

Additional arguments passed to internal routines. In the case of bic.join, these are a list of objects produced by bic.qtlnet (see example below).

Value

Matrix with columns:

code

Binary code as decimal for the parents of a phenotype node, excluding the phenotype. Value is between 0 (no parents) and 2 ^ (length(pheno.col) - 1).

pheno.col

Phenotype column in reduced set, in range 1:length(pheno.col).

bic

BIC score for phenotype conditional on parents (and covariates).

Details

The most expensive part of calculations is running scanone on each phenotype with parent phenotypes as covariates. One strategy is to pre-compute the BIC contributions using a cluster and save them for later use.

We divide the job into three steps: 1) determine parents and divide into reasonable sized groups; 2) compute BIC scores using scanone on a grid of computers; 3) compute multiple MCMC runs on a grid of computers. See the example for details.

References

Chaibub Neto E, Keller MP, Attie AD, Yandell BS (2010) Causal Graphical Models in Systems Genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Ann Appl Statist 4: 320-339. http://dx.doi.org/10.1214/09-AOAS288

See Also

mcmc.qtlnet, parents.qtlnet

Examples

Run this code
# NOT RUN {
pheno.col <- 1:13
max.parents <- 12
size.qtlnet(pheno.col, max.parents)

# }
# NOT RUN {
## Compute all phenotype/parent combinations.
## This shows how to break up into many smaller jobs.

#########################################
## STEP 1: Preparation. Fast. Needed in steps 2 and 3.

pheno.col <- 1:13
max.parents <- 12
threshold <- 3.83

## Load cross object. Here we use internal object.
data(Pscdbp)
## or: load("Pscdbp.RData")
cross <- Pscdbp
## or: cross <- read.cross("Pscdbp.csv", "csv")

## Break up into groups to run on several machines.
## ~53 groups of ~1000, for a total of 53248 scanone runs.
parents <- parents.qtlnet(pheno.col, max.parents)
groups <- group.qtlnet(parents = parents, group.size = 1000)

## Save all relevant objects for later steps.
save(cross, pheno.col, max.parents, threshold, parents, groups,
  file = "Step1.RData", compress = TRUE)

#########################################
## STEP 2: Compute BIC scores. Parallelize.

## NB: Configuration of parallelization determined using Step 1 results.
## Load Step 1 computations.
load("Step1.RData")

## Parallelize this:
for(i in seq(nrow(groups)))
{
  ## Pre-compute BIC scores for selected parents.
  bic <- bic.qtlnet(cross, pheno.col, threshold,
    max.parents = max.parents,
    parents = parents[seq(groups[i,1], groups[i,2])])

  save(bic, file = paste("bic", i, ".RData", sep = ""), compress = TRUE)
}

#########################################
## STEP 3: Sample Markov chain (MCMC). Parallelize.

## NB: n.runs sets the number of parallel runs.
n.runs <- 100

## Load Step 1 computations.
load("Step1.RData")

## Read in saved BIC scores and combine into one object.
bic.group <- list()
for(i in seq(nrow(groups)))
{
  load(paste("bic", i, ".RData", sep = ""))
  bic.group[[i]] <- bic
}
saved.scores <- bic.join(cross, pheno.col, bic.group)


## Parallelize this:
for(i in seq(n.runs))
{
  ## Run MCMC with randomized initial network.
  mcmc <- mcmc.qtlnet(cross, pheno.col, threshold = threshold,
    max.parents = max.parents, saved.scores = saved.scores, init.edges = NULL)

  save(mcmc, file = paste("mcmc", i, ".RData", sep = ""), compress = TRUE)
}

#########################################
## STEP 4: Combine results for post-processing.

## NB: n.runs needed from Step 3.
n.runs <- 100

## Combine outputs together.
outs.qtlnet <- list()
for(i in seq(n.runs))
{
  load(paste("mcmc", i, ".RData", sep = ""))
  outs.qtlnet[[i]] <- mcmc
}
out.qtlnet <- c.qtlnet(outs.qtlnet)
summary(out.qtlnet)
print(out.qtlnet)

## End of parallel example.
#########################################
# }
# NOT RUN {
dim(Pscdbp.bic)
# }

Run the code above in your browser using DataLab