
negenes (version 1.0-12)

negenes: Estimate the number of essential genes in a genome

Description

Estimate, via a Gibbs sampler, the posterior distribution of the number of essential genes in a genome, using data from a random transposon mutagenesis experiment. (See the technical report cited below.)

Usage

negenes(n.sites, counts, n.sites2 = NULL, counts2 = NULL,
  n.mcmc = 5000, skip = 49, burnin = 500, startp = 1,
  trace = TRUE, calc.prob = FALSE, return.output = FALSE)

Arguments

n.sites

A vector specifying the number of transposon insertion sites in each gene (alone). All elements must be strictly positive.

counts

A vector specifying the number of mutants observed for each gene (alone). Must be the same length as n.sites, and all elements must be non-negative integers.

n.sites2

A vector specifying the number of transposon insertion sites shared by adjacent genes. The ith element is the number of insertion sites shared by genes i and i+1. The last element is for sites shared by genes N and 1. If NULL, all are assumed to be 0.

counts2

A vector specifying the number of mutants shared by adjacent genes (analogous to n.sites2). The ith element is the number of mutants at sites shared by genes i and i+1. The last element is for sites shared by genes N and 1. If NULL, all are assumed to be 0.

n.mcmc

Number of Gibbs steps to perform.

skip

An integer; only every (skip + 1)th step is saved.

burnin

Number of initial Gibbs steps to run (output discarded).

startp

Initial proportion of genes for which no mutant was observed that will be assumed essential for the Gibbs sampler. (Genes for which a mutant was observed are assumed non-essential; other genes are assumed essential, independently, with this probability.)

trace

If TRUE, print the iteration number occasionally.

calc.prob

If TRUE, return the log posterior probability (up to an additive constant) for each saved iteration.

return.output

If TRUE, include detailed Gibbs results in the output.
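
The input layout and the starting state described above can be sketched in base R. This is a minimal illustration using a made-up five-gene circular genome: the data values, the helper names observed and init_state, the coding 1 = essential, and the treatment of a shared-site mutant as evidence that both adjacent genes are non-essential are all assumptions for illustration, not the package's internal implementation.

```r
# Hypothetical toy genome with 5 genes arranged in a circle.
n.sites <- c(12, 7, 20, 3, 9)   # insertion sites in each gene alone
counts  <- c(2, 0, 5, 0, 0)     # mutants observed in each gene alone

# Element i refers to sites shared by genes i and i+1;
# the last element refers to sites shared by gene 5 and gene 1.
n.sites2 <- c(1, 0, 2, 0, 1)
counts2  <- c(0, 0, 1, 0, 0)

# Basic checks matching the documented requirements.
stopifnot(length(counts) == length(n.sites),
          all(n.sites > 0),
          all(counts >= 0), all(counts == round(counts)))

# Assumption: a gene counts as "observed" if a mutant was seen in the gene
# alone or at a site it shares with either neighbour.
N <- length(n.sites)
shared.prev <- c(counts2[N], counts2[-N])   # mutants shared with the previous gene
observed <- counts > 0 | counts2 > 0 | shared.prev > 0

# Starting state (assumed coding: 1 = essential, 0 = non-essential).
# Observed genes start non-essential; each remaining gene starts
# essential independently with probability startp.
startp <- 1
init_state <- integer(N)
init_state[!observed] <- rbinom(sum(!observed), size = 1, prob = startp)
```

With the default startp = 1, every gene without an observed mutant starts as essential, so in this toy example init_state is 0 1 0 0 1.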

Value

A list with components n.essential (the total number of essential genes at each iteration of the Gibbs sampler) and summary (a vector containing the estimated mean, SD, 2.5th percentile, and 97.5th percentile of the posterior distribution of the number of essential genes).

The next component, geneprob, is a vector with one element for each gene, containing the estimated posterior probability that each gene is essential. These are Rao-Blackwellized estimates.
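
To illustrate how the four entries of summary relate to n.essential, the following sketch computes the same kinds of quantities from a made-up vector of draws. The vector draws is simulated placeholder data rather than sampler output, and the exact calculation used by the package (for example, the quantile type) is an assumption.

```r
set.seed(1)

# Hypothetical stand-in for output$n.essential: 100 saved draws.
draws <- rpois(100, lambda = 35)

# Posterior mean, SD, and the 2.5th and 97.5th percentiles of the draws.
post.summary <- c(mean = mean(draws),
                  sd   = sd(draws),
                  quantile(draws, probs = c(0.025, 0.975)))
print(post.summary)
```

The two percentiles together give an approximate 95% posterior interval for the number of essential genes.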

If the argument calc.prob was TRUE, there will also be a component logprob containing the log (base e) of the posterior probability (up to an additive constant) at each saved Gibbs step.

If the argument return.output was TRUE, there will also be a matrix with n.mcmc / (skip + 1) rows (corresponding to the saved Gibbs steps) and a column for each gene. The entries in the matrix are either 0 (essential gene) or 1 (non-essential gene) according to the state of that gene at that step in the Gibbs sampler.
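
The row count n.mcmc / (skip + 1) can be checked with a little arithmetic. The sketch below uses the default argument values. It assumes that the burn-in steps are run in addition to the n.mcmc steps and that the saved steps are evenly spaced; the documentation above does not state either point, and saved.steps is a hypothetical name.

```r
# Default values from the Usage section.
n.mcmc <- 5000
skip   <- 49
burnin <- 500

# Number of saved draws (rows of the detailed output matrix).
n.saved <- n.mcmc / (skip + 1)

# Assumption: every (skip + 1)th post-burn-in step is the one kept.
saved.steps <- seq(skip + 1, n.mcmc, by = skip + 1)

# Assumption: burn-in is run in addition to the n.mcmc steps.
total.steps <- burnin + n.mcmc
```

With the defaults, 100 draws are saved. Choosing n.mcmc as a multiple of skip + 1 keeps the number of saved draws a whole number.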

References

  • Blades, N. J. and Broman, K. W. (2002) Estimating the number of essential genes in a genome by random transposon mutagenesis. Technical Report MS02-20, Department of Biostatistics, Johns Hopkins University, Baltimore, MD. https://www.biostat.wisc.edu/~kbroman/publications/ms0220.pdf

  • Lamichhane et al. (2003) A post-genomic method for predicting essential genes at subsaturation levels of mutagenesis: application to Mycobacterium tuberculosis. Proc Natl Acad Sci USA 100:7213-7218. doi:10.1073/pnas.1231432100

See Also

negenes::sim.mutants(), negenes::Mtb80

Examples

data(Mtb80)

# simulate 44% of genes to be essential
essential <- rep(0, nrow(Mtb80))
essential[sample(1:nrow(Mtb80), ceiling(nrow(Mtb80) * 0.44))] <- 1

# simulate 759 mutants
counts <- sim.mutants(Mtb80[, 1], essential, Mtb80[, 2], 759)

# run the Gibbs sampler without returning detailed output
## Not run: 
output <- negenes(Mtb80[, 1], counts[, 1], Mtb80[, 2], counts[, 2])
## End(Not run)

# run the Gibbs sampler, returning the detailed output
## Not run: 
output2 <- negenes(Mtb80[, 1], counts[, 1], Mtb80[, 2], counts[, 2],
                   return.output = TRUE)
## End(Not run)
