ballgownrsem: load RSEM data into a ballgown object

Description

Loads the results of rsem-calculate-expression into a ballgown object for easy visualization, processing, and statistical testing.

Usage

ballgownrsem(dir = "", samples, gtf, UCSC = TRUE, tfield = "transcript_id", attrsep = "; ", bamout = "transcript", pData = NULL, verbose = TRUE, meas = "all", zipped = FALSE)

Arguments

dir

output directory containing RSEM output for all samples (i.e. for each run of rsem-calculate-expression)

samples

vector of sample names (i.e., of the sample_name arguments used in each RSEM run)

gtf

path to the GTF file of genes/transcripts used in your RSEM reference (the reference whose location was given by the reference_name argument of rsem-calculate-expression). RSEM references can be created with or without a GTF file, but the ballgown reader currently requires one.

UCSC

set to TRUE if gtf comes from UCSC; if so, quotes will be stripped from transcript identifiers.

tfield

What keyword identifies transcripts in the "attributes" field of gtf? Default 'transcript_id'.

attrsep

How are attributes separated in the "attributes" field of gtf? Default '; ' (semicolon-space).
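To illustrate how tfield and attrsep relate to the "attributes" column of a GTF file, here is a minimal base-R sketch. The helper extract_attr and its strip_quotes argument are hypothetical names invented for this illustration, and the attribute strings are made up; this is not ballgown's internal parsing code, only one plausible way to read such a field.

```r
# Hypothetical helper (not part of ballgown): pull the value of one keyword
# out of GTF "attributes" strings.
extract_attr <- function(attributes, tfield = "transcript_id",
                         attrsep = "; ", strip_quotes = TRUE) {
  vapply(attributes, function(a) {
    # Split the attributes string into "keyword value" pieces
    parts <- strsplit(a, attrsep, fixed = TRUE)[[1]]
    # Keep the piece that starts with the requested keyword
    hit <- parts[startsWith(parts, paste0(tfield, " "))]
    if (length(hit) == 0) return(NA_character_)
    # Drop the keyword and the space that follows it
    value <- substring(hit[1], nchar(tfield) + 2)
    # Drop a trailing semicolon left over when the line ends in ";"
    value <- sub(";$", "", value)
    # Optionally strip the double quotes around the identifier
    if (strip_quotes) value <- gsub('"', "", value, fixed = TRUE)
    value
  }, character(1), USE.NAMES = FALSE)
}

# Made-up attribute strings in the usual 'keyword "value"; ' layout
attrs <- c('gene_id "XLOC_1"; transcript_id "NM_001"; ',
           'gene_id "XLOC_2"; transcript_id "NM_002";')
extract_attr(attrs)
```

If your GTF uses a different keyword or separator, the same idea applies with tfield and attrsep changed accordingly.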

bamout

set to 'genome' if --output-genome-bam was used when running rsem-calculate-expression; set to 'none' if --no-bam-output was used when running rsem-calculate-expression; otherwise use the default ('transcript').

pData

data frame of phenotype data, with rows corresponding to samples. The first column of pData must be equal to samples, and rows must be in the same order as samples.
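As a small sketch of a data frame that satisfies these requirements, the code below builds one row per sample with the sample names in the first column, in the same order as samples. The column names id and group and the phenotype values are made up for illustration.

```r
# Sample names, in the order they will be passed as the samples argument
samples <- c("tiny", "tiny2")

# First column repeats samples exactly; remaining columns hold phenotype data
pheno <- data.frame(
  id = samples,
  group = c("control", "treated"),  # made-up phenotype values
  stringsAsFactors = FALSE
)

# Sanity check before loading: first column must match samples, in order
stopifnot(identical(as.character(pheno[[1]]), samples))
```

A data frame built this way could then be supplied as pData = pheno in the call to ballgownrsem.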

verbose

If TRUE (as by default), status messages are printed during data loading.

meas

character vector containing one of "all", "FPKM", or "TPM". With "FPKM" or "TPM", the resulting ballgown object will contain only that expression measurement for the transcripts. "all" creates the full object.

zipped

set to TRUE if all RSEM results files have been gzipped (end in ".gz").
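Before loading, it can help to confirm that the results files are consistently zipped or unzipped. The sketch below builds the paths one would expect for each sample. The helper expected_rsem_files and the directory "rsem_out" are hypothetical, and the sketch assumes the per-sample file is named sample_name.isoforms.results (the isoform-level file RSEM writes), with ".gz" appended when gzipped; which files ballgownrsem actually reads is not stated here.

```r
# Hypothetical helper (not part of ballgown or RSEM): build the expected
# isoform-level results path for each sample.
expected_rsem_files <- function(dir, samples, zipped = FALSE) {
  # Assumed naming: <sample>.isoforms.results, plus ".gz" if gzipped
  suffix <- if (zipped) ".isoforms.results.gz" else ".isoforms.results"
  file.path(dir, paste0(samples, suffix))
}

# "rsem_out" is a placeholder directory name
paths <- expected_rsem_files("rsem_out", c("tiny", "tiny2"), zipped = TRUE)
paths

# TRUE for every sample only if all files are present under the assumed names
file.exists(paths)
```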

Value

a ballgown object with the specified expression measurements and structure specified by GTF.

Details

Currently exon- and intron-level measurements are not available for RSEM-generated ballgown objects, but development is ongoing.

Examples

dataDir = system.file('extdata', package='ballgown')
gtf = file.path(dataDir, 'hg19_genes_small.gtf.gz')
rsemobj = ballgownrsem(dir=dataDir, samples=c('tiny', 'tiny2'), gtf=gtf,
    bamout='none', zipped=TRUE)
rsemobj