Usage
ballgownrsem(dir = "", samples, gtf, UCSC = TRUE, tfield = "transcript_id", attrsep = "; ", bamout = "transcript", pData = NULL, verbose = TRUE, meas = "all", zipped = FALSE)
Arguments
dir
output directory containing RSEM output for all samples (i.e. for
each run of rsem-calculate-expression)
samples
vector of sample names (i.e., of the sample_name
arguments used in each RSEM run)
gtf
path to the GTF file of genes/transcripts used in your RSEM reference
(the reference whose location was given by the reference_name argument
to rsem-calculate-expression). RSEM references can be created with or
without a GTF file, but the ballgown reader currently requires the GTF
file.
UCSC
set to TRUE if gtf comes from UCSC: quotes will be stripped from
transcript identifiers if so.
tfield
What keyword identifies transcripts in the "attributes" field of gtf?
Default 'transcript_id'.
attrsep
How are attributes separated in the "attributes" field of gtf?
Default '; ' (semicolon-space).
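To make the roles of tfield, attrsep, and UCSC concrete, here is a minimal base-R sketch of pulling a transcript identifier out of one GTF "attributes" field. The helper extract_attr and the attribute string are hypothetical and exist only for illustration; this is not ballgown's internal parsing code.

```r
# One GTF "attributes" field (column 9); the values are made up.
attributes <- 'gene_id "G1"; transcript_id "T1.1"; gene_name "ABC"'

# Hypothetical helper, not part of ballgown.
extract_attr <- function(attributes, tfield = "transcript_id",
                         attrsep = "; ", UCSC = TRUE) {
  # Split the field into "key value" pairs on the literal separator.
  pairs <- strsplit(attributes, attrsep, fixed = TRUE)[[1]]
  # Keep the pair whose key is tfield.
  hit <- pairs[startsWith(pairs, paste0(tfield, " "))]
  if (length(hit) == 0) return(NA_character_)
  # Drop the key, the following space, and any trailing semicolon.
  value <- sub(";$", "", substring(hit[1], nchar(tfield) + 2))
  # Mirror the UCSC option: strip the quotes around the identifier.
  if (UCSC) value <- gsub('"', "", value, fixed = TRUE)
  value
}

transcript <- extract_attr(attributes)
```

If your GTF separates attributes with a bare semicolon, or names transcripts with a different keyword, the same idea applies with attrsep and tfield changed accordingly.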
bamout
set to 'genome' if --output-genome-bam was used when running
rsem-calculate-expression; set to 'none' if --no-bam-output was used
when running rsem-calculate-expression; otherwise use the default
('transcript').
pData
data frame of phenotype data, with rows corresponding to samples. The
first column of pData must be equal to samples, and rows must be in the
same order as samples.
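A short base-R sketch of a pData that satisfies these constraints follows. The sample names and the group column are invented for illustration; only the requirement that the first column equal samples, in the same order, comes from the description above.

```r
# Hypothetical sample names; use the sample_name values from your RSEM runs.
samples <- c("sampleA", "sampleB", "sampleC")

# The first column repeats samples in the same order; the remaining columns
# hold whatever phenotype variables you have (here a made-up group label).
pData <- data.frame(
  sample = samples,
  group = c("control", "control", "treated"),
  stringsAsFactors = FALSE
)

# Check the layout before passing pData to the reader.
pdata_ok <- identical(as.character(pData[[1]]), samples)
stopifnot(pdata_ok)
```

Running a check like this first catches a reordered or mislabeled phenotype table before any data is loaded.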
verbose
If TRUE (as by default), status messages are printed during
data loading.
meas
character vector containing either "all" or one of "FPKM" or "TPM".
The resulting ballgown object will contain only the specified
expression measurement for the transcripts. "all" creates the full
object.
zipped
set to TRUE if all RSEM results files have been gzipped (end in
".gz").
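Putting the arguments together, a call might look like the sketch below. The directory, GTF path, and sample names are hypothetical placeholders, and the call is guarded so that it only runs when the ballgown package and those files are actually present.

```r
# Hypothetical locations and names; replace with your own RSEM output.
rsem_dir <- "rsem_output"
gtf_path <- "annotation/genes.gtf"
samples <- c("sampleA", "sampleB")

# Arguments as documented above: RSEM was assumed to be run with
# --no-bam-output, the result files are not gzipped, and only TPM is kept.
call_args <- list(dir = rsem_dir, samples = samples, gtf = gtf_path,
                  UCSC = TRUE, bamout = "none", meas = "TPM",
                  zipped = FALSE)

if (requireNamespace("ballgown", quietly = TRUE) &&
    dir.exists(rsem_dir) && file.exists(gtf_path)) {
  bg <- do.call(ballgown::ballgownrsem, call_args)
  print(bg)
} else {
  message("ballgown or the example files are not available; skipping the call.")
}
```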