GeneRegionTrack-class: GeneRegionTrack class and methods

Description

A class to hold gene model data for a genomic region.

Usage

GeneRegionTrack(range=NULL, rstarts=NULL, rends=NULL, rwidths=NULL,
               strand, feature, exon, transcript, gene, symbol,
               chromosome, genome, stacking="squish",
               name="GeneRegionTrack", start=NULL, end=NULL,
               importFunction, stream=FALSE, ...)

Arguments

range

An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below).

The different input options for range are:

[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

start, end

An integer scalar with the genomic start or end coordinate for the gene model range. If those are missing, the default value will automatically be the smallest (or largest) value, respectively in rstarts and rends for the currently active chromosome. When building a GeneRegionTrack from a TxDb object, these arguments can be used to subset the desired annotation data by genomic coordinates. Please note this in that case the chromosome parameter must also be set.

rstarts

An integer vector of the start coordinates for the actual gene model items, i.e., for the individual exons. The relationship between exons is handled via the gene and transcript factors. Alternatively, this can be a vector of comma-separated lists of integer coordinates, one vector item for each transcript, and each comma-separated element being the start location of a single exon within that transcript. Those lists will be exploded upon object instantiation and all other annotation arguments will be recycled accordingly to regenerate the exon/transcript/gene relationship structure. This implies the approriate number of items in all annotation and coordinates arguments.

rends

An integer vector of the end coordinates for the actual gene model items. Both rstarts and rends have to be of equal length.

rwidths

An integer vector of widths for the actual gene model items. This can be used instead of either rstarts or rends to specify the range coordinates.

feature

Factor (or other vector that can be coerced into one), giving the feature types for the individual track exons. When plotting the track to the device, if a display parameter with the same name as the value of feature is set, this will be used as the track item's fill color. Additionally, the feature type defines whether an element in the GeneRegionTrack is considered to be coding or non-coding. The details section as well as the section about the thinBoxFeature display parameter further below has more information on this. See also grouping for details.

exon

Character vector of exon identifiers. It's values will be used as the identifier tag when plotting to the device if the display parameter showExonId=TRUE.

strand

Character vector, the strand information for the individual track exons. It may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two. Please note that all items within a single gene or transcript model need to be on the same strand, and erroneous entries will result in casting of an error.

transcript

Factor (or other vector that can be coerced into one), giving the transcript memberships for the individual track exons. All items with the same transcript identifier will be visually connected when plotting to the device. See grouping for details. Will be used as labels when showId=TRUE, and geneSymbol=FALSE.

gene

Factor (or other vector that can be coerced into one), giving the gene memberships for the individual track exons.

symbol

A factor with human-readable gene name aliases which will be used as labels when showId=TRUE, and geneSymbol=TRUE.

chromosome

The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the the track's genome. If not provided here, the constructor will try to build the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome. When creating a GeneRegionTrack from a TxDb object, the value of this parameter can be used to subset the data to fetch only transcripts from a single chromosome.

genome

The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided inputs, and eventually will fall back to the default value of NA.

stacking

The stacking type for overlapping items of the track. One in c(hide, dense, squish, pack,full). Currently, only hide (don't show the track items, squish (make best use of the available space) and dense (no stacking at all) are implemented.

name

Character scalar of the track's name used in the title panel when plotting.

importFunction

A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument x containing the file path and has to return a proper GRanges object with all the necessary metadata columns set. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension can not be mapped to any of the existing import function, an error is raised asking for a user-defined import function via this argument. Currently the following file types can be imported with the default functions: gff, gff1, gff2, gff3, gtf.

stream

A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection argument when accessing the data on disk. This causes the constructor to return a ReferenceGeneRegionTrack object which will grab the necessary data on the fly during each plotting operation.

...

Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details.

Value

The return value of the constructor function is a new object of class GeneRegionTrack.

Objects from the class

Objects can be created using the constructor function GeneRegionTrack.

Extends

Class "AnnotationTrack", directly.

Class "StackedTrack", by class "AnnotationTrack", distance2.

Class "RangeTrack", by class "AnnotationTrack", distance3.

Class "GdObject", by class "AnnotationTrack", distance4.

Details

A track containing all gene models in a particular region. The data are usually fetched dynamially from an online data store, but it is also possible to manully construct objects from local data. Connections to particular online data sources should be implemented as sub-classes, and GeneRegionTrack is just the commone denominator that is being used for plotting later on. There are several levels of data associated to a GeneRegionTrack:

[object Object],[object Object],[object Object],[object Object]

GeneRegionTrack objects also know about coding regions and non-coding regions (e.g., UTRs) in a transcript, and will indicate those by using different shapes (wide boxes for all coding regions, thinner boxes for non-coding regions). This is archived by setting the feature values of the object for non-coding elements to one of the options that are provided in the thinBoxFeature display parameters. All other elements are considered to be coding elements.

Examples

Run this code

## The empty object
GeneRegionTrack()

## Load some sample data
data(cyp2b10)

## Construct the object
grTrack <- GeneRegionTrack(start=26682683, end=26711643,
rstart=cyp2b10$start, rends=cyp2b10$end, chromosome=7, genome="mm9",
transcript=cyp2b10$transcript, gene=cyp2b10$gene, symbol=cyp2b10$symbol,
feature=cyp2b10$feature, exon=cyp2b10$exon,
name="Cyp2b10", strand=cyp2b10$strand)

## Directly from the data.frame
grTrack <- GeneRegionTrack(cyp2b10)

## From a TxDb object
if(require(GenomicFeatures)){
samplefile <- system.file("extdata", "hg19_knownGene_sample.sqlite", package="GenomicFeatures")
txdb <- loadDb(samplefile)
GeneRegionTrack(txdb)
GeneRegionTrack(txdb, chromosome="chr6", start=35000000, end=40000000)
}



## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(grTrack) <- list(fontfamily=font, fontfamily.title=font)
}

## Plotting
plotTracks(grTrack)

## Track names
names(grTrack)
names(grTrack) <- "foo"
plotTracks(grTrack)

## Subsetting and splitting
subTrack <- subset(grTrack, from=26700000, to=26705000)
length(subTrack)
subTrack <- grTrack[transcript(grTrack)=="ENSMUST00000144140"]
split(grTrack, transcript(grTrack))

## Accessors
start(grTrack)
end(grTrack)
width(grTrack)
position(grTrack)
width(subTrack) <- width(subTrack)+100

strand(grTrack)
strand(subTrack) <- "-"

chromosome(grTrack)
chromosome(subTrack) <- "chrX"

genome(grTrack)
genome(subTrack) <- "hg19"

range(grTrack)
ranges(grTrack)

## Annotation
identifier(grTrack)
identifier(grTrack, "lowest")
identifier(subTrack) <- "bar"

feature(grTrack)
feature(subTrack) <- "foo"

exon(grTrack)
exon(subTrack) <- letters[1:2]

gene(grTrack)
gene(subTrack) <- "bar"

symbol(grTrack)
symbol(subTrack) <- "foo"

transcript(grTrack)
transcript(subTrack) <- c("foo", "bar")
chromosome(subTrack) <- "chr7"
plotTracks(subTrack)

values(grTrack)

## Grouping
group(grTrack)
group(subTrack) <- "Group 1"
transcript(subTrack)
plotTracks(subTrack)

## Collapsing transcripts
plotTracks(grTrack, collapseTranscripts=TRUE, showId=TRUE,
extend.left=10000, shape="arrow")

## Stacking
stacking(grTrack)
stacking(grTrack) <- "dense"
plotTracks(grTrack)

## coercion
as(grTrack, "data.frame")
as(grTrack, "UCSCData")

## HTML image map
coords(grTrack)
tags(grTrack)
grTrack <- plotTracks(grTrack)$foo
coords(grTrack)
tags(grTrack)

Run the code above in your browser using DataLab