AlignmentsTrack-class: AlignmentsTrack class and methods

Description

A class to represent short sequences that have been aligned to a reference genome as they are typically generated in next generation sequencing experiments.

Usage

AlignmentsTrack(range=NULL, start=NULL, end=NULL, width=NULL, strand, chromosome, genome,
                stacking="squish", id, cigar, mapq, flag, isize, groupid, status, md, seqs,
                name="AlignmentsTrack", isPaired=TRUE, importFunction, referenceSequence, ...)

Arguments

range

An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below).

The different input options for range are:

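A character string: the path to a BAM file containing the read alignments.

A GRanges object: the genomic ranges of the individual reads, optionally with the read metadata as additional columns.

An IRanges object: as for GRanges, except that the chromosome, the strand and all read metadata have to be provided through the separate function arguments.

A data.frame object: needs to contain at least the columns start and end with the range coordinates.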

start, end, width

Integer vectors, giving the start and the end coordinates for the individual track items, or their width. Two of the three need to be specified, and have to be of equal length or of length one, in which case this single value will be recycled. Otherwise, the usual R recycling rules for vectors do not apply here.
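The length rule described above can be illustrated with a small base-R sketch. The helper recycleToLength is hypothetical and only mirrors the rule as stated here; it is not the code Gviz uses internally.

```r
# Hypothetical helper: accept a vector only if it has the target length
# or length one (in which case the single value is repeated).
recycleToLength <- function(x, n, what = "argument") {
    if (length(x) == n) {
        return(x)
    }
    if (length(x) == 1L) {
        return(rep(x, n))
    }
    stop(sprintf("'%s' must have length 1 or %d, not %d", what, n, length(x)))
}

start <- c(100L, 250L, 400L)
# A single width is recycled to all three reads
width <- recycleToLength(50L, length(start), "width")
end <- start + width - 1L

# A length-2 vector is rejected instead of being recycled the usual R way
res <- tryCatch(recycleToLength(c(50L, 60L), length(start), "width"),
                error = function(e) conditionMessage(e))
```

Here start and width are the two specified values, and end follows from them.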

id

Character vector of read identifiers. These identifiers have to be unique, i.e., each range representing a read needs to have a unique id.

cigar

A character vector of valid CIGAR strings describing details of the alignment. Typically those include alignment gaps or insertions and deletions, but also hard and soft clipped read regions. If missing, a fully mapped read without gaps or indels is assumed. Needs to be of equal length as the provided genomic coordinates, or of length 1.
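To make the CIGAR encoding more concrete, the following base-R sketch computes how many reference positions a read spans according to its CIGAR string. The helper cigarRefWidth is hypothetical and written for illustration only; it is not part of Gviz. For real analyses the Bioconductor package GenomicAlignments provides cigarWidthAlongReferenceSpace for the same purpose.

```r
# Hypothetical helper: reference width implied by a CIGAR string.
# The operations M, D, N, = and X consume reference positions
# (SAM specification); I, S, H and P do not.
cigarRefWidth <- function(cigar) {
    vapply(cigar, function(cg) {
        # Split the string into <length><operation> tokens
        ops <- regmatches(cg, gregexpr("[0-9]+[MIDNSHP=X]", cg))[[1]]
        len <- as.integer(sub("[MIDNSHP=X]$", "", ops))
        op <- sub("^[0-9]+", "", ops)
        sum(len[op %in% c("M", "D", "N", "=", "X")])
    }, integer(1), USE.NAMES = FALSE)
}

# An ungapped read, a read with an insertion and a deletion,
# and a soft-clipped read
cigarRefWidth(c("10M", "5M2I3D4M", "3S7M"))
```

The insertion (2I) and the soft clip (3S) do not add to the width on the reference, whereas the deletion (3D) does.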

mapq

A numeric vector of read mapping qualities. Needs to be of equal length as the provided genomic coordinates, or of length 1.

flag

A numeric vector of flag values. Needs to be of equal length as the provided genomic coordinates, or of length 1. Currently not used.

isize

A numeric vector of empirical insert sizes. This only applies if the reads are paired. Needs to be of equal length as the provided genomic coordinates, or of length 1. Currently not used.

groupid

A factor (or vector that can be coerced into one) defining the read pairs. Reads with the same groupid are considered to be mates. Please note that each read group may only have one or two members. Needs to be of equal length as the provided genomic coordinates, or of length 1.
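The restriction on group sizes can be checked before constructing a track. The function checkGroupSizes below is a hypothetical base-R sketch of such a check and makes no claim about how Gviz validates its input.

```r
# Hypothetical check: every read group may contain one or two reads only.
checkGroupSizes <- function(groupid) {
    sizes <- table(factor(groupid))
    bad <- names(sizes)[sizes > 2L]
    if (length(bad) > 0L) {
        stop("groups with more than two reads: ", paste(bad, collapse = ", "))
    }
    invisible(TRUE)
}

# Two proper pairs and one unpaired read pass the check
checkGroupSizes(c("pairA", "pairA", "single1", "pairB", "pairB"))
```

A vector such as c("pairA", "pairA", "pairA") would stop with an error naming the offending group.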

status

A factor describing the mapping status of a read. Has to be one of mated, unmated or ambiguous. Needs to be of equal length as the provided genomic coordinates, or of length 1.

md

A character vector describing the mapping details. This is effectively an alternative to the CIGAR encoding and it removes the dependency on a reference sequence to figure out read mismatches. Needs to be of equal length as the provided genomic coordinates, or of length 1. Currently not used.

seqs

DNAStringSet of read sequences.

strand

Character vector, the strand information for the reads. It may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two. Needs to be of equal length as the provided genomic coordinates, or of length 1. Please note that paired reads need to be on opposite strands, and erroneous entries will raise an error.

chromosome

The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to construct the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome.
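As a rough illustration of the purely syntactic check described above, the following base-R sketch accepts integers, numeric strings and strings starting with chr. The function looksLikeUcscChromosome is hypothetical and only approximates the stated rule; the actual validation in Gviz may differ in detail.

```r
# Hypothetical approximation of the syntactic check:
# accept integers, purely numeric strings, or strings starting with "chr".
looksLikeUcscChromosome <- function(x) {
    x <- as.character(x)
    grepl("^[0-9]+$", x) | grepl("^chr", x)
}

# "X" fails because it is neither numeric nor prefixed with "chr"
looksLikeUcscChromosome(c("chr12", "7", "chrUn_gl000220", "X"))
```

A check of this kind says nothing about whether the chromosome exists in the track's genome, which remains the user's responsibility.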

genome

The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided input, and eventually will fall back to the default value of NA.

stacking

The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full). Currently, only squish (make best use of the available space), dense (no stacking, collapse overlapping ranges), and hide (do not show any track items at all) are implemented.

name

Character scalar of the track's name used in the title panel when plotting.

isPaired

A logical scalar to determine whether the reads are paired or not. While this may be used to render paired-end data as single-end, the opposite will typically not have any effect because the appropriate groupid settings will not be present. Thus setting isPaired to TRUE can usually be used to autodetect the pairing state of the input data.

importFunction

A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument x containing the file path and a second argument selection with the desired plotting ranges. It has to return a proper GRanges object with all the necessary metadata columns set. A single default import function is already implemented in the package for BAM files.

referenceSequence

An optional SequenceTrack object containing the reference sequence against which the reads have been aligned. This is only needed when mismatch information has to be added to the plot (i.e., the showMismatches display parameter is TRUE) because this is normally not encoded in the BAM file. If not provided through this argument, the plotTracks function will detect the presence of a SequenceTrack object in the track list and use that as a reference sequence.

...

Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details.

Value

The return value of the constructor function is a new object of class AlignmentsTrack or ReferenceAlignmentsTrack.

Objects from the Class

Objects can be created using the constructor function AlignmentsTrack.

Extends

Class "StackedTrack", directly. Class "RangeTrack", by class "StackedTrack", distance 2. Class "GdObject", by class "StackedTrack", distance 3.

Details

AlignmentsTracks usually have two sections: the coverage section on top showing a histogram of the read coverage, and the pile-up section below with the individual reads. Both can be toggled on or off using the type display parameter. If a reference sequence has been provided, either during object instantiation or with the track list in the call to plotTracks, sequence mismatch information will be shown in both sections: as a stacked histogram in the coverage plot and as colored boxes or characters (depending on available space) in the pile-ups.

Examples


## Creating objects
library(Gviz)
afrom <- 2960000
ato <- 3160000
alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
                                       "gapped.bam"), isPaired=TRUE)
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12")

## Omit the coverage or the pile-ups part
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12",
           type="coverage")
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12",
           type="pileup")

## Including sequence information with the constructor
if(require(BSgenome.Hsapiens.UCSC.hg19)){
    strack <- SequenceTrack(Hsapiens, chromosome="chr21")
    afrom <- 44945200
    ato <- 44947200
    alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
                                           "snps.bam"), isPaired=TRUE,
                               referenceSequence=strack)
    plotTracks(alTrack, chromosome="chr21", from=afrom, to=ato)

    ## Including sequence information in the track list
    alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
                                           "snps.bam"), isPaired=TRUE)
    plotTracks(c(alTrack, strack), chromosome="chr21", from=44946590,
               to=44946660)
}
