hoardeR (version 0.10)

plotHit: Visualization of a cross-species hit

Description

For each cross-species hit the function plots the similarity within that area together with an optional annotation and coverage track.

Usage

plotHit(hits, flanking=1, window=NULL, annot=TRUE, coverage=FALSE,
          smoothPara=NULL, diagonal=0.25, verbose=TRUE, output=FALSE,
          hitSpecies=NULL, hitSpeciesAssembly=NULL, origSpecies=NULL,
          origSpeciesAssembly=NULL, fastaFolder=NULL, origAnnot=NULL,
          hitAnnot=NULL, nTick=5, which=NULL, figureFolder=NULL,
          figurePrefix=NULL, indexOffset=0, bamFolder=NULL, bamFiles=NULL,
          groupIndex=NULL, groupColor=NULL, countWindow=NULL)

Value

Optional, a table with intersection loci.

Arguments

hits

The hit object to be plotted.

flanking

Allowed flanking site in Mb.

window

Moving window size of similarity measure.

annot

Logical, add annotation track

coverage

Logical, add coverage track

smoothPara

Smoothing parameter for coverage

diagonal

Threshold for allowed diagonal similarity

verbose

Logical, whether the function gives status updates while running.

output

Logical, whether the numerical results are returned.

hitSpecies

Scientific identifier of the hit species.

hitSpeciesAssembly

Version of the hit species assembly

origSpecies

Scientific name of the original species

origSpeciesAssembly

Version of the original species assembly

fastaFolder

Location of the fasta files

origAnnot

Annotation object of the original species

hitAnnot

Annotation object of the hit species

nTick

Number of ticks on the annotation track

which

Which hits should be plotted

figureFolder

Folder where the figures should be stored

figurePrefix

Prefix of the figure filenames

indexOffset

Offset of the running index of the filenames

bamFolder

Folder with the bam-files

bamFiles

Filenames of the bam-files

groupIndex

Index of subgroups in the bamfiles

groupColor

Vector with colors, one for each subgroup

countWindow

Window size to count the reads from bam-files.

Author

Daniel Fischer

Details

This function is the workhorse of hoardeR and visualizes the findings of the blast and intersection runs. It handles hits flexibly and therefore has many options. The required options are hits, hitSpecies, origSpecies and fastaFolder.

The hit object is an object as provided by intersectXMLAnnot and contains all intersections of interest (that is, intersections in close proximity to a gene in the hit species). The hit species and the original species have to be specified, as well as the folder where the required fasta files are stored or to which they should be downloaded. If the species are default species from Ensembl (as listed in the data.frame species), the annotation and assembly are downloaded automatically to the specified location on the hard drive. A different assembly version can be selected with the hitSpeciesAssembly and origSpeciesAssembly options, but the filenames still have to match the naming convention used by NCBI.
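As a rough illustration of the required options, the following sketch collects the arguments in a list and calls plotHit only when nothing is missing. The object name my_hits, the two species names and the folder are hypothetical placeholders and not values taken from the package; the species strings in particular may need a different format, so check them against the data.frame species.

```r
# Hypothetical inputs: replace them with your own species and paths.
plot_args <- list(
  hitSpecies  = "Ovis aries",                   # placeholder species name
  origSpecies = "Bos taurus",                   # placeholder species name
  fastaFolder = file.path(tempdir(), "fasta")   # placeholder folder
)

# The four options that the Details section lists as required.
required <- c("hits", "hitSpecies", "origSpecies", "fastaFolder")

# 'my_hits' is a hypothetical name for an object returned by intersectXMLAnnot.
if (exists("my_hits")) plot_args$hits <- my_hits

missing_args <- setdiff(required, names(plot_args))

if (length(missing_args) > 0) {
  message("Not calling plotHit; missing: ",
          paste(missing_args, collapse = ", "))
} else if (requireNamespace("hoardeR", quietly = TRUE)) {
  # Only reached when all required options are present and hoardeR is installed.
  do.call(hoardeR::plotHit, plot_args)
}
```

Building the argument list first makes it easy to add optional settings such as hitSpeciesAssembly or figureFolder later without changing the call itself.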

To add a coverage track in addition to the similarity, the option coverage has to be set to TRUE. The option smoothPara then sets the level of smoothing of the coverage. By default no smoothing is applied.

In case an annotation track is requested (annot=TRUE), the annotation objects need to be provided to the origAnnot and hitAnnot options.

The option diagonal defines the minimum level of similarity required for a (diagonal) match to be plotted. The colors range towards green for total similarity and towards red for total disagreement, based on a nucleotide mismatch matrix.
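The idea of a moving-window similarity with a lower threshold can be sketched in base R. This is not the implementation used by plotHit: the helper window_similarity is hypothetical, it scores plain match/mismatch instead of the nucleotide mismatch matrix mentioned above, and it assumes two sequences of equal length.

```r
# Hypothetical helper: fraction of identical positions in each sliding window.
window_similarity <- function(seq_a, seq_b, window = 5) {
  a <- strsplit(seq_a, "")[[1]]
  b <- strsplit(seq_b, "")[[1]]
  stopifnot(length(a) == length(b), window <= length(a))
  matches <- as.numeric(a == b)            # 1 for a match, 0 for a mismatch
  n_win <- length(matches) - window + 1    # number of full windows
  vapply(seq_len(n_win),
         function(i) mean(matches[i:(i + window - 1)]),
         numeric(1))
}

sim <- window_similarity("ACGTACGTAC", "ACGTTTTTAC", window = 4)

# Example threshold; the default of the diagonal option in plotHit is 0.25.
threshold <- 0.5
keep <- sim >= threshold   # windows that would pass the threshold
```

Here the middle windows fall below the threshold because they cover the stretch where the two toy sequences differ.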

If verbose=TRUE is set, the function reports its progress while running. Further, if output=TRUE, a data.frame with the numerical results is returned in addition to the figure.

If hits contains more than one hit, plotHit draws one figure per hit. In that case a folder where the figures are stored should be provided; this is done with the figureFolder and figurePrefix options. If only certain hits of hits should be plotted, they can be selected with the which option.

The function can also plot a coverage track over the similarity. For that, coverage=TRUE has to be set and a folder that contains the necessary bam-files has to be specified in bamFolder. By default all bam-files in that folder are used; if only a subset is requested, the filenames can be specified in bamFiles. If several bam-files are given, the average coverage at each locus is used. Further, if the data contains subgroups (e.g. case/control), the vector groupIndex gives the group labels. Its length should equal the length of bamFiles (or the total number of files in the bam-folder). If more than one group is plotted in the coverage track, the group colors can be defined in groupColor. This vector has to contain one color per group. The option countWindow controls the length of the moving window in which the number of counts is calculated. The default is the same length as the hit.
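The length requirements for groupIndex and groupColor, and the averaging of coverage within groups, can be illustrated with a small base R sketch. The file names, group labels, colors and the toy coverage matrix are invented for illustration; no bam-files are read, and the averaging only mimics the behaviour described above rather than reproducing the code inside plotHit.

```r
# Hypothetical file names, group labels and colors.
bam_files   <- c("case1.bam", "case2.bam", "ctrl1.bam", "ctrl2.bam")
group_index <- c("case", "case", "control", "control")
group_color <- c(case = "red", control = "blue")

# One label per bam-file and one color per group.
stopifnot(length(group_index) == length(bam_files))
stopifnot(length(group_color) == length(unique(group_index)))

# Toy coverage matrix: rows are loci, columns are bam-files (filled by column).
coverage <- matrix(c(2, 4, 6,
                     4, 4, 8,
                     0, 2, 2,
                     2, 2, 4), nrow = 3)
colnames(coverage) <- bam_files

# Average coverage per locus within each group.
group_mean <- sapply(
  split(seq_along(bam_files), group_index),
  function(idx) rowMeans(coverage[, idx, drop = FALSE])
)
```

The result has one column per group, so it can be drawn with one line per group using the colors in group_color.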