For each cross-species hit the function plots the similarity within that area together with an optional annotation and coverage track.
plotHit(hits, flanking=1, window=NULL, annot=TRUE, coverage=FALSE,
smoothPara=NULL, diagonal=0.25, verbose=TRUE, output=FALSE,
hitSpecies=NULL, hitSpeciesAssembly=NULL, origSpecies=NULL,
origSpeciesAssembly=NULL, fastaFolder=NULL, origAnnot=NULL,
hitAnnot=NULL, nTick=5, which=NULL, figureFolder=NULL,
figurePrefix=NULL, indexOffset=0, bamFolder=NULL, bamFiles=NULL,
groupIndex=NULL, groupColor=NULL, countWindow=NULL)
Optional, a table with intersection loci.
hits: The hit object to be plotted.
flanking: Allowed flanking site in Mb.
window: Moving window size of similarity measure.
annot: Logical, add annotation track.
coverage: Logical, add coverage track.
smoothPara: Smoothing parameter for coverage.
diagonal: Threshold for allowed diagonal similarity.
verbose: Logical, shall the function give status updates.
output: Logical, shall numerical results be given.
hitSpecies: Scientific identifier of the hit species.
hitSpeciesAssembly: Version of the hit species assembly.
origSpecies: Scientific name of the original species.
origSpeciesAssembly: Version of the original species assembly.
fastaFolder: Location of the fasta files.
origAnnot: Annotation object of the original species.
hitAnnot: Annotation object of the hit species.
nTick: Number of ticks on the annotation track.
which: Which hits should be plotted.
figureFolder: Folder where figures should be stored.
figurePrefix: Prefix of the figure filenames.
indexOffset: Offset of the running index of the filenames.
bamFolder: Folder with the bam-files.
bamFiles: Filenames of the bam-files.
groupIndex: Index of subgroups in the bam-files.
groupColor: Vector with colors, one for each subgroup.
countWindow: Window size to count the reads from bam-files.
Daniel Fischer
This function is the workhorse of hoardeR and visualizes the findings of the blast and intersection runs. It handles hits flexibly and
therefore has many different options. The required options are hits, hitSpecies, origSpecies and fastaFolder.
The hit object is an object as provided by intersectXMLAnnot and contains all intersections of interest (i.e. intersections that are in close
proximity to a gene in the hit species). The hit and the original species have to be specified, as well as the folder where the required fasta
files are stored or to which they should be downloaded. If the species are the default species from Ensembl (as can be seen in the data.frame
species), the annotation and assembly are downloaded automatically to the specified location on the hard drive. A different assembly
version can be set with the hitSpeciesAssembly and origSpeciesAssembly options, but the filenames still have to match the convention
used by NCBI.
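As a rough illustration of the required options, the following sketch collects the arguments in a list and only calls plotHit once all four required ones are present. The species names and the fasta path are placeholder assumptions, and no real hits object is constructed here (it would come from intersectXMLAnnot), so the call itself is skipped in this example.

```r
# Placeholder values for illustration only; 'hits' is deliberately left out
# because a real hit object has to come from intersectXMLAnnot().
args <- list(
  hitSpecies  = "Bos taurus",                      # assumed example species
  origSpecies = "Ovis aries",                      # assumed example species
  fastaFolder = file.path(tempdir(), "fasta")      # assumed location
)

# The four options the text names as required.
required <- c("hits", "hitSpecies", "origSpecies", "fastaFolder")
missingArgs <- setdiff(required, names(args))

if (length(missingArgs) == 0 && requireNamespace("hoardeR", quietly = TRUE)) {
  do.call(hoardeR::plotHit, args)
} else {
  message("plotHit() not called; missing: ", paste(missingArgs, collapse = ", "))
}
```

Adding a hits entry to the list would let the guarded call go through, provided hoardeR is installed.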
If a coverage track should be added in addition to the similarity, the option coverage has to be set to TRUE. The option smoothPara
then sets the level of smoothing of the coverage. By default no smoothing is applied.
In case an annotation track is requested (annot=TRUE), the annotation objects need to be provided to the origAnnot and hitAnnot options.
The option diagonal defines the minimum level of similarity required for a (diagonal) match to be plotted. The colors then range towards
green for total similarity and towards red for total disagreement, based on a nucleotide mismatch matrix.
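The thresholding and coloring described above can be pictured with a small base R sketch. It is not the package's internal code: the similarity values are made up, the comparison is assumed to be inclusive (>=), and a plain linear red-to-green ramp stands in for whatever color scale plotHit actually derives from the mismatch matrix.

```r
# Made-up similarity scores in [0, 1] for four candidate diagonal matches.
similarity <- c(0.10, 0.25, 0.60, 1.00)
diagonal   <- 0.25   # same value as the plotHit default

# Keep only matches that reach the threshold (inclusive comparison assumed).
keep <- similarity >= diagonal

# Map the kept scores onto a linear red (0) to green (1) ramp.
ramp   <- grDevices::colorRamp(c("red", "green"))
rgbMat <- ramp(similarity[keep])
cols   <- grDevices::rgb(rgbMat[, 1], rgbMat[, 2], rgbMat[, 3], maxColorValue = 255)

print(data.frame(similarity = similarity[keep], color = cols))
```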
If the option verbose=TRUE is set, the function reports its progress while running. Further, if output=TRUE, a data.frame with the
numerical results is provided in addition to the figure.
If hits contains more than one hit, the plotHit function plots one figure per hit. In that case a folder in which the figures are
stored should be provided; this can be done with the figureFolder and figurePrefix options. If only certain hits of hits shall be
plotted, they can be selected with the which option.
The function can also plot a coverage track over the similarity. For that, the option coverage=TRUE has to be set and a folder that
contains the necessary bam-files has to be specified in bamFolder. By default all bam-files in that folder are used; if only a subset
is requested, the filenames can be specified in bamFiles. In case several bam-files are given, the average coverage at each locus is used.
Further, if the data contains subgroups (e.g. case/control), the vector groupIndex gives the group labels. Its length should equal the
length of bamFiles (or the total number of files in the bam-folder). In case that more than one group is plotted in the
coverage track, their colors can be defined in groupColor. This vector has to contain one color per group.
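The relationship between bamFiles, groupIndex and groupColor, and the per-locus averaging, can be illustrated in base R. The sketch below uses a made-up coverage matrix instead of reading bam-files, and the file names, group labels and colors are assumptions chosen for the example; it does not reproduce how plotHit computes coverage internally.

```r
# Toy coverage: rows are loci, columns are bam-files (made-up numbers).
cov <- cbind(a.bam = c(2, 4, 6),
             b.bam = c(4, 4, 0),
             c.bam = c(0, 10, 2),
             d.bam = c(2, 0, 4))
bamFiles   <- colnames(cov)
groupIndex <- c("case", "case", "control", "control")
groupColor <- c("red", "blue")

# One group label per bam-file, one color per group.
stopifnot(length(groupIndex) == length(bamFiles))
stopifnot(length(groupColor) == length(unique(groupIndex)))

# Average coverage at each locus, computed separately for each group.
groupMeans <- sapply(unique(groupIndex), function(g) {
  rowMeans(cov[, groupIndex == g, drop = FALSE])
})
print(groupMeans)
```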
The option countWindow controls the moving window length in which the number of counts is calculated. The default is the same length as the hit.
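To make the effect of a moving count window concrete, here is a minimal base R sketch that counts reads in a sliding window along a region. The read positions, the region length and the step of one base are assumptions for illustration; the exact windowing used by plotHit may differ.

```r
# Made-up read start positions along a region of length 20.
readStarts   <- c(1, 2, 2, 5, 9, 10, 15, 19)
regionLength <- 20
countWindow  <- 5

# Slide the window one position at a time (step size is an assumption).
windowStarts <- seq_len(regionLength - countWindow + 1)
counts <- vapply(windowStarts, function(s) {
  sum(readStarts >= s & readStarts < s + countWindow)
}, integer(1))

print(counts)
```

With countWindow equal to the region length, as in the default described above, there would be a single window containing all reads.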