The VariantFilteringResults class stores the kind of object obtained as a result of an analysis using the functions unrelatedIndividuals(), autosomalRecessiveHomozygous(), autosomalRecessiveHeterozygous(), autosomalDominant(), deNovo() and xLinked(). Its purpose is to ease the task of filtering and prioritizing the variants annotated by those functions.
VariantFilteringResults has the following set of accessor methods.
length(x): total number of variants stored internally within the VRanges object. Note that this number will typically be larger than the number of variants in the input VCF object, because each variant is copied for each combination of alternate allele, annotated region and sample.
param(x): returns the VariantFilteringParam input parameter object employed in the call that produced the VariantFilteringResults object x.
inheritanceModel(x): returns the model of inheritance employed in the call that produced the VariantFilteringResults object x.
samples(object): active samples from which the current filtered variants were derived. If object was obtained with unrelatedIndividuals(), then the replacement method samples(object)<- can be used to restrict the subset of active samples. In every other case (autosomalDominant(), etc.) the active samples cannot be changed.
resetSamples(object): sets the initial set of samples specified in the input parameter object back as the active samples.
sog(x): Sequence Ontology (SO) graph (actually, an acyclic digraph) returned as a graphNEL object, whose vertices are SO terms and whose edges represent ontology relationships. The vertex attributes vcfIdx and varIdx record which variants are annotated to each SO term. These annotations can be retrieved directly from the SO graph with the nodeData() function from the graph package. The summary() function described in this manual page tallies the number of variants in each SO term throughout the entire SO hierarchy.
bamFiles(x): access and update the BamViews object containing references to the BAM files from which the input VCF files were derived. Initially this is empty.
allVariants(x, groupBy="sample"): returns a VRangesList object with all variants, grouped by sample by default. The argument groupBy can name any metadata column to be used to group variants. If the value given to groupBy does not correspond to any such column, a single VRanges object with all variants together is returned.
filteredVariants(x, groupBy="sample"): works like allVariants(x), but instead of returning all variants it returns only those that pass the active filters; see filters() and cutoffs() below.
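The accessors above can be combined into a quick overview of a results object. The following is a minimal sketch rather than part of the package: the helper name describeResults is hypothetical, and it assumes that VariantFiltering is attached and that vfr is a VariantFilteringResults object such as the reHo object built in the Examples of this page.

```r
# Hypothetical helper (not part of VariantFiltering): collect the values
# returned by the accessor methods documented above. It assumes that
# library(VariantFiltering) has been called before the helper is used.
describeResults <- function(vfr) {
  list(
    nStored  = length(vfr),            # variants stored in the internal VRanges object
    model    = inheritanceModel(vfr),  # inheritance model used in the analysis
    samples  = samples(vfr),           # currently active samples
    all      = allVariants(vfr),       # all variants, grouped by sample by default
    filtered = filteredVariants(vfr)   # only variants passing the active filters
  )
}
```

Calling describeResults(reHo) would then return a list whose all and filtered elements can be compared to see how many variants the active filters remove.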
The variants contained in a VariantFilteringResults object can be filtered using the FilterRules mechanism, defined in the S4Vectors package, by using the functions filters() and cutoffs() described below. There are additional functions, also described in this section, to facilitate this task on the set of core annotations provided by VariantFiltering.
filters(x): get the current FilterRules object that defines the available set of filter criteria that one can use to filter the variants contained in x. This can also be used as a replacement function, filters(x)<-, to update this set of filters. The actual filtering is done when calling the function filteredVariants().
cutoffs(x): get and update cutoffs from the available filters.
softFilterMatrix(x): get and update the variant by filter matrix; see softFilterMatrix() in the VariantAnnotation package.
dbSNPpresent(x): flag whether to filter variants present or absent from dbSNP (NA -do not filter-, "Yes", "No").
variantType(x): filter by type of variant ("SNV", "Insertion", "Deletion", "MNV", "Delins").
variantLocation(x): filter by variant location ("coding", "intron", "threeUTR", "fiveUTR", "intergenic", "spliceSite", "promoter").
variantConsequence(x): filter by variant consequence ("synonymous", "nonsynonymous", "frameshift", "nonsense", "not translated").
aaChangeType(x): filter by type of change of amino acid ("Any", "Radical", "Conservative").
OMIMpresent(x): flag whether to filter variants whose associated genes are present or absent from OMIM (NA -do not filter-, "Yes", "No").
naMAF(x): flag whether NA maximum MAF values should be included in the filtered variants.
maxMAF(x): maximum MAF value that a variant may meet among the selected populations.
minPhastCons(x): minimum phastCons score for nucleotide conservation (NA -do not filter-, [0-1]).
minPhylostratum(x): minimum phylostratum for gene conservation (NA -do not filter-, [1-20]).
MAFpop(x): selection of populations to use when filtering by maximum MAF value.
minScore5ss(x): minimum weight matrix score on a cryptic 5'ss. NA indicates this filter is not applied.
minScore3ss(x): minimum weight matrix score on a cryptic 3'ss. NA indicates this filter is not applied.
minCUFC(x): minimum absolute codon-usage log2 fold-change.
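To illustrate how these cutoff functions are meant to be used, the sketch below wraps the two replacement calls that appear in the Examples of this page, naMAF()<- and maxMAF()<-. The helper name applyMafCutoff and its maf argument are hypothetical, and the code assumes that VariantFiltering is attached and that vfr is a VariantFilteringResults object.

```r
# Hypothetical helper (not part of VariantFiltering): set the MAF-related
# cutoffs on a copy of the results object and return the updated copy.
# It assumes that library(VariantFiltering) has been called beforehand.
applyMafCutoff <- function(vfr, maf = 0.05) {
  naMAF(vfr) <- FALSE   # do not include variants whose maximum MAF is NA
  maxMAF(vfr) <- maf    # maximum MAF allowed among the selected populations
  vfr                   # filtering itself happens in filteredVariants()
}
```

Since the actual filtering is done by filteredVariants(), a call such as filteredVariants(applyMafCutoff(reHo, 0.01)) would return the variants that pass the stricter cutoff, while reHo itself stays unchanged.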
summary(object, method=c("SO", "SOfull", "bioc")): tally the current filtered set of variants to features. By default (method="SO"), features are the Sequence Ontology (SO) terms to which variants are annotated by VariantFiltering. Setting method="SOfull" changes this default to tallying throughout the entire SO hierarchy. Both options, SO and SOfull, can be used in combination with the cutoff SOterms; see the vignette. The option method="bioc" considers as features the regions and consequences annotated by the functions locateVariants() and predictCoding() from the VariantAnnotation package. The result is returned as a data.frame object.
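The three values of the method argument can be compared side by side. This is a minimal sketch under the same assumptions as the Examples of this page: the helper name compareTallies is hypothetical, VariantFiltering must be attached, and vfr is a VariantFilteringResults object such as reHo.

```r
# Hypothetical helper (not part of VariantFiltering): tally the currently
# filtered variants with each of the documented summary() methods.
# It assumes that library(VariantFiltering) has been called beforehand.
compareTallies <- function(vfr) {
  list(
    SO     = summary(vfr, method = "SO"),      # SO terms annotated by VariantFiltering
    SOfull = summary(vfr, method = "SOfull"),  # tally throughout the entire SO hierarchy
    bioc   = summary(vfr, method = "bioc")     # regions and consequences from VariantAnnotation
  )
}
```

Each element of the returned list is expected to be a data.frame, as described above for summary().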
plot(x, what, sampleName, flankingNt=20, showAlnNtCutoff=200, isPaired=FALSE, ...): plot variants using the Gviz package. The argument what can be either a character vector specifying gene or variant identifiers or a chromosome name, or a GRanges object specifying a genomic region. The argument sampleName is optional and allows the user to plot the aligned reads and coverage from a specific sample, located in the plotted region, when the corresponding BAM file has been linked to the object with bamFiles(). The argument flankingNt is the number of nucleotides by which to extend the plotting region derived from the argument what. The argument showAlnNtCutoff is the region size cutoff below which the method attempts to plot the aligned reads. The argument isPaired is passed directly to the Gviz function AlignmentsTrack(), which streams over the BAM file to plot the reads, and sets whether the BAM file contains single (default) or paired-end reads. Further arguments in ... are passed to the Gviz function plotTracks() and can be used to fine-tune the final plot; see the vignette of Gviz to find out what these arguments are.
reportVariants(x, type=c("shiny", "csv", "tsv"), file=NULL): builds a report from the VariantFilteringResults object x. Using the type argument, the report can take the form of a flat file in CSV or TSV format, or of a web shiny app (default) that enables applying functional annotation filters in an interactive manner. When the shiny app is closed, this method returns a VariantFilteringResults object with the corresponding filters switched on or off according to how the app has been interactively used.
Variants are stored internally within a VariantFilteringResults object using a VRanges object, which also holds the variant annotations in its metadata columns. VariantFiltering adds the following core set of annotations.
locateVariants() from the VariantAnnotation package.
LOCATION annotation.
txdb argument of the VariantFilteringParam() function, typically an Entrez Gene identifier.
orgdb argument of the VariantFilteringParam() function.
isSNV(), isInsertion(), isDeletion(), isSubstitution() and isDelins() from the VariantAnnotation package.
snpdb argument of the VariantFilteringParam() function.
predictCoding() from the VariantAnnotation package.
TxDb annotation package given by the txdb argument of the VariantFilteringParam() function.
orgdb argument of the VariantFilteringParam() function.
radicalAAchangeFilename of the VariantFilteringParam() function.
GT, considering those as positions 1 and 2.
AG, considering those as positions 1 and 2.
## Not run:
library(VariantFiltering)

## locate the example VCF and PED files shipped with the package
CEUvcf <- file.path(system.file("extdata", package="VariantFiltering"),
                    "CEUtrio.vcf.gz")
CEUped <- file.path(system.file("extdata", package="VariantFiltering"),
                    "CEUtrio.ped")
param <- VariantFilteringParam(vcfFileNames=CEUvcf, pedFileName=CEUped)
reHo <- autosomalRecessiveHomozygous(param)

## exclude variants with NA maximum MAF and set the maximum MAF cutoff to 0.05
naMAF(reHo) <- FALSE
maxMAF(reHo) <- 0.05
reHo
head(filteredVariants(reHo))
reportVariants(reHo, type="csv", file="reHo.csv")
## End(Not run)