VariantFiltering (version 1.8.6)

MafDb2-class: MafDb2 class

Description

Class for annotation packages storing minor allele frequency data.

Usage

"mafByOverlaps"(x, ranges, pop, caching) "mafById"(x, ids, pop, caching) "populations"(x)

Arguments

x
A MafDb2 object.
ranges
Either a GRanges object, a GPos object or a character string vector with the format "CHR:START[-END]".
ids
A character string vector with variant identifiers annotated by the MAF data source, typically dbSNP 'rs' identifiers. Note that the mapping of these identifiers to genomic positions and MAF values might be a subset of the most up to date dbSNP 'rs' identifier assignment to variants. To access the latter, please use the snpsById() method from the BSgenome package with the desired SNPlocs.* package.
pop
A character string vector with the populations for which we want to retrieve MAF values.
caching
logical; TRUE (default) indicates that the function keeps the MAF data in main memory as it gets loaded from disk, which improves performance; FALSE forces the function to load the MAF data from disk on every call, which is slower but requires less memory.
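
To make the "CHR:START[-END]" string format accepted by the ranges argument concrete, here is a minimal base R sketch that splits such strings into chromosome, start and end. The helper parseRanges() is hypothetical and is not part of VariantFiltering; it only illustrates the format, assuming that END defaults to START when it is omitted.

```r
## Hypothetical helper (NOT part of VariantFiltering): split strings of the
## form "CHR:START[-END]" into their components. END defaults to START.
parseRanges <- function(x) {
  m <- regmatches(x, regexec("^([^:]+):([0-9]+)(-([0-9]+))?$", x))
  bad <- lengths(m) == 0
  if (any(bad))
    stop("malformed range: ", paste(x[bad], collapse = ", "))
  chr <- vapply(m, `[`, character(1), 2)
  start <- as.integer(vapply(m, `[`, character(1), 3))
  endStr <- vapply(m, `[`, character(1), 5)  ## "" when END is omitted
  end <- ifelse(endStr == "", start, as.integer(endStr))
  data.frame(chr = chr, start = start, end = end, stringsAsFactors = FALSE)
}

## both forms describe the same single-nucleotide position
parseRanges(c("15:28356859-28356859", "15:28356859"))
```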

Details

The MafDb2 class and associated methods serve the purpose of creating annotation packages that store minor allele frequency data that can be queried by genomic position. Two such annotation packages are:

MafDb.ExAC.r0.3.1.snvs.hs37d5
MAF values from the ExAC consortium.
MafDb.ExAC.r0.3.1.nonTCGA.snvs.hs37d5
MAF values from the ExAC consortium, excluding the TCGA samples.

This object class tries to reduce the disk space required to store allele frequencies (AFs) for millions of SNPs by coding AF float values, which range between 0 and 1, into a single-byte raw object type. To achieve this, the original AF values are rounded to one significant digit for AF < 0.01 and two significant digits for AF >= 0.01.
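
The rounding rule above can be sketched in base R as follows. The function roundAF() applies the stated rule directly. The code table and the functions encodeAF() and decodeAF() are hypothetical: they only illustrate why the rounded values can fit into one byte, and they are not the encoding actually used by the package. The sketch assumes that frequencies below 1e-10 do not need to be represented.

```r
## Rounding rule from the text: one significant digit below 0.01,
## two significant digits at or above 0.01.
roundAF <- function(af) {
  ifelse(af < 0.01, signif(af, 1), signif(af, 2))
}

## Hypothetical code table (NOT the package's actual layout): 0, then one
## significant digit for the decades 1e-10 to 1e-3, then two significant
## digits from 0.010 up to 1. It has 254 entries, so an index fits in a byte.
afTable <- sort(c(0,
                  as.vector(outer(1:9, 10^(-(3:10)))),
                  (10:99) / 1000, (10:99) / 100, 1))

## Map each rounded AF to the index of the closest table entry.
encodeAF <- function(af) {
  rounded <- roundAF(af)
  idx <- vapply(rounded, function(v) which.min(abs(afTable - v)), integer(1))
  as.raw(idx)
}

## Recover the rounded AF values from their one-byte codes.
decodeAF <- function(codes) afTable[as.integer(codes)]

af <- c(0.000123, 0.00456, 0.0123, 0.456, 0.98765)
codes <- encodeAF(af)   ## raw vector, one byte per value
decodeAF(codes)         ## the rounded values, not the original ones
```

The round trip is lossy by design: in this sketch 0.000123 comes back as 1e-04 and 0.456 as 0.46.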

These data are compressed further for variants with multiple alternate alleles. In those cases, instead of storing the AF of each alternate allele, only the maximum AF value is stored.
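
As an illustration of keeping only the maximum AF per variant, the following sketch uses made-up frequencies. The site names and values are hypothetical and do not come from ExAC.

```r
## Hypothetical per-site allele frequencies, one value per alternate allele
afBySite <- list(site1 = c(0.012),
                 site2 = c(0.0003, 0.21, 0.05))

## Keep a single value per site: the largest AF among its alternate alleles
storedAF <- vapply(afBySite, max, numeric(1))
storedAF
```

A query at site2 would then return 0.21 regardless of which alternate allele is of interest, so the stored value is an upper bound on the AF of each individual alternate allele.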

Examples

  ## Look up allele frequencies for rs1129038, a SNP associated with blue and brown eye colors,
  ## as reported by Eiberg et al., "Blue eye color in humans may be caused by a perfectly associated
  ## founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression",
  ## Human Genetics, 123(2):177-87, 2008 [http://www.ncbi.nlm.nih.gov/pubmed/18172690]

  library(GenomicRanges)  ## provides GRanges() and IRanges() used below

  if (require(MafDb.ExAC.r0.3.1.snvs.hs37d5)) {
    mafdb <- MafDb.ExAC.r0.3.1.snvs.hs37d5
    mafdb

    ## specialized interface
    populations(mafdb)

    rng <- GRanges("15", IRanges(28356859, 28356859))
    mafByOverlaps(mafdb, rng)
    mafByOverlaps(mafdb, "15:28356859-28356859")
    mafByOverlaps(mafdb, "15:28356859")
    mafById(mafdb, "rs1129038")
  }