ExomeDepth (version 1.1.16)

count.everted.reads: Count the number of everted reads for a set of BAM files.

Description

This is the ExomeDepth high-level function that takes a set of target regions (supplied as a data frame or a BED file) and a list of sorted, indexed BAM files, and computes the number of everted reads in each of the defined bins.

Usage

count.everted.reads(
  bed.frame = NULL,
  bed.file = NULL,
  bam.files,
  index.files = bam.files,
  min.mapq = 20,
  include.chr = FALSE
)

Value

A data frame that contains the region and the number of identified reads in each bin.

Arguments

bed.frame

data.frame containing the definition of the regions. The first three columns must be chromosome, start, end.

bed.file

character, file name of the target BED file with the definition of the regions. This file is only used if no bed.frame argument is provided. The file is assumed to have no header line, so remove any header before use. Either a bed.file or a bed.frame must be provided for this function to run.

bam.files

character, list of BAM files to extract read count data from.

index.files

Optional character argument with the list of indexes for the BAM files, without the '.bai' suffix. If the indexes are simply obtained by adding .bai to the BAM files, this argument does not need to be specified.

min.mapq

numeric, minimum mapping quality to include a read.

include.chr

logical, if set to TRUE, this function will add the string 'chr' to the chromosome names of the target BED file.
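
As an illustration of the argument conventions above, here is a minimal base R sketch. The region coordinates, bin names and BAM file names are hypothetical, and the code only mimics what the documentation describes (three leading columns in bed.frame, the 'chr' prefix added by include.chr, the '.bai' suffix appended to index.files). It does not call ExomeDepth itself.

```r
# Hypothetical target regions: the first three columns must be
# chromosome, start, end (extra columns such as a name are allowed here
# purely for illustration).
bed.frame.example <- data.frame(
  chromosome = c("1", "1"),
  start      = c(25630000, 25640001),
  end        = c(25640000, 25650000),
  name       = c("binA", "binB"),
  stringsAsFactors = FALSE
)

# What include.chr = TRUE is documented to do: prefix the chromosome
# names of the target regions with the string 'chr'.
with.chr <- paste0("chr", bed.frame.example$chromosome)

# Hypothetical BAM file names. With the default index.files = bam.files,
# the index is expected at the BAM file name plus the '.bai' suffix.
bam.files.example <- c("sample1.bam", "sample2.bam")
expected.index    <- paste0(bam.files.example, ".bai")

print(with.chr)
print(expected.index)
```

If the BAM files use chromosome names such as 'chr1' while the regions use '1', include.chr = TRUE is the documented way to reconcile the two.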

Details

Everted reads are characteristic of the presence of duplications in a BAM file. This routine parses the BAM files, and the suggested use is to provide relatively large bins (for example gene-based bins; ExomeDepth provides a genes.hg19 object that is appropriate for this) to flag the genes that contain such reads, which are suggestive of a duplication. A manual check of the data using IGV is recommended to confirm that these reads are all located in the same DNA region, which would confirm the presence of a copy number variant.
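
To make the notion of an everted read pair concrete, here is a toy base R sketch. It assumes a standard paired-end library in which the two reads of a normal pair face inward, and it flags a pair as everted when the reverse-strand read maps upstream of its forward-strand mate, so that the reads face outward. The data are hypothetical, and this is not the ExomeDepth implementation, which reads the alignments from the BAM files.

```r
# Hypothetical read pairs on a single chromosome (not taken from a real BAM).
pairs <- data.frame(
  pair.id   = c("p1", "p2", "p3"),
  fwd.start = c(1000, 5000, 9000),  # leftmost position of the forward-strand read
  rev.start = c(1250, 4700, 9300),  # leftmost position of the reverse-strand mate
  stringsAsFactors = FALSE
)

# Inward-facing (normal) pair: the forward-strand read lies upstream of its mate.
# Everted pair: the reverse-strand read lies upstream of the forward-strand read.
pairs$everted <- pairs$rev.start < pairs$fwd.start

# Hypothetical large bins, in the spirit of gene-based regions.
bins <- data.frame(start = c(1, 4001), end = c(4000, 10000))

# Count everted pairs per bin, assigning each pair by its forward-strand read.
bins$n.everted <- vapply(seq_len(nrow(bins)), function(i) {
  sum(pairs$everted &
      pairs$fwd.start >= bins$start[i] &
      pairs$fwd.start <= bins$end[i])
}, numeric(1))

print(pairs)
print(bins)
```

A bin with a non-zero count is only a candidate region; as noted above, the reads should still be inspected manually.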

References

Medvedev et al (2009) <https://doi.org/10.1038/nmeth.1374> "Computational methods for discovering structural variation with next-generation sequencing"

See Also

getBAMCounts

Examples

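library(ExomeDepth)

# Gene-level regions shipped with ExomeDepth
data(genes.hg19)

# Small example BAM file included in the package
bam_file <- system.file('extdata/minimum_1_25630000_25650000.bam',
                        package = 'ExomeDepth')

# Restrict the regions to the TTC34 gene
genes.hg19.TTC <- subset(genes.hg19, grepl(pattern = '^TTC34', genes.hg19[['name']]))

# Count everted reads with no mapping quality filter, then with a stricter one
print(count.everted.reads(bed.frame = genes.hg19.TTC, bam.files = bam_file, min.mapq = 0))
print(count.everted.reads(bed.frame = genes.hg19.TTC, bam.files = bam_file, min.mapq = 35))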