SeqArray (version 1.12.5)

SeqVarGDSClass: SeqVarGDSClass

Description

A SeqVarGDSClass object provides access to a GDS file containing Variant Call Format (VCF) data. It extends gds.class.

Accessors

In the following code snippets x is a SeqVarGDSClass object.
granges(x): Returns the chromosome and position of variants as a GRanges object. Names correspond to the variant.id.
ref(x): Returns the reference alleles as a DNAStringSet.
alt(x): Returns the alternate alleles as a DNAStringSetList.
qual(x): Returns the quality scores.
filt(x): Returns the filter data.
fixed(x): Returns the fixed fields (ref, alt, qual, filt).
header(x): Returns the header.
rowRanges(x): Returns a GRanges object with metadata.
colData(x): Returns a DataFrame with sample identifiers and any information in the 'sample.annotation' node.
info(x, info=NULL): Returns the info fields as a DataFrame. info is a character vector with the names of fields to return (default is to return all).
geno(x, geno=NULL): Returns the geno (format) fields as a SimpleList. geno is a character vector with the names of fields to return (default is to return all).
Other data can be accessed with seqGetData.
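As a rough illustration of the accessors listed above, the following sketch opens the example file that ships with SeqArray and calls a few of them. It assumes that this file contains an AC field under INFO and a DP field under FORMAT (both are read in the Examples section); with other files the field names will differ.

```r
library(SeqArray)

# Open the example sequence GDS file bundled with SeqArray
x <- seqOpen(seqExampleFileName("gds"))

# Reference alleles (DNAStringSet) and alternate alleles (DNAStringSetList)
r <- ref(x)
a <- alt(x)

# Return only selected fields by name
# (assumes "AC" and "DP" exist in this particular file)
ac <- info(x, info="AC")
dp <- geno(x, geno="DP")

seqClose(x)
```

Leaving info or geno at the default NULL returns every available field, which can be slow on large files.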

Coercion methods

In the following code snippets x is a SeqVarGDSClass object.
asVCF(x, info=NULL, geno=NULL): Coerces a SeqVarGDSClass object to a VCF-class object. Row names correspond to the variant.id. info and geno are character vectors naming the 'INFO' and 'GENO' (FORMAT) fields to return, respectively. If not specified, all fields are returned; if NA, no fields are returned. To restrict the samples and variants returned, call seqSetFilter before calling asVCF.
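A minimal sketch of the filter-then-coerce pattern described above, using the example file bundled with SeqArray. The choice of the first 10 variants and first 5 samples is arbitrary, and the AC field name is an assumption about this file.

```r
library(SeqArray)

gds <- seqOpen(seqExampleFileName("gds"))

# Look up the IDs, then keep the first 10 variants and the first 5 samples
variant.id <- seqGetData(gds, "variant.id")
sample.id <- seqGetData(gds, "sample.id")
seqSetFilter(gds, variant.id=variant.id[1:10], sample.id=sample.id[1:5])

# Coerce only the filtered subset, keeping a single INFO field
# (assumes "AC" exists in this file); geno is left at its default
vcf <- asVCF(gds, info="AC")

seqClose(gds)
```

Setting the filter first keeps the resulting VCF object small, since asVCF only reads the selected samples and variants.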

Details

A sequence GDS file is created from a VCF file with seqVCF2GDS. This file can be opened with seqOpen to create a SeqVarGDSClass object.
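The conversion and opening steps can be sketched as follows, assuming the example VCF file shipped with SeqArray. The output path is a hypothetical temporary file chosen for illustration.

```r
library(SeqArray)

# Example VCF file bundled with SeqArray
vcf.fn <- seqExampleFileName("vcf")

# Hypothetical output location in the session's temporary directory
gds.fn <- tempfile(fileext=".gds")

# Convert the VCF file to a sequence GDS file
seqVCF2GDS(vcf.fn, gds.fn, verbose=FALSE)

# Opening the file yields a SeqVarGDSClass object
gds <- seqOpen(gds.fn)
class(gds)

# Close the file and remove the temporary output
seqClose(gds)
unlink(gds.fn)
```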

See Also

gds.class, seqVCF2GDS, seqOpen, seqGetData, seqSetFilter, seqClose

Examples

library(SeqArray)

## open the example GDS file shipped with the package
gds <- seqOpen(seqExampleFileName("gds"))
gds

## sample ID
head(seqGetData(gds, "sample.id"))

## variants
# granges(gds)

## alleles as comma-separated character strings
head(seqGetData(gds, "allele"))

## alleles as DNAStringSet or DNAStringSetList
ref(gds)
v <- alt(gds)

## genotype
geno <- seqGetData(gds, "genotype")
dim(geno)
## dimensions are: allele, sample, variant
geno[1,1:10,1:5]

## rsID
head(seqGetData(gds, "annotation/id"))

## alternate allele count
head(seqGetData(gds, "annotation/info/AC"))

## individual read depth
depth <- seqGetData(gds, "annotation/format/DP")
names(depth)
## VCF header defined DP as variable-length data
table(depth$length)
## all length 1, so depth$data should be a sample by variant matrix
dim(depth$data)
depth$data[1:10,1:5]

seqClose(gds)