Creates a data frame in the VCF format for all SNPs and across all loci in the data set.
vcfinfo(string, pos = NULL)
A data frame with 10 columns:
Chromosome. Each locus is treated as a different linkage group.
Coordinate. The position of the SNP.
Identifier.
Reference allele. We assume that the reference allele is always an A. Note that this is not necessarily the major allele.
Alternative allele. We assume that the alternative allele is always a T.
Quality score out of 100. We assume that this score is always 100.
Whether this SNP passed the quality filters.
Further information. Provides further information on the variants.
Information about the following column. This column describes how the read counts are coded in the next column.
Number of reference-allele reads, alternative-allele reads and total depth of coverage observed for this population at this SNP.
string is a character vector, or a list where each entry is a character vector for a different locus. Each entry of a character vector contains the information for a single SNP, coded as R,A:DP. The output of vcflocus or vcfloci is the intended input here.
pos is an optional input (default is NULL). If the actual positions of the SNPs are known, they can be supplied here. When working with a single locus, this should be a numeric vector with one entry per SNP, giving the position of that SNP. If the data has multiple loci, this should be a list where each entry is a numeric vector with the positions of the SNPs for a different locus.
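The two input shapes described above can be sketched as follows. The read counts and positions are made-up values used only for illustration, and the variable names are hypothetical. The commented call simply follows the usage line vcfinfo(string, pos = NULL).

```r
# Hypothetical SNP entries coded as R,A:DP (illustrative values, not real data).

# Single locus: a character vector of SNPs and a numeric vector of positions.
single_locus <- c("10,2:12", "7,5:12", "0,9:9")
single_pos   <- c(105, 2310, 4872)

# Multiple loci: a list with one character vector per locus,
# and a matching list with one numeric vector of positions per locus.
multi_locus <- list(c("10,2:12", "7,5:12"), c("3,3:6"))
multi_pos   <- list(c(105, 2310), c(58))

# Sanity check: every locus needs exactly one position per SNP.
stopifnot(length(single_locus) == length(single_pos))
stopifnot(identical(lengths(multi_locus), lengths(multi_pos)))

# With the package loaded, the calls would follow the documented usage:
# vcfinfo(string = single_locus, pos = single_pos)
# vcfinfo(string = multi_locus, pos = multi_pos)
```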
This function combines the information coded as R,A:DP with other necessary information, such as the chromosome of each SNP, the position of the SNP and the quality of the genotype, among others. In the character string, R is the number of reads of the reference allele, A is the number of reads of the alternative allele and DP is the total depth of coverage. Each row of the data frame corresponds to a different SNP.
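To make the R,A:DP coding concrete, here is a minimal sketch of how such strings could be decoded and combined with the constant values described above (reference allele A, alternative allele T, quality 100). It is not the package's implementation: parse_counts is a hypothetical helper, the input values are made up, and the column names are assumptions borrowed from the usual VCF header rather than the actual names returned by vcfinfo.

```r
# Hypothetical helper: split one "R,A:DP" entry into three integer counts.
parse_counts <- function(entry) {
  parts <- strsplit(entry, "[,:]")[[1]]  # split on the comma and the colon
  counts <- as.integer(parts)
  names(counts) <- c("R", "A", "DP")
  counts
}

# Illustrative SNP entries for one locus (made-up values).
snps <- c("10,2:12", "7,5:12", "0,9:9")

# One row per SNP, with columns R, A and DP.
counts <- t(vapply(snps, parse_counts, integer(3), USE.NAMES = FALSE))

# Combine the counts with the constants stated in this help page.
# The column names REF, ALT and QUAL are assumptions for illustration only.
vcf_like <- data.frame(REF = "A", ALT = "T", QUAL = 100, counts)

# In this example the depth equals the sum of the two allele counts.
stopifnot(all(counts[, "R"] + counts[, "A"] == counts[, "DP"]))
```

Each row of vcf_like corresponds to one SNP, mirroring the one-row-per-SNP layout of the data frame described above.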