Learn R Programming

polysat (version 1.7-7)

read.STRand: Read Genotypes Produced by STRand Software

Description

This function reads in data in a format derived from the “BTH” format for exporting genotypes from the allele calling software STRand.

Usage

read.STRand(file, sep = "\t", popInSam = TRUE)

Value

A "genambig" object containing genotypes, locus and sample names, population names, and population identities from the file.

Arguments

file

A text string indicating the file to read.

sep

Field delimiter for the file. Tab by default.

popInSam

Boolean. If TRUE, fields from the “Pop” and “Ind” columns will be concatenated to create a sample name. If FALSE, only the “Ind” column will be used for sample names.

Author

Lindsay V. Clark

Details

This function does not read the files directly produced from STRand, but requires some simple clean-up in spreadsheet software. The BTH format in STRand produces two columns per locus. One of these columns should be deleted so that there is just one column per locus. Loci names should remain in the column headers. The column containing sample names should be deleted or renamed “Ind”. A “Pop” column will need to be added, containing population names. An “Ind” column is also necessary, containing either full sample names or a sample suffix to be concatenated with the population name (see popInSam argument).

STRand adds an asterisk to the end of any genotype with more than two alleles. read.STRand will automatically strip this asterisk out of the genotype.

Missing data is indicated by a zero in the file.

References

https://vgl.ucdavis.edu/STRand

Toonen, R. J. and Hughes, S. (2001) Increased Throughput for Fragment Analysis on ABI Prism 377 Automated Sequencer Using a Membrane Comb and STRand Software. Biotechniques 31, 1320--1324.

See Also

read.table, read.GeneMapper, read.GenoDive, read.Structure, read.ATetra, read.Tetrasat, read.POPDIST, read.SPAGeDi

Examples

Run this code
# generate file to read
strtemp <- data.frame(Pop=c("P1","P1","P2","P2"),
                      Ind=c("a","b","a","b"),
                      LocD=c("0","172/174","170/172/178*","172/176"),
                      LocG=c("130/136/138/142*","132/136","138","132/140/144*"))
myfile <- tempfile()
write.table(strtemp, file=myfile, sep="\t",
            row.names=FALSE, quote=FALSE)

# read the file
mydata <- read.STRand(myfile)
viewGenotypes(mydata)
PopNames(mydata)

# alternative example with popInSam=FALSE
strtemp$Ind <- c("OH1","OH5","MT4","MT7")
write.table(strtemp, file=myfile, sep="\t",
            row.names=FALSE, quote=FALSE)
mydata <- read.STRand(myfile, popInSam=FALSE)
Samples(mydata)
PopNames(mydata)

Run the code above in your browser using DataLab