OriGen (version 1.4.3)

ConvertMicrosatData: Microsatellite file conversion for known and unknown data

Description

This function converts two microsatellite data files (one for the genotypes and one for the locations) into the data format required by OriGen.

Usage

ConvertMicrosatData(DataFileName,LocationFileName)

Arguments

DataFileName
Name of the file containing the genotypes of the individuals sampled at the various locations. The columns are LocationName, LocationNumber, Locus1, Locus2, etc. Each individual takes up 2 rows (one for each allele), both with the same LocationName and LocationNumber. The value under each Locus column is the length of that individual's allele at that locus. Note that unknown individuals should have location number "-1".
LocationFileName
Space- or tab-delimited text file with the location information for the individuals. The columns are LocationName, LocationNumber, Latitude, and Longitude. Note that the first two columns must be in the same order as in DataFileName.
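
As a rough illustration of the two input layouts described above, the following sketch writes a tiny pair of tab-delimited files with base R. The site names, coordinates, allele lengths, and file names are made up, and the presence of a header row is an assumption; compare against the sample files shipped with the package before relying on this layout.

```r
# Hypothetical toy data: two sample sites and one unknown individual.
# Each individual occupies two consecutive rows, one per allele.
genotypes <- data.frame(
  LocationName   = c("SiteA", "SiteA", "SiteB", "SiteB", "Unknown", "Unknown"),
  LocationNumber = c(1, 1, 2, 2, -1, -1),             # -1 marks an unknown individual
  Locus1         = c(120, 124, 120, 122, 124, 124),   # allele lengths
  Locus2         = c(200, 204, 202, 202, 200, 204)
)

# One row per sample site, columns in the documented order.
locations <- data.frame(
  LocationName   = c("SiteA", "SiteB"),
  LocationNumber = c(1, 2),
  Latitude       = c(34.05, 40.71),
  Longitude      = c(-118.24, -74.01)
)

# Hypothetical file names, written to a temporary directory.
data_file     <- file.path(tempdir(), "ToyMicrosatData.txt")
location_file <- file.path(tempdir(), "ToyLocationData.txt")

# Assumption: tab-delimited with a header row.
write.table(genotypes, data_file, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(locations, location_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the genotype file back to confirm the layout round-trips.
genotypes_check <- read.table(data_file, header = TRUE, sep = "\t")
```

The two file paths could then be passed as DataFileName and LocationFileName, provided the real function accepts this layout.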

Value

List with the following components:
DataArray
An array giving the number of alleles grouped by sample sites for each locus. The dimension of this array is [MaxAlleles,SampleSites,NumberLoci].
SampleCoordinates
This is an array which gives the longitude and latitude of each of the found sample sites. The dimension of this array is [SampleSites,2], where the second dimension represents longitude and latitude respectively.
AllelesAtLocus
This is an integer vector giving the number of alleles found at each locus.
MaxAlleles
This shows the maximum of AllelesAtLocus, i.e. the largest number of alleles found at any single locus.
SampleSites
This shows the integer number of sample sites found.
NumberLoci
This shows the integer number of loci found.
NumberUnknowns
This is an integer value showing the number of unknowns found.
UnknownDataArray
An array showing the unknown individuals' genetic data. The dimension of this array is [NumberUnknowns,2,NumberLoci].
LocationNames
This is a list of all the LocationNames (the first column of the input files).
DataFileName
This shows the DataFileName that was supplied as input.
LocationFileName
This shows the LocationFileName that was supplied as input.
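
To make the documented array shapes concrete, here is a minimal base R sketch that tabulates allele counts into an array of dimension [MaxAlleles,SampleSites,NumberLoci] and collects unknown individuals into an array of dimension [NumberUnknowns,2,NumberLoci]. It is not OriGen's implementation: the toy data, the variable names, the choice to define alleles from known individuals only, and the choice to store raw allele lengths for unknowns are all assumptions made for illustration.

```r
# Hypothetical genotype table in the DataFileName layout (two rows per individual).
genotypes <- data.frame(
  LocationNumber = c(1, 1, 1, 1, 2, 2, -1, -1),
  Locus1 = c(120, 124, 120, 120, 122, 124, 124, 124),
  Locus2 = c(200, 204, 200, 200, 204, 204, 200, 204)
)
locus_cols <- c("Locus1", "Locus2")

# Split known individuals from unknowns (location number -1).
known   <- genotypes[genotypes$LocationNumber != -1, ]
unknown <- genotypes[genotypes$LocationNumber == -1, ]

sites <- sort(unique(known$LocationNumber))

# Distinct allele lengths per locus (assumption: taken from known individuals only).
alleles <- lapply(locus_cols, function(l) sort(unique(known[[l]])))
alleles_at_locus <- lengths(alleles)
max_alleles <- max(alleles_at_locus)

# Allele counts indexed as [allele, sample site, locus]; unused allele slots stay 0.
counts <- array(0L, dim = c(max_alleles, length(sites), length(locus_cols)))
for (k in seq_along(locus_cols)) {
  for (j in seq_along(sites)) {
    obs <- known[[locus_cols[k]]][known$LocationNumber == sites[j]]
    tab <- table(factor(obs, levels = alleles[[k]]))
    counts[seq_along(alleles[[k]]), j, k] <- as.integer(tab)
  }
}

# Unknown individuals indexed as [individual, allele copy, locus].
n_unknown <- nrow(unknown) / 2
unknown_array <- array(NA_real_, dim = c(n_unknown, 2, length(locus_cols)))
for (k in seq_along(locus_cols)) {
  unknown_array[, , k] <- matrix(unknown[[locus_cols[k]]], ncol = 2, byrow = TRUE)
}

dim(counts)
dim(unknown_array)
```

In this toy case the first locus has three distinct alleles and the second has two, so the third allele slot of the second locus is left at zero for every site.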

References

Ranola J, Novembre J, Lange K (2014) Fast Spatial Ancestry via Flexible Allele Frequency Surfaces. Bioinformatics, in press.

See Also

ConvertPEDData for converting Plink PED files into a format appropriate for analysis,

FitMultinomialModel for fitting allele surfaces to the converted microsatellite data,

PlotAlleleFrequencySurface for a quick way to plot the resulting allele frequency surfaces from FitOriGenModel or FitMultinomialModel.

Examples

# Note that the sample files MicrosatTrialDataSmall.txt and
# LocationTrialDataSmall.txt are included in data for formatting.
# Note that this was done to allow inclusion of the test data in the package.

## Not run: 
# Convert the two sample files into the OriGen data format
MicrosatDataSmall <- ConvertMicrosatData("MicrosatTrialDataSmall.txt",
		"LocationTrialDataSmall.txt")
str(MicrosatDataSmall)

# Fit allele frequency surfaces to the converted data
MicrosatAnalysisSmall <- FitMultinomialModel(MicrosatDataSmall$DataArray,
		MicrosatDataSmall$SampleCoordinates,MaxGridLength=20)
str(MicrosatAnalysisSmall)

# Plot the fitted surfaces
PlotAlleleFrequencySurface(MicrosatAnalysisSmall)
## End(Not run)

