read_onemap: Read data from all types of progenies supported by OneMap

Description

Imports data derived from outbred parents (full-sib family) or inbred parents (backcross, F2 intercross and recombinant inbred lines obtained by self- or sib-mating). Creates an object of class onemap.

Usage

read_onemap(inputfile = NULL, dir = NULL, verbose = TRUE)

Value

An object of class onemap, i.e., a list with the following components:

geno: a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual.
n.ind: number of individuals.
n.mar: number of markers.
segr.type: a vector with the segregation type of each marker, as strings.
segr.type.num: a vector with the segregation type of each marker, represented in a simplified manner as integers, i.e. 1 corresponds to markers of type "A"; 2 corresponds to markers of type "B1.5"; 3 corresponds to markers of type "B2.6"; 4 corresponds to markers of type "B3.7"; 5 corresponds to markers of type "C.8"; 6 corresponds to markers of type "D1" and 7 corresponds to markers of type "D2". Markers for F2 intercrosses are coded as 1; all other crosses are left as NA.
input: the name of the input file.
n.phe: number of phenotypes.
pheno: a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual.
error: a matrix containing HMM emission probabilities.
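The relation between segr.type and segr.type.num for outcross markers can be sketched in base R as follows. The helper name segr_type_to_num is hypothetical and is not part of OneMap; it only illustrates the coding described above and returns NA for any type it does not recognise.

```r
# Hypothetical helper (not a OneMap function): maps outcross segregation
# types such as "A.1" or "D1.10" to the simplified integer codes 1..7.
segr_type_to_num <- function(segr_type) {
  # Drop the trailing numeric suffix: "A.1" -> "A", "B1.5" -> "B1", "D1.10" -> "D1"
  prefix <- sub("\\.[0-9]+$", "", segr_type)
  codes <- c(A = 1L, B1 = 2L, B2 = 3L, B3 = 4L, C = 5L, D1 = 6L, D2 = 7L)
  # Names absent from the lookup table yield NA
  unname(codes[prefix])
}

segr_type_to_num(c("A.1", "B1.5", "B2.6", "B3.7", "C.8", "D1.10", "D2.15"))
```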

Arguments

inputfile: the name of the input file which contains the data to be read.
dir: directory where the input file is located.
verbose: a logical; if TRUE, progress status information is printed.

Author

Gabriel R A Margarido, gramarga@gmail.com

Details

The file format is similar to that used by MAPMAKER/EXP (Lincoln et al., 1993). The first line indicates the cross type and is structured as data type {cross}, where cross must be one of "outcross", "f2 intercross", "f2 backcross", "ri self" or "ri sib". The second line contains five integers: i) the number of individuals; ii) the number of markers; iii) an indicator variable taking the value 1 if there is CHROM information, i.e., if markers are anchored on any reference sequence, and 0 otherwise; iv) a similar 1/0 variable indicating whether there is POS information for markers; and v) the number of phenotypic traits.
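As an illustration of these two header lines, the base R sketch below parses a hypothetical header. The values are invented for the example, and the names has.chrom and has.pos are labels chosen here, not OneMap identifiers.

```r
# Hypothetical first two lines of an input file (illustrative values only).
header <- c("data type outcross", "10 5 1 1 2")

# Line 1: strip the "data type " prefix to recover the cross type.
cross <- sub("^data type\\s+", "", header[1])
valid_crosses <- c("outcross", "f2 intercross", "f2 backcross",
                   "ri self", "ri sib")
stopifnot(cross %in% valid_crosses)

# Line 2: five integers, in the order described above.
counts <- as.integer(strsplit(trimws(header[2]), "\\s+")[[1]])
stopifnot(length(counts) == 5)
# has.chrom and has.pos are labels assumed for this sketch.
names(counts) <- c("n.ind", "n.mar", "has.chrom", "has.pos", "n.phe")
counts
```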

The next line contains sample IDs, separated by spaces or tabs. Requiring sample IDs makes it possible to merge separate input datasets.

Next comes the genotype data for all markers. Each new marker is initiated with a “*” (without the quotes) followed by the marker name, without any space between them. Each marker name is followed by the corresponding segregation type, which may be: "A.1", "A.2", "A.3", "A.4", "B1.5", "B2.6", "B3.7", "C.8", "D1.9", "D1.10", "D1.11", "D1.12", "D1.13", "D2.14", "D2.15", "D2.16", "D2.17" or "D2.18" (without quotes), for full-sibs [see marker_type and Wu et al. (2002) for details]. Other cross types have special marker types: "A.H" for backcrosses; "A.H.B" for F2 intercrosses; and "A.B" for recombinant inbred lines.

After the segregation type comes the genotype data for the corresponding marker. Depending on the segregation type, genotypes may be denoted by ac, ad, bc, bd, a, ba, b, ab and o, in several possible combinations. To make things easier, we have followed exactly the notation used by Wu et al. (2002). Allowed values for backcrosses are a and ab; for F2 crosses they are a, ab and b; for RILs they are a and b. Genotypes must be separated by a space. Missing values are denoted by "-".
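To make the layout of a marker line concrete, the sketch below splits one hypothetical line into its marker name, segregation type and genotypes using base R only. The marker name and genotype values are invented for illustration, and this is not how read_onemap itself is implemented.

```r
# Hypothetical marker line: name "M1", segregation type "B3.7", five genotypes.
marker_line <- "*M1 B3.7 ab a - b ab"

fields <- strsplit(trimws(marker_line), "\\s+")[[1]]
marker_name <- sub("^\\*", "", fields[1])  # drop the leading "*"
segr_type   <- fields[2]
genotypes   <- fields[-(1:2)]
genotypes[genotypes == "-"] <- NA          # "-" marks a missing value

genotypes
```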

If there is physical information for markers, i.e., if they are anchored at specific positions in reference sequences (usually chromosomes), this is included immediately after the marker data. These lines start with special keywords *CHROM and *POS and contain strings and integers, respectively, indicating the reference sequence and position for each marker. These also need to be separated by spaces.

Finally, if there are phenotypic data, they are added just after the marker or CHROM/POS data. Values need to be separated by spaces as well, using the same symbol for missing information.
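Putting the pieces together, a minimal outcross file could be assembled as sketched below. All sample IDs, marker names, genotypes and positions are invented, the file declares no phenotypic traits, and the layout has not been checked against read_onemap, so treat it as an illustration of the description above rather than a validated input.

```r
# Invented sample IDs for five individuals.
sample_ids <- paste0("I", 1:5)

file_lines <- c(
  "data type outcross",
  "5 2 1 1 0",                        # 5 individuals, 2 markers, CHROM, POS, 0 traits
  paste(sample_ids, collapse = " "),  # sample ID line
  "*M1 A.1 ac ad bc bd -",            # one genotype per individual, "-" = missing
  "*M2 B3.7 a ab ab b -",
  "*CHROM 1 1",                       # reference sequence of each marker
  "*POS 1500 4200"                    # position of each marker
)

path <- tempfile(fileext = ".raw")
writeLines(file_lines, path)

# Sanity checks: the counts declared on line 2 should match the file contents.
written <- readLines(path)
counts <- as.integer(strsplit(written[2], " ")[[1]])
marker_lines <- grep("^\\*(?!CHROM|POS)", written, perl = TRUE, value = TRUE)
n_geno <- lengths(strsplit(marker_lines, " ")) - 2L  # minus name and type
stopifnot(length(marker_lines) == counts[2], all(n_geno == counts[1]))
```

The resulting path could then be supplied through the inputfile argument.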

The example directory in the package distribution contains an example data file to be read with this function. Further instructions can be found in the tutorial distributed with this package.

References

Lincoln, S. E., Daly, M. J. and Lander, E. S. (1993) Constructing genetic linkage maps with MAPMAKER/EXP Version 3.0: a tutorial and reference manual. A Whitehead Institute for Biomedical Research Technical Report.

Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.

Examples

# \donttest{
outcr_data <- read_onemap(
  inputfile = system.file("extdata/onemap_example_out.raw", package = "onemap")
)
# }
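As a follow-up sketch, the components listed under Value can be inspected after reading the example file. The block only runs when the onemap package is installed; the variable has_onemap is introduced here for illustration, and the checks rely solely on the component descriptions given above.

```r
# Only attempt the example when onemap is available.
has_onemap <- requireNamespace("onemap", quietly = TRUE)

if (has_onemap) {
  example_file <- system.file("extdata/onemap_example_out.raw",
                              package = "onemap")
  outcr_data <- onemap::read_onemap(inputfile = example_file, verbose = FALSE)

  # geno has one row per individual and one column per marker.
  stopifnot(nrow(outcr_data$geno) == outcr_data$n.ind,
            ncol(outcr_data$geno) == outcr_data$n.mar)

  # Tabulate how many markers fall into each segregation type.
  print(table(outcr_data$segr.type))
}
```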
