
adegenet (version 1.4-2)

df2genind: Convert a data.frame of genotypes to a genind object, and conversely.

Description

The function df2genind converts a data.frame (or a matrix) into a genind object. The data.frame must meet the following requirements:
- genotypes are in rows (one row per genotype)
- markers are in columns
- each element is a character string coding alleles, with or without a separator

If no separator is used, the function tries to determine how many characters code each genotype at a locus, but it is safer to state this explicitly (ncode argument). Incomplete strings are padded with "0" at the beginning.

The function genind2df converts a genind object back to such a data.frame. Alleles of a given locus can be coded as a single character string (with a specified separator), or provided in separate columns (see the oneColPerAll argument).
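
To make the separator-free coding more concrete, the base R sketch below shows one possible reading of the padding rule: a genotype string shorter than ncode is completed with leading zeros and then cut into equal-width alleles. The helper split_genotype is hypothetical and is not part of adegenet; it only illustrates the idea and makes no claim about how df2genind is implemented internally.

```r
# Hypothetical helper (not an adegenet function): pad a genotype string with
# leading "0" up to `ncode` characters, then cut it into `ploidy` alleles.
split_genotype <- function(geno, ncode, ploidy = 2) {
  # ncode is the number of characters coding the whole genotype at one locus
  stopifnot(ncode %% ploidy == 0, nchar(geno) <= ncode)
  allele_width <- ncode / ploidy
  # complete the string with zeros at the beginning
  padded <- paste0(strrep("0", ncode - nchar(geno)), geno)
  # start position of each allele within the padded string
  starts <- seq(1, ncode, by = allele_width)
  substring(padded, starts, starts + allele_width - 1)
}

split_genotype("12", ncode = 2)    # "1" "2"
split_genotype("1203", ncode = 4)  # "12" "03"
split_genotype("503", ncode = 4)   # "05" "03" (padded to "0503" first)
```

The last call shows why stating ncode is safer: without it, "503" could not be split unambiguously into two alleles.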

Usage

df2genind(X, sep=NULL, ncode=NULL, ind.names=NULL, loc.names=NULL,
 pop=NULL, missing=NA, ploidy=2, type=c("codom","PA"))
genind2df(x,pop=NULL, sep="", usepop=TRUE, oneColPerAll=FALSE)

Arguments

X
a matrix or a data.frame (see Description)
sep
a character string separating alleles. See details.
ncode
an optional integer giving the number of characters used for coding one genotype at one locus. If not provided, this is determined from data.
ind.names
an optional character vector giving the individual names; if NULL, taken from rownames of X.
loc.names
an optional character vector giving the marker names; if NULL, taken from colnames of X.
pop
an optional factor giving the population of each individual.
missing
can be NA, 0 or "mean". See details section.
ploidy
an integer indicating the degree of ploidy of the genotypes.
type
a character string indicating the type of marker: 'codom' stands for 'codominant' (e.g. microsatellites, allozymes); 'PA' stands for 'presence/absence' markers (e.g. AFLP, RAPD).
x
a genind object
usepop
a logical stating whether the population (argument pop or x@pop) should be used (TRUE, default) or not (FALSE).
oneColPerAll
a logical stating whether alleles of one locus should be provided on separate columns (TRUE) rather than as a single character string (FALSE, default).
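
As a rough illustration of the three values accepted by the missing argument, the base R sketch below applies each treatment to a toy table of allele frequencies. The function replace_missing and the toy data are hypothetical, written only to show the arithmetic; they are not taken from adegenet and may differ from what df2genind does internally.

```r
# Toy allele-frequency table for one locus (rows: individuals, columns: alleles);
# individual "ind2" is untyped at this locus.
freq <- matrix(c(0.5, 0.5,
                 NA,  NA,
                 1.0, 0.0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("ind1", "ind2", "ind3"), c("L1.a", "L1.b")))

# Hypothetical helper mimicking the three documented treatments
replace_missing <- function(tab, missing = NA) {
  if (identical(missing, "mean")) {
    # mean frequency of each allele over all typed individuals
    col_means <- colMeans(tab, na.rm = TRUE)
    idx <- which(is.na(tab), arr.ind = TRUE)
    tab[idx] <- col_means[idx[, "col"]]
  } else if (is.numeric(missing) && missing == 0) {
    # all alleles of the untyped locus get frequency 0
    tab[is.na(tab)] <- 0
  }
  # missing = NA: the table is returned unchanged
  tab
}

replace_missing(freq, NA)      # ind2 stays NA
replace_missing(freq, 0)       # ind2 becomes 0, 0
replace_missing(freq, "mean")  # ind2 becomes 0.75, 0.25
```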

Value

  • an object of the class genind for df2genind; a matrix of biallelic genotypes for genind2df

encoding

UTF-8

Details

=== There are 3 treatments for missing values ===
- NA: kept as NA.
- 0: allelic frequencies are set to 0 for all alleles of the locus concerned. Recommended for a PCA on compositional data.
- "mean": missing values are replaced by the mean frequency of the corresponding allele, computed over the whole set of individuals. Recommended for a centred PCA.

=== Details for the sep argument ===
This character is used directly in regular expressions (e.g. in calls to gsub), so some characters must be preceded by double backslashes. For instance, "/" works as is, but "|" must be coded as "\\|".
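
The escaping rule for separators can be checked with base R alone, independently of adegenet. The sketch below uses made-up genotype strings and shows that an escaped "|" behaves as a literal separator, whereas an unescaped one is interpreted as the regular-expression alternation operator.

```r
# Made-up genotype strings using "|" and "/" as allele separators
genotypes <- c("1|2", "3|3")

# Escaped: the bar is treated as a literal character
gsub("\\|", "/", genotypes)   # "1/2" "3/3"
strsplit(genotypes, "\\|")    # list(c("1", "2"), c("3", "3"))

# Unescaped: "|" means "empty pattern OR empty pattern", so the strings are
# cut between every character rather than at the bar
strsplit(genotypes, "|")

# "/" has no special meaning in a regular expression and needs no escaping
strsplit("1/2", "/")          # list(c("1", "2"))
```

The same strings passed as sep would therefore be "\\|" and "/" respectively.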

See Also

import2genind, read.genetix, read.fstat, read.structure

Examples

## simple example
library(adegenet)
df <- data.frame(locusA=c("11","11","12","32"),
                 locusB=c(NA,"34","55","15"),
                 locusC=c("22","22","21","22"))
row.names(df) <- .genlab("genotype",4)
df

obj <- df2genind(df, ploidy=2)
obj
truenames(obj)

## converting a genind back to a data.frame
genind2df(obj)
genind2df(obj, sep="/")
genind2df(obj, oneColPerAll=TRUE)
