Identify pairs of individuals likely to be related, but not assigned as such in the provided pedigree.
GetMaybeRel(
GenoM = NULL,
SeqList = NULL,
Pedigree = NULL,
LifeHistData = NULL,
AgePrior = NULL,
Module = "par",
Complex = "full",
Herm = "no",
Err = 1e-04,
ErrFlavour = "version2.9",
Tassign = 0.5,
Tfilter = -2,
MaxPairs = 7 * nrow(GenoM),
quiet = FALSE,
ParSib = NULL,
MaxMismatch = NA
)
A list with
A dataframe with non-assigned likely parent-offspring pairs, with columns:
ID1
ID2
TopRel: the most likely relationship, using abbreviations listed below
LLR: Log10-Likelihood Ratio between most likely and next most likely relationship
OH: Number of loci at which the two individuals are opposite homozygotes
BirthYear1: Birth year of ID1 (copied from LifeHistData)
BirthYear2
AgeDif: Age difference; BirthYear1 - BirthYear2
Sex1: Sex of ID1 (copied from LifeHistData)
Sex2
SnpdBoth: Number of loci at which the two individuals are both successfully genotyped
A dataframe with non-assigned likely pairs of relatives,
with columns identical to MaybePar
A dataframe with non-assigned parent-parent-offspring trios, with columns:
ID
parent1
parent2
TopRel: the most likely relationship, using abbreviations listed below
LLRparent1: Log10-Likelihood Ratio between parent1 being a parent of ID vs the next most likely relationship between the pair, ignoring parent2
LLRparent2: as LLRparent1
LLRpair: LLR for the parental pair, versus the next most likely configuration between the three individuals (with one or neither parent assigned)
OHparent1: Number of loci at which ID and parent1 are opposite homozygotes
OHparent2: as OHparent1
MEpair: Number of Mendelian errors between the offspring and the parent pair, includes OH as well as e.g. parents being opposing homozygotes, but the offspring not being a heterozygote. The offspring being OH with both parents is counted as 2 errors.
SNPd.id.parent1: Number of loci at which ID and parent1 are both successfully genotyped
SNPd.id.parent2: as SNPd.id.parent1
The following categories are used in column 'TopRel', indicating the most likely relationship category:
Parent-Offspring
Full Siblings
Half Siblings
GrandParent - grand-offspring
Full Avuncular (aunt/uncle)
2nd degree relatives, not enough information to distinguish between HS,GP and FA
Unclear, but probably 1st, 2nd or 3rd degree relatives
numeric matrix with genotype data: One row per individual,
one column per SNP, coded as 0, 1, 2, missing values as a negative number
or NA. You can reformat data with GenoConvert
, or use other
packages to get it into a genlight object and then use as.matrix
.
list with output from sequoia
.
SeqList$Pedigree
is used if present, and SeqList$PedigreePar
otherwise, and overrides the input parameter Pedigree
. If 'Specs' is
present, its elements override all input parameters with the same name. The
list elements `LifeHist', `AgePriors', and `ErrM' are also used if
present, and similarly override the corresponding input parameters.
dataframe with id - dam - sire in columns 1-3. May include
non-genotyped individuals, which will be treated as dummy individuals. When
provided, all likelihoods (and thus all maybe-relatives) are conditional on
this pedigree. Note: SeqList$Pedigree
or SeqList$PedigreePar
take
precedent (for this function only).
data.frame with up to 6 columns:
max. 30 characters long
1 = female, 2 = male, 3 = unknown, 4 = hermaphrodite, other numbers or NA = unknown
birth or hatching year, integer, with missing values as NA or any negative number.
minimum birth year, only used if BirthYear is missing
maximum birth year, only used if BirthYear is missing
Last year in which individual could have had offspring. Can e.g. in mammals be the year before death for females, and year after death for males.
"Birth year" may be in any arbitrary discrete time unit relevant to the species (day, month, decade), as long as parents are never born in the same time unit as their offspring, and only integers are used. Individuals do not need to be in the same order as in `GenoM', nor do all genotyped individuals need to be included.
Agepriors matrix, as generated by MakeAgePrior
and included in the sequoia
output. Affects which
relationships are considered possible (only those where \(P(A|R) / P(A)
> 0\)).
type of relatives to check for. One of
parent - offspring pairs
all first and second degree relatives
When 'par', all pairs are returned that are more likely parent-offspring than unrelated, potentially including pairs that are even more likely to be otherwise related.
Breeding system complexity. Either "full" (default), "simp" (simplified, no explicit consideration of inbred relationships), "mono" (monogamous).
Hermaphrodites, either "no", "A" (distinguish between dam and sire role, default if at least 1 individual with sex=4), or "B" (no distinction between dam and sire role). Both of the latter deal with selfing.
estimated genotyping error rate, as a single number, or a length 3
vector with P(hom|hom), P(het|hom), P(hom|het), or a 3x3 matrix. See
details below. The error rate is presumed constant across SNPs, and
missingness is presumed random with respect to actual genotype. Using
Err
>5% is not recommended, and Err
>10% strongly
discouraged.
function that takes Err
(single number) as input,
and returns a length 3 vector or 3x3 matrix, or choose from inbuilt options
'version2.9', 'version2.0', 'version1.3', or 'version1.1', referring to the
sequoia version in which they were the default. Ignored if Err
is a
vector or matrix. See ErrToM
for details.
minimum LLR required for acceptance of proposed relationship, relative to next most likely relationship. Higher values result in more conservative assignments. Must be zero or positive.
threshold log10-likelihood ratio (LLR) between a proposed relationship versus unrelated, to select candidate relatives. Typically a negative value, related to the fact that unconditional likelihoods are calculated during the filtering steps. More negative values may decrease non-assignment, but will increase computational time.
the maximum number of putative pairs to return.
logical, suppress messages.
DEPRECATED, use Module
either 'par' to check
for putative parent-offspring pairs only, or 'sib' to check for all types
of first and second degree relatives.
DEPRECATED AND IGNORED. Now calculated
automatically using CalcMaxMismatch
.
When Module="par"
, the age difference of the putative pair is
temporarily set to NA so that genetic parent-offspring pairs declared to be
born in the same year may be discovered. When Module="ped"
, only
relationships possible given the age difference, if known from the
LifeHistData, are considered.
sequoia
to identify likely pairs of duplicate
genotypes and for pedigree reconstruction; GetRelM
to
identify all pairs of relatives in a pedigree; CalcPairLL
for
the likelihoods underlying the LLR.
if (FALSE) {
# without conditioning on pedigree
MaybeRel_griffin <- GetMaybeRel(GenoM=Geno_griffin, Err=0.001, Module='par')
}
names(MaybeRel_griffin)
# conditioning on pedigree
MaybePO <- GetMaybeRel(GenoM = Geno_griffin, SeqList = SeqOUT_griffin,
Module = 'par')
head(MaybePO$MaybePar)
# instead of providing the entire SeqList, one may specify the relevant
# elements separately
Maybe <- GetMaybeRel(GenoM = Geno_griffin,
Pedigree = SeqOUT_griffin$PedigreePar,
LifeHistData = LH_griffin,
Err=0.0001, Complex = "full",
Module = "ped")
head(Maybe$MaybeRel)
# visualise results, turn dataframe into matrix first:
MaybeM <- GetRelM(Pairs = Maybe$MaybeRel)
PlotRelPairs(MaybeM)
# or combine with pedigree (note suffix '?')
RelM <- GetRelM(Pedigree =SeqOUT_griffin$PedigreePar, Pairs = Maybe$MaybeRel)
PlotRelPairs(RelM)
Run the code above in your browser using DataLab