A function to fix (impute) missing birth years in a pedigree.
pedFixBirthYear(x, interval, down = FALSE, na.rm = TRUE, sort = TRUE,
direct = TRUE, report = TRUE, colId = 1, colFid = 2,
colMid = 3, colBY = 4)
x: data.frame with (at least) the following columns: individual, father, and mother identification, and year of birth; see arguments colId, colFid, colMid, and colBY.
interval: Numeric, a value for generation interval in years.
down: Logical, the default is to impute birth years based on the birth year of children starting from the youngest to the oldest individuals, while with down=TRUE birth year is imputed based on the birth year of parents in the opposite order.
na.rm: Logical, remove NA values when searching for the minimal (maximal) year of birth in children (parents); setting this to FALSE can lead to decreased success of imputation.
sort: Logical, initially sort x using orderPed() so that children follow parents in order to make imputation as optimal as possible (imputation is performed within a loop from the first to the last unknown birth year); at the end the original order is restored.
direct: Logical, insert inferred birth years immediately so they can be used for successive individuals within the loop.
report: Logical, report success.
colId: Numeric or character, position or name of a column holding individual identification.
colFid: Numeric or character, position or name of a column holding father identification.
colMid: Numeric or character, position or name of a column holding mother identification.
colBY: Numeric or character, position or name of a column holding birth year.
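The roles of interval, down, and na.rm can be illustrated with a small base R sketch. It is not the package's implementation: the helper impute_once, the toy pedigree, and the exact rule (a parent is placed interval years before its oldest child, a child interval years after its youngest parent) are assumptions inferred from the argument descriptions above, and unknown parents are assumed to be coded as 0.

```r
# Hypothetical helper, not part of any package: a single imputation pass.
# Assumed rule: parent = min(children) - interval (down = FALSE),
#               child  = max(parents)  + interval (down = TRUE).
impute_once <- function(ped, interval, down = FALSE, na.rm = TRUE) {
  by <- ped$birth
  for (i in which(is.na(by))) {
    if (down) {
      # Birth years of the parents of individual i
      rel <- by[ped$id %in% c(ped$fid[i], ped$mid[i])]
    } else {
      # Birth years of the children of individual i
      rel <- by[ped$fid == ped$id[i] | ped$mid == ped$id[i]]
    }
    if (na.rm) rel <- rel[!is.na(rel)]
    if (length(rel) == 0) next  # no information to use
    # Written back at once, so later individuals in the loop can use it
    by[i] <- if (down) max(rel) + interval else min(rel) - interval
  }
  ped$birth <- by
  ped
}

# Toy pedigree: individuals 1 and 5 have unknown birth years
toy <- data.frame(id    = 1:5,
                  fid   = c(0, 0, 1, 1, 3),
                  mid   = c(0, 0, 2, 2, 0),
                  birth = c(NA, 2000, 2003, 2005, NA))

up   <- impute_once(toy, interval = 1)              # uses children
both <- impute_once(up,  interval = 1, down = TRUE) # then uses parents
up$birth    # 2002 2000 2003 2005   NA
both$birth  # 2002 2000 2003 2005 2004
```

Individual 5 has no children, so only the pass based on parents can fill it in, which is why the two directions complement each other.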
Object x with imputed birth years based on the birth year of children or parents. If report=TRUE, success is printed on the screen as the number of initially unknown, fixed, and still unknown birth years.
Warnings are issued when there is no information to use to impute birth years or when missing values (NA) are propagated.
Arguments down and na.rm allow for repeated use of this function, i.e., with down=FALSE and with down=TRUE (both in combination with na.rm=TRUE) in order to propagate information over the pedigree until "convergence".
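Repeated use until "convergence" could be wrapped in a loop along the following lines. This is only a sketch: fix_until_stable and max_rounds are hypothetical names, and convergence is assumed to mean that a full round in both directions leaves the number of unknown birth years unchanged. The imputation function is passed in as the argument fix, so the wrapper itself does not depend on any package.

```r
# Hypothetical wrapper: alternate both directions until a full round
# fills in no further birth years.
# fix   - imputation function, e.g. pedFixBirthYear
# colBY - position or name of the birth-year column
fix_until_stable <- function(x, fix, interval, colBY = 4, max_rounds = 20) {
  n_missing <- sum(is.na(x[[colBY]]))
  for (round in seq_len(max_rounds)) {
    if (n_missing == 0) break  # nothing left to impute
    x <- fix(x, interval = interval, down = FALSE, na.rm = TRUE, colBY = colBY)
    x <- fix(x, interval = interval, down = TRUE,  na.rm = TRUE, colBY = colBY)
    n_now <- sum(is.na(x[[colBY]]))
    if (n_now == n_missing) break  # no change in this round: "convergence"
    n_missing <- n_now
  }
  x
}

# Usage, assuming the package providing pedFixBirthYear is loaded:
# ped_final <- fix_until_stable(ped0, fix = pedFixBirthYear, interval = 1)
```

The max_rounds cap is a safeguard only; since each round calls the imputation function twice, the note below about speed on large pedigrees applies here as well.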
This function can be very slow on large pedigrees with extensive missingness of birth years.
orderPed in the pedigree package
# NOT RUN {
## Example pedigree with missing (unknown) birth year for some individuals
ped0 <- data.frame(
  id       = c( 1, 2, 3,  4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14),
  fid      = c( 0, 0, 0,  1, 1, 1, 3, 3, 3,  5,  4,  0,  0, 12),
  mid      = c( 0, 0, 0,  2, 0, 2, 2, 2, 5,  0,  0,  0,  0, 13),
  birth_dt = c(NA, 0, 1, NA, 3, 3, 3, 3, 4,  4,  5, NA,  6,  6) + 2000)
## First run - using information from children
ped1 <- pedFixBirthYear(x=ped0, interval=1)
## Second run - using information from parents
ped2 <- pedFixBirthYear(x=ped1, interval=1, down=TRUE)
## Third run - using information from children, but with no success
ped3 <- pedFixBirthYear(x=ped2, interval=1)
# }