Function to run QTL analysis using IBD probabilities given (possibly replicated) phenotypes, assuming a randomised experimental design
QTLscan(
IBD_list,
Phenotype.df,
genotype.ID,
trait.ID,
block = NULL,
cofactor_df = NULL,
allelic_interaction = FALSE,
folder = NULL,
filename.short,
prop_Pheno_rep = 0.5,
perm_test = FALSE,
N_perm.max = 1000,
alpha = 0.05,
gamma = 0.05,
ncores = 1,
log = NULL,
verbose = TRUE,
...
)
A nested list; each list element (per linkage group) contains the following items:
Single matrix of QTL results with columns chromosome, position, LOD, adj.r.squared and PVE (percentage variance explained).
If perm_test = FALSE, Perm.res will be NULL. Otherwise, Perm.res contains a list of the results of the permutation test, with list items "quantile", "threshold" and "scores". Quantile refers to which quantile of scores was used to determine the threshold. Note that scores are the maximal LOD scores across the entire genome scan, one per permutation, so the threshold is genome-wide rather than chromosome-specific. If the latter is preferred, restricting the IBD_list to a single chromosome and re-running the permutation test will provide the desired threshold.
If a blocking factor or co-factors are used, this is the (named) vector of residuals used as input for the QTL scan. Otherwise, this is the set of (raw) phenotypes used in the QTL scan.
Original map of genetic marker positions upon which the IBDs were based, most often used for adding rug of marker positions to QTL plots.
Names of the linkage groups
Whether argument allelic_interaction was TRUE or FALSE in the QTL scan
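The genome-wide thresholding described for the permutation test can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the LOD values are simulated stand-ins rather than QTLscan output, and the names lod, chr, scores, threshold_gw and threshold_chr1 are hypothetical. It shows why taking the maximum over the whole genome per permutation gives a threshold at least as high as one computed from a single chromosome.

```r
set.seed(1)

n_perm <- 1000  # number of permutations (hypothetical)
n_chr  <- 5     # number of linkage groups (hypothetical)
n_pos  <- 50    # scan positions per linkage group (hypothetical)
alpha  <- 0.05  # significance level, i.e. the 0.95 quantile

# Simulated LOD profiles: one row per permutation, one column per position.
# These are stand-in values, not real permutation results.
lod <- matrix(rchisq(n_perm * n_chr * n_pos, df = 1) / (2 * log(10)),
              nrow = n_perm)

# Linkage group label for each column of 'lod'
chr <- rep(seq_len(n_chr), each = n_pos)

# Genome-wide: one maximal LOD per permutation, taken over all positions
scores <- apply(lod, 1, max)
threshold_gw <- quantile(scores, probs = 1 - alpha, names = FALSE)

# Chromosome-specific: the same calculation restricted to linkage group 1
scores_chr1 <- apply(lod[, chr == 1, drop = FALSE], 1, max)
threshold_chr1 <- quantile(scores_chr1, probs = 1 - alpha, names = FALSE)
```

Because each genome-wide maximum is at least as large as the corresponding single-chromosome maximum, threshold_gw can never fall below threshold_chr1.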
IBD_list: List of IBD probabilities
Phenotype.df: A data.frame containing phenotypic values
genotype.ID: The colname of Phenotype.df that contains the offspring identifiers (F1 names)
trait.ID: The colname of Phenotype.df that contains the response variable to use in the model
block: The blocking factor to be used, if any (must be a colname of Phenotype.df). By default NULL, in which case no blocking structure is assumed (for unreplicated experiments)
cofactor_df: A 3-column data frame of co-factor(s); column 1 gives the numeric linkage group identifier(s), column 2 specifies the cM position of the co-factor(s), column 3 specifies whether the QTL was fitted using "a" = additive effects or "f" = full allelic interactions (note that any other symbol for the full model will also be accepted, as long as it is not "a"). For backward compatibility with package versions <= 0.0.9, it is possible to just supply the first two columns, in which case an additive-effects model is assumed for each cofactor (so, a third column will be automatically filled with "a"). By default cofactor_df = NULL, in which case no co-factors are included in the analysis.
allelic_interaction: The QTL detection model can be for additive main effects only (by default allelic_interaction = FALSE). If TRUE, then the full model is used (i.e. all possible genotype combinations are included as predictors in the model). This runs the risk of overfitting, especially if double reduction was also allowed. Both types of analyses can ideally be performed and compared. Note that if IBD probabilities were estimated using the "heuristic" method rather than the HMM method (see estimate_IBD), then IBDs are actually haplotype probabilities rather than genotype probabilities, meaning that allelic interaction effects cannot be included in the model.
folder: If markers are to be used as co-factors, the path to the folder that contains the imported IBD probabilities can be provided here. By default this is NULL, which assumes the files are in the working directory.
filename.short: If TetraOrigin was used and co-factors are being included, the shortened stem of the filename of the .csv files containing the output of TetraOrigin, i.e. without the tail "_LinkageGroupX_Summary.csv" which is added by default to all output of TetraOrigin.
prop_Pheno_rep: The minimum proportion of phenotypes represented across blocks. If less than this, the individual is removed from the analysis. If there is incomplete data, the missing phenotypes are imputed using the mean values across the recorded observations.
perm_test: Logical, by default FALSE. If TRUE, a permutation test will be performed to determine a genome-wide significance threshold.
N_perm.max: The maximum number of permutations to run if perm_test is TRUE; by default this is 1000.
alpha: The P-value to be used in the selection of a threshold if perm_test is TRUE, by default 0.05 (i.e. the 0.95 quantile).
gamma: The width of the confidence intervals used around the permutation test threshold using the approach of Nettleton & Doerge (2000), by default 0.05.
ncores: Number of cores to use if parallel computing is required. Works on both Windows and UNIX (using doParallel). Use parallel::detectCores() to find out how many cores you have available.
log: Character string specifying the log filename to which standard output should be written. If NULL, the log is sent to stdout.
verbose: Logical, by default TRUE. Should messages be printed during running?
...: Arguments passed to plot
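Two of the inputs described above, the co-factor table and the phenotype filtering controlled by prop_Pheno_rep, can be sketched in base R. This is a hedged illustration and not the package's own code: all object and column names (cofactors, cofactors_2col, pheno, prop_obs, keep, pheno_kept, ind_mean) are hypothetical, the data are made up, and the imputation step assumes that the mean is taken per individual over its recorded observations.

```r
# --- Co-factor table -------------------------------------------------------
# Columns are described by position: linkage group, cM position, model type.
# The column names used here are illustrative only.
cofactors <- data.frame(LG = c(1, 3),
                        cM = c(12.5, 40),
                        model = c("a", "f"),   # "a" = additive, "f" = full
                        stringsAsFactors = FALSE)

# Two-column form: an additive model is assumed for every co-factor,
# which this line mimics by filling a third column with "a".
cofactors_2col <- data.frame(LG = c(1, 3), cM = c(12.5, 40))
if (ncol(cofactors_2col) == 2) cofactors_2col$model <- "a"

# --- Phenotype filtering and imputation ------------------------------------
# Made-up replicated phenotypes: 3 individuals scored in 3 blocks (years)
pheno <- data.frame(
  geno  = rep(c("F1_01", "F1_02", "F1_03"), each = 3),
  year  = rep(c("2019", "2020", "2021"), times = 3),
  pheno = c(10, 12, NA,  8, NA, NA,  15, 14, 16),
  stringsAsFactors = FALSE
)
prop_Pheno_rep <- 0.5

# Proportion of blocks with a recorded phenotype, per individual
prop_obs <- tapply(!is.na(pheno$pheno), pheno$geno, mean)

# Individuals below the minimum proportion are dropped
keep <- names(prop_obs)[prop_obs >= prop_Pheno_rep]
pheno_kept <- pheno[pheno$geno %in% keep, ]

# Remaining gaps are filled with the individual's own mean (assumption)
ind_mean <- tapply(pheno_kept$pheno, pheno_kept$geno, mean, na.rm = TRUE)
miss <- is.na(pheno_kept$pheno)
pheno_kept$pheno[miss] <- ind_mean[as.character(pheno_kept$geno[miss])]
```

In this toy data set, F1_02 has only one of three phenotypes recorded and is dropped, while the single missing value of F1_01 is replaced by the mean of its two recorded values.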
data("IBD_4x","Phenotypes_4x")
qtl_LODs.4x <- QTLscan(IBD_list = IBD_4x,
                       Phenotype.df = Phenotypes_4x,
                       genotype.ID = "geno",
                       trait.ID = "pheno",
                       block = "year")