Determine candidate regions of selection.
calc_candidate_regions(
scan,
threshold = NA,
pval = FALSE,
ignore_sign = FALSE,
window_size = 1e+06,
overlap = 0,
right = TRUE,
min_n_mrk = 1,
min_n_extr_mrk = 1,
min_perc_extr_mrk = 0,
join_neighbors = TRUE
)
boundary score above which markers are defined as "extreme".
logical. If TRUE, use the (negative log-) p-value instead of the score.
logical. If TRUE, take absolute values of score (default FALSE).
size of sliding windows. If set to 1, no windows are constructed and only the individual extremal markers are reported.
size of window overlap (default 0, i.e. no overlap).
logical, indicating if the windows should be closed on the right (and open on the left) or vice versa.
minimum number of markers per window.
minimum number of markers with extreme value in a window.
minimum percentage of extremal markers among all markers.
logical. If TRUE (default), merge neighboring windows with extreme values.
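The effect of closing windows on the right or on the left can be illustrated with base R's cut(). This is only a sketch of the interval convention; it is an assumption that the function's internal binning behaves the same way, and the names breaks, pos, right_closed and left_closed are hypothetical.

```r
# Window boundaries for two windows of size 1e6 (hypothetical toy values).
breaks <- c(0, 1e6, 2e6)

# Three marker positions; the second one sits exactly on a window boundary.
pos <- c(5e5, 1e6, 1.5e6)

# right = TRUE: windows are (0, 1e6] and (1e6, 2e6]
right_closed <- as.integer(cut(pos, breaks = breaks, right = TRUE))

# right = FALSE: windows are [0, 1e6) and [1e6, 2e6)
left_closed <- as.integer(cut(pos, breaks = breaks, right = FALSE))

print(right_closed)  # boundary marker falls into window 1
print(left_closed)   # boundary marker falls into window 2
```

Only markers lying exactly on a window boundary are affected by this choice.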
A data frame with chromosomal regions, i.e. windows that fulfill the necessary conditions to qualify as candidate regions under selection. For each region the overall number of markers, their mean and maximum, the number of markers with extremal values, their percentage of all markers and their average are reported.
There is no generally agreed method for determining genomic regions that might have been under recent selection. Since selection tends to yield clusters of markers with outlier values, a common approach is to search for regions with an elevated number or fraction of outlier or extremal markers. This function allows setting three conditions that a window must fulfill in order to qualify as a candidate region:
min_n_mrk
a minimum number of (any) markers.
min_n_extr_mrk
a minimum number of markers with outlier / extreme value.
min_perc_extr_mrk
a minimum percentage of extremal markers among all markers.
"Extreme" markers are defined by having a score above the specified threshold.
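The three conditions can be sketched in base R on a toy data set. This is not the package's implementation: the data frame toy_scan, its column names and all helper variables are hypothetical, the windows are non-overlapping and closed on the right, and the comparisons against the three minima are assumed to be inclusive.

```r
# Toy scan: one chromosome, positions in bp, one score per marker (made-up values).
toy_scan <- data.frame(
  CHR = "1",
  POSITION = c(100000, 400000, 900000, 1200000, 1500000, 1900000, 2500000),
  SCORE = c(0.5, 3.1, 2.8, 0.2, 1.0, 4.2, 0.3)
)

threshold <- 2            # scores above this value count as "extreme"
window_size <- 1e6
min_n_mrk <- 2
min_n_extr_mrk <- 1
min_perc_extr_mrk <- 40

# Assign each marker to a right-closed window (0, 1e6], (1e6, 2e6], ...
window_id <- ceiling(toy_scan$POSITION / window_size)
is_extreme <- toy_scan$SCORE > threshold

# Per-window counts: all markers, extreme markers, and their percentage.
n_mrk <- tapply(is_extreme, window_id, length)
n_extr_mrk <- tapply(is_extreme, window_id, sum)
perc_extr_mrk <- 100 * n_extr_mrk / n_mrk

windows <- data.frame(
  START = (as.numeric(names(n_mrk)) - 1) * window_size,
  END = as.numeric(names(n_mrk)) * window_size,
  N_MRK = as.vector(n_mrk),
  N_EXTR_MRK = as.vector(n_extr_mrk),
  PERC_EXTR_MRK = as.vector(perc_extr_mrk)
)

# Keep only windows that satisfy all three conditions.
candidates <- windows[
  windows$N_MRK >= min_n_mrk &
    windows$N_EXTR_MRK >= min_n_extr_mrk &
    windows$PERC_EXTR_MRK >= min_perc_extr_mrk,
]
print(candidates)
```

In this toy example only the first window qualifies: the second has one extreme marker out of three (about 33 percent, below the 40 percent minimum), and the third contains a single, non-extreme marker.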