Compute iHS (standardized ratio of iHH values of two alleles).
ihh2ihs(
  scan,
  freqbin = 0.025,
  min_maf = 0.05,
  min_nhaplo = NA,
  standardize = TRUE,
  include_freq = FALSE,
  right = FALSE,
  alpha = 0.05,
  p.side = NA,
  p.adjust.method = "none",
  verbose = TRUE
)
scan: a data frame with chromosome name, marker position, frequency of the ancestral (resp. major) allele, frequency of the derived (resp. minor) allele, and iHH for both alleles, as obtained from function scan_hh.
freqbin: size of the bins used to standardize log(iHH_A/iHH_D). Markers are binned with respect to the derived allele frequency at the focal marker. The bins are built from min_maf to 1-min_maf in steps of size freqbin. If set to 0, standardization is performed considering each observed frequency as a discrete frequency class (useful in case of a large number of markers and few different haplotypes). If set to an integer of 1 or greater, a corresponding number of equally sized bins is created.
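As a rough illustration of the three cases of freqbin described above, the following base R sketch builds bin breaks and assigns markers to bins. It does not call rehh: the helper make_freq_breaks and the example frequencies are hypothetical, and the package's internal implementation may differ (in particular, the range used for an integer freqbin is assumed here to be min_maf to 1-min_maf).

```r
# Hypothetical helper (not part of rehh): build frequency-bin breaks
# for the three documented cases of 'freqbin'.
make_freq_breaks <- function(freqbin = 0.025, min_maf = 0.05) {
  if (freqbin >= 1) {
    # integer >= 1: that many equally sized bins (range is an assumption)
    seq(min_maf, 1 - min_maf, length.out = freqbin + 1)
  } else if (freqbin > 0) {
    # step size: bins of width 'freqbin'
    seq(min_maf, 1 - min_maf, by = freqbin)
  } else {
    # 0: no breaks; each observed frequency forms its own class
    NULL
  }
}

breaks_default <- make_freq_breaks()             # bins of width 0.025
breaks_four    <- make_freq_breaks(freqbin = 4)  # 4 equally sized bins

# Made-up derived allele frequencies of five focal markers
freq_d <- c(0.10, 0.30, 0.52, 0.70, 0.90)
bin <- cut(freq_d, breaks = breaks_four, right = FALSE, include.lowest = TRUE)
table(bin)
```

With the default settings this yields 36 bins between 0.05 and 0.95; markers in the same bin would then be standardized together.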
min_maf: focal markers with a MAF (Minor Allele Frequency) lower than or equal to min_maf are discarded from the analysis (default 0.05).
min_nhaplo: focal markers with at least one of the two compared alleles carried by fewer than min_nhaplo haplotypes are discarded (default NA).
standardize: logical. If TRUE (default), then standardize iHS, else report unstandardized iHS.
include_freq: logical. If TRUE, include columns with allele frequencies in the result.
right: logical. If TRUE, the bin intervals are closed on the right (and open on the left).
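The effect of the interval closure is easiest to see for a marker whose frequency falls exactly on a bin boundary. The sketch below uses base R's cut(), which has an argument of the same name; that rehh handles bin boundaries in the same way is an assumption, and the breaks and frequency are made up.

```r
# Two bins with an exactly representable boundary at 0.5
breaks <- c(0.25, 0.5, 0.75)
freq_d <- 0.5   # a marker sitting exactly on the boundary

# right = FALSE: intervals are [a, b), so 0.5 falls into the upper bin
left_closed <- cut(freq_d, breaks, right = FALSE)
# right = TRUE: intervals are (a, b], so 0.5 falls into the lower bin
right_closed <- cut(freq_d, breaks, right = TRUE)

as.character(left_closed)   # "[0.5,0.75)"
as.character(right_closed)  # "(0.25,0.5]"
```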
alpha: calculate the quantiles alpha/2 and (1-alpha/2) of the unstandardized iHS within each bin.
p.side: side to which the p-value refers. Default NA, meaning two-sided. Can be set to "left" or "right".
p.adjust.method: method passed to function p.adjust to correct the p-value for multiple testing. Default "none".
verbose: logical. If TRUE (default), report the number of markers of the source data frame and of the result data frame.
The returned value is a list containing two elements:
a data frame with markers in rows and columns for chromosome name, marker position, iHS and, if standardized, p-value on a negative log10 scale. Optionally, allele frequencies are included.
a data frame with bins in rows and columns for the number of markers and for the mean, standard deviation, lower quantile and upper quantile of the unstandardized iHS (uniHS).
Computes the log ratio of iHH of two focal alleles as described in Voight et al. (2006). The standardization is performed within each bin separately because of the frequency-dependence of expected iHS values under neutrality. An implicit assumption of this approach is that each bin is dominated by neutral markers.
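The per-bin standardization can be sketched in base R as follows. This is a minimal illustration on simulated values, not the package's code: the column names (freq_d, ihh_a, ihh_d, unihs) and the choice of four bins are assumptions made for brevity.

```r
# Made-up scan-like input; column names are illustrative only
set.seed(1)
n <- 200
toy <- data.frame(
  freq_d = runif(n, 0.05, 0.95),   # derived allele frequency
  ihh_a  = rexp(n) + 0.1,          # iHH of the ancestral allele
  ihh_d  = rexp(n) + 0.1           # iHH of the derived allele
)

# Unstandardized iHS: log ratio of the two iHH values
toy$unihs <- log(toy$ihh_a / toy$ihh_d)

# Assign each marker to a frequency bin (4 bins here for brevity)
breaks <- seq(0.05, 0.95, length.out = 5)
toy$bin <- cut(toy$freq_d, breaks, right = FALSE, include.lowest = TRUE)

# Standardize within each bin: subtract the bin mean, divide by the bin sd
toy$ihs <- ave(toy$unihs, toy$bin, FUN = function(x) (x - mean(x)) / sd(x))

# Per-bin summary of the unstandardized values, including the
# alpha/2 and 1 - alpha/2 quantiles
alpha <- 0.05
bin_summary <- do.call(rbind, lapply(split(toy$unihs, toy$bin), function(x) {
  data.frame(n_mrk = length(x), mean_unihs = mean(x), sd_unihs = sd(x),
             lower_q = quantile(x, alpha / 2, names = FALSE),
             upper_q = quantile(x, 1 - alpha / 2, names = FALSE))
}))
bin_summary
```

After this step the standardized values have mean 0 and standard deviation 1 within every bin, which is what makes markers of different allele frequencies comparable.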
Since the standardized iHS values approximately follow a standard Gaussian distribution if markers evolve predominantly neutrally, it is practical to assign each value a p-value relative to the null hypothesis of neutral evolution. The parameter p.side determines whether the p-value is assigned to both sides of the distribution or to one side of interest.
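Under the Gaussian approximation described above, p-values on a negative log10 scale can be derived from the normal distribution function. The helper below is a hypothetical sketch (ihs_logpvalue is not a rehh function, and the exact expressions used by the package may differ); it only mirrors the meaning of p.side given in the argument list.

```r
# Hypothetical helper (not part of rehh): -log10 p-value of a standardized score
ihs_logpvalue <- function(ihs, p.side = NA) {
  p <- if (is.na(p.side)) {
    2 * pnorm(-abs(ihs))             # two-sided: both tails
  } else if (p.side == "left") {
    pnorm(ihs)                       # lower tail only
  } else if (p.side == "right") {
    pnorm(ihs, lower.tail = FALSE)   # upper tail only
  } else {
    stop("p.side must be NA, \"left\" or \"right\"")
  }
  -log10(p)
}

ihs <- c(-3, 0, 1.96)
lp_two  <- ihs_logpvalue(ihs)                    # two-sided
lp_left <- ihs_logpvalue(ihs, p.side = "left")   # strongly negative scores stand out

# Multiple-testing correction on the p-value scale, here Benjamini-Hochberg
p_adj <- p.adjust(10^(-lp_two), method = "BH")
```

A one-sided p-value is useful when only one direction is of interest, for example when only unusually long haplotypes around the derived allele are sought.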
Gautier, M. and Naves, M. (2011). Footprints of selection in the ancestral admixture of a New World Creole cattle breed. Molecular Ecology, 20, 3128-3143.
Voight, B.F., Kudaravalli, S., Wen, X. and Pritchard, J.K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4, e72.
library(rehh)
library(rehh.data)
data(wgscan.cgu)
# results from a genome scan (44,057 SNPs);
# see ?wgscan.eut and ?wgscan.cgu for details
wgscan.cgu.ihs <- ihh2ihs(wgscan.cgu)