rehh (version 3.2.2)

ihh2ihs: Compute iHS

Description

Compute iHS (standardized ratio of iHH values of two alleles).

Usage

ihh2ihs(
  scan,
  freqbin = 0.025,
  min_maf = 0.05,
  min_nhaplo = NA,
  standardize = TRUE,
  include_freq = FALSE,
  right = FALSE,
  alpha = 0.05,
  p.side = NA,
  p.adjust.method = "none",
  verbose = TRUE
)

Arguments

scan

a data frame with chromosome name, marker position, frequency of ancestral (resp. major) allele, frequency of derived (resp. minor) allele, and iHH for both alleles, as obtained from function scan_hh.

freqbin

size of the bins used to standardize log(iHH_A/iHH_D). Markers are binned with respect to the derived allele frequency at the focal marker. The bins are built from min_maf to 1-min_maf in steps of size freqbin. If set to 0, standardization is performed considering each observed frequency as a discrete frequency class (useful in case of a large number of markers and few different haplotypes). If set to an integer of 1 or greater, a corresponding number of equally sized bins is created.
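The binning rule described for freqbin can be sketched in base R. This is a minimal illustration, assuming the bins are built with seq() and cut(); the marker frequencies are made up, and the package's internal implementation may differ in detail.

```r
# Hypothetical derived allele frequencies at five focal markers
freq_d  <- c(0.06, 0.11, 0.112, 0.51, 0.93)
min_maf <- 0.05
freqbin <- 0.025

# Bin boundaries from min_maf to 1 - min_maf in steps of freqbin
breaks <- seq(min_maf, 1 - min_maf, by = freqbin)

# Assign each marker to a bin; right = FALSE closes intervals on the left
bins <- cut(freq_d, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Number of markers per occupied bin
table(droplevels(bins))

# Analogue of freqbin = 0: every observed frequency is its own class
classes <- factor(freq_d)
```

With these defaults the range from 0.05 to 0.95 is split into 36 bins of width 0.025, and the second and third markers share a bin.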

min_maf

focal markers with a MAF (Minor Allele Frequency) lower than or equal to min_maf are discarded from the analysis (default 0.05).

min_nhaplo

focal markers with at least one of the two compared alleles carried by fewer than min_nhaplo haplotypes are discarded (default NA).

standardize

logical. If TRUE (default), then standardize iHS, else report unstandardized iHS.

include_freq

logical. If TRUE, include columns with allele frequencies in the result.

right

logical. If TRUE the bin intervals are closed on the right (and open on the left).

alpha

calculate quantiles alpha/2 and (1-alpha/2) for unstandardized binned iHS.

p.side

side to which the p-value refers. Default NA, meaning two-sided. Can be set to "left" or "right".

p.adjust.method

method passed to function p.adjust to correct the p-value for multiple testing. Default "none".

verbose

logical. If TRUE (default), report number of markers of the source data frame and result data frame.

Value

The returned value is a list containing two elements:

ihs

a data frame with markers in rows and columns for chromosome name, marker position, iHS and, if standardized, the p-value on a negative log10 scale. Optionally, allele frequencies are included.

frequency.class

a data frame with bins in rows and columns for the number of markers and the mean, standard deviation, lower quantile and upper quantile of uniHS (unstandardized iHS).

Details

Computes the log ratio of the iHH of the two focal alleles as described in Voight et al. (2006). The standardization is performed within each bin separately because the expected iHS values under neutrality depend on allele frequency. An implicit assumption of this approach is that each bin is dominated by neutral markers.
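The within-bin standardization can be sketched in base R on simulated values. This is only an illustration under stated assumptions: the data are random stand-ins for scan_hh output, the variable names are hypothetical, and the bins use the default freqbin and min_maf values. It is not the package's own code.

```r
set.seed(1)
n <- 2000

# Simulated stand-ins for scan_hh output (not real data)
freq_d <- runif(n, 0.05, 0.95)          # derived allele frequency
ihh_a  <- rexp(n, rate = 1 / 1000) + 1  # iHH of the ancestral allele
ihh_d  <- rexp(n, rate = 1 / 1000) + 1  # iHH of the derived allele

# Unstandardized iHS: log ratio of the two iHH values
unihs <- log(ihh_a / ihh_d)

# Frequency bins of width 0.025 between 0.05 and 0.95
breaks <- seq(0.05, 0.95, by = 0.025)
bin <- cut(freq_d, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Standardize within each bin: subtract the bin mean, divide by the bin sd
ihs <- (unihs - ave(unihs, bin, FUN = mean)) / ave(unihs, bin, FUN = sd)

# Per-bin summary, similar in spirit to the frequency.class element
alpha <- 0.05
summary_by_bin <- data.frame(
  n_mrk      = as.vector(table(bin)),
  mean_unihs = as.vector(tapply(unihs, bin, mean)),
  sd_unihs   = as.vector(tapply(unihs, bin, sd)),
  lower_qt   = as.vector(tapply(unihs, bin, quantile, probs = alpha / 2)),
  upper_qt   = as.vector(tapply(unihs, bin, quantile, probs = 1 - alpha / 2))
)
```

After this step every bin has mean 0 and standard deviation 1 by construction, which is why a bin dominated by selected rather than neutral markers would distort the resulting scores.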

If markers evolve predominantly neutrally, the standardized iHS values approximately follow a standard Gaussian distribution. It is therefore practical to assign each value a p-value relative to the null hypothesis of neutral evolution. The parameter p.side determines whether the p-value is assigned to both sides of the distribution or to one side of interest.
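How such p-values can be derived from a standard Gaussian is sketched below in base R. The iHS values are made up, and the mapping of "left" and "right" to the lower and upper tail is an assumption for illustration, not a statement of the package's exact formula. The last step shows a correction with p.adjust, the function named under p.adjust.method.

```r
# Hypothetical standardized iHS values
ihs <- c(-3.2, -0.5, 0, 1.96, 4.1)

# Two-sided p-value: probability of a value at least as extreme in either tail
p_two <- 2 * pnorm(-abs(ihs))

# One-sided p-values (assumed mapping: "left" = lower tail, "right" = upper tail)
p_left  <- pnorm(ihs)
p_right <- pnorm(ihs, lower.tail = FALSE)

# Negative log10 scale, as used for the reported p-value
logp_two <- -log10(p_two)

# Optional multiple-testing correction, here Benjamini-Hochberg
p_adj <- p.adjust(p_two, method = "BH")
```

With the log(iHH_A/iHH_D) convention, strongly negative values correspond to a larger iHH on the derived allele, so a one-sided test is a way to focus on one allele class.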

References

Gautier, M. and Naves, M. (2011). Footprints of selection in the ancestral admixture of a New World Creole cattle breed. Molecular Ecology, 20, 3128-3143.

Voight, B.F., Kudaravalli, S., Wen, X. and Pritchard, J.K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4, e72.

See Also

scan_hh, distribplot, freqbinplot, manhattanplot

Examples

# NOT RUN {
library(rehh)
library(rehh.data)
data(wgscan.cgu)
# results from a genome scan (44,057 SNPs)
# see ?wgscan.eut and ?wgscan.cgu for details
wgscan.cgu.ihs <- ihh2ihs(wgscan.cgu)
# }