Seqhap implements sequential haplotype scan methods to perform association analyses for case-control data. When each locus is evaluated, neighboring loci that contribute additional information to haplotype associations with disease status are added sequentially. This conditional evaluation is based on the Mantel-Haenszel (MH) test. Two sequential methods are provided, a sequential haplotype method and a sequential summary method, along with results from the traditional single-locus method. Currently, seqhap only works with biallelic loci (single nucleotide polymorphisms, or SNPs) and binary traits.
seqhap(y, geno, pos, locus.label=NA, weight=NULL,
mh.threshold=3.84, r2.threshold=0.95, haplo.freq.min=0.005,
miss.val=c(0, NA), sim.control=score.sim.control(),
control=haplo.em.control())
# S3 method for seqhap
print(x, digits=max(options()$digits-2, 5), ...)
list with components:
indicator of convergence of the EM algorithm (see haplo.em); 1 = converged, 0 = failed
vector of labels for loci
chromosome positions for loci, same as input.
number of permutations performed for empirical p-values
matrix that shows which loci are combined for association analysis in the sequential scan. The non-zero values of the kth row of inlist are the indices of the loci combined when scanning locus k.
chi-square statistics of single-locus analysis.
permuted pointwise p-values of single-locus analysis.
permuted regional p-value of single-locus analysis.
chi-square statistics of sequential haplotype analysis.
degrees of freedom of sequential haplotype analysis.
permuted pointwise p-values of sequential haplotype analysis.
permuted regional p-value of sequential haplotype analysis.
chi-square statistics of sequential summary analysis.
degrees of freedom of sequential summary analysis.
permuted pointwise p-values of sequential summary analysis.
permuted regional p-value of sequential summary analysis.
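The description of inlist above can be read as follows. This is a minimal sketch in base R: the matrix values are invented for illustration, and `loci_for` is a hypothetical helper, not part of seqhap.

```r
# Hypothetical inlist for 4 loci (values invented for illustration only).
inlist <- matrix(c(1, 2, 0, 0,
                   2, 1, 3, 0,
                   3, 0, 0, 0,
                   4, 3, 0, 0),
                 nrow = 4, byrow = TRUE)

# Loci combined when scanning locus k: the non-zero entries of row k.
# loci_for is a hypothetical helper, not a seqhap function.
loci_for <- function(inlist, k) {
  row <- inlist[k, ]
  row[row != 0]
}

combined_2 <- loci_for(inlist, 2)
combined_3 <- loci_for(inlist, 3)
print(combined_2)
```

In this made-up matrix, scanning locus 2 combines three loci, while locus 3 is analyzed on its own.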
y: vector of binary response (1=case, 0=control). The length is equal to the number of rows in geno.
geno: matrix of alleles, such that each locus has a pair of adjacent columns of alleles, and the order of columns corresponds to the order of loci on a chromosome. If there are K loci, then ncol(geno)=2*K. Rows represent the alleles for each subject. Currently, only bi-allelic loci (SNPs) are allowed.
pos: vector of physical positions (or relative physical positions) for loci. If there are K loci, length(pos)=K. The scale (kb, bp, etc.) doesn't affect the results.
locus.label: vector of labels for the set of loci
weight: weights for observations (rows of geno matrix).
mh.threshold: threshold for the Mantel-Haenszel statistic that evaluates whether a locus contributes additional information of haplotype association to disease, conditional on current haplotypes. The default is 3.84, which is the 95th percentile of the chi-square distribution with 1 degree of freedom.
r2.threshold: threshold for a locus to be skipped. When scanning locus k, loci whose r-squared (the square of Pearson's correlation) with locus k is greater than r2.threshold will be ignored, so that the haplotype growing process continues for markers that are further away from locus k.
haplo.freq.min: the minimum haplotype frequency for a haplotype to be included in the association tests. The haplotype frequency is based on the EM algorithm that estimates haplotype frequencies independent of trait.
miss.val: vector of values that represent missing alleles.
sim.control: A list of control parameters to determine how simulations are performed for permutation p-values, similar to the strategy in haplo.score. The list is created by the function score.sim.control and the default values of this function can be changed as desired. Permutations are performed until a p.threshold accuracy rate is met for the three region-based p-values calculated in seqhap. See score.sim.control for details.
control: A list of parameters that control the EM algorithm for estimating haplotype frequencies when phase is unknown. The list is created by the function haplo.em.control; see this function for more details.
x: a seqhap object to print
digits: Number of significant digits to print for numeric values
...: Additional parameters for the print method
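To make the geno layout and the meaning of miss.val concrete, here is a small base R sketch. The allele values are made up, `alleles_of` is a hypothetical helper, and the recoding step only illustrates what a missing-allele code means; it is not taken from the seqhap source.

```r
# Three subjects, K = 2 loci; alleles coded 1/2, with 0 and NA as missing codes.
# All values are invented for illustration.
geno <- matrix(c(1, 2,  2, 2,
                 1, 1,  0, 2,
                 2, 2,  1, NA),
               nrow = 3, byrow = TRUE)
K <- ncol(geno) / 2   # two adjacent columns per locus

# Treat every value listed in miss.val as a missing allele.
miss.val <- c(0, NA)
geno_clean <- geno
geno_clean[geno_clean %in% miss.val] <- NA

# Columns 2k-1 and 2k hold the two alleles of locus k.
# alleles_of is a hypothetical helper, not a seqhap function.
alleles_of <- function(g, k) g[, c(2 * k - 1, 2 * k), drop = FALSE]
locus2 <- alleles_of(geno_clean, 2)
print(locus2)
```

A pos vector for this toy matrix would need length K, one position per locus.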
No further details
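As a rough illustration of the two thresholds described under the arguments, the base R sketch below checks where the default mh.threshold comes from and computes a squared Pearson correlation between two loci. The allele counts are invented, and using per-subject allele counts for r-squared is an assumption made here; the text above does not say how seqhap computes it internally.

```r
# Default mh.threshold: 95th percentile of chi-square with 1 degree of freedom.
default_mh <- qchisq(0.95, df = 1)
print(round(default_mh, 2))

# Hypothetical counts (0, 1, or 2) of one allele at two loci for 8 subjects.
snp_a <- c(0, 1, 2, 1, 0, 2, 1, 0)
snp_b <- c(0, 1, 2, 1, 0, 2, 0, 1)

# Squared Pearson correlation between the two loci (assumed definition).
r2 <- cor(snp_a, snp_b)^2

# With r2.threshold = 0.95, a locus would be skipped only if this is TRUE.
skip_locus <- r2 > 0.95
print(skip_locus)
```

Under this assumption the two toy loci are not correlated strongly enough to be skipped at the default r2.threshold.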
Yu Z, Schaid DJ. (2007) Sequential haplotype scan methods for association analysis. Genet Epidemiol, in print.
haplo.em, print.seqhap, plot.seqhap, score.sim.control
# load example data with response and genotypes.
data(seqhap.dat)
mydata.y <- seqhap.dat[,1]
mydata.x <- seqhap.dat[,-1]
# load positions
data(seqhap.pos)
pos <- seqhap.pos$pos
# run seqhap with default settings
if (FALSE) {
# this example takes 5-10 seconds to run
myobj <- seqhap(y=mydata.y, geno=mydata.x, pos=pos)
print(myobj)
}
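A variation on the example above, sketched with non-default thresholds. It assumes seqhap and the example data are provided by the haplo.stats package (the package name is not stated above), and the threshold values are arbitrary choices for illustration. The call is guarded so that nothing runs if the package is not installed.

```r
# A stricter MH threshold: 99th percentile of chi-square with 1 degree of freedom.
strict_mh <- qchisq(0.99, df = 1)

# Assumption: seqhap, seqhap.dat and seqhap.pos come from haplo.stats.
if (requireNamespace("haplo.stats", quietly = TRUE)) {
  library(haplo.stats)
  data(seqhap.dat)
  data(seqhap.pos)
  # Same data as the example above, with illustrative non-default thresholds.
  myobj2 <- seqhap(y = seqhap.dat[, 1], geno = seqhap.dat[, -1],
                   pos = seqhap.pos$pos,
                   mh.threshold = strict_mh, r2.threshold = 0.9)
  print(myobj2)
}
```

A larger mh.threshold makes it harder for a neighboring locus to be added to the haplotype, so fewer loci are likely to be combined at each scan position.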