seqhap: Sequential Haplotype Scan Association Analysis for Case-Control Data

Description

seqhap implements sequential haplotype scan methods for association analysis of case-control data. When each locus is evaluated, loci that contribute additional information to haplotype associations with disease status are added sequentially. This conditional evaluation is based on the Mantel-Haenszel (MH) test. Two sequential methods are provided, a sequential haplotype method and a sequential summary method, along with results from the traditional single-locus method. Currently, seqhap only works with biallelic loci (single nucleotide polymorphisms, or SNPs) and binary traits.
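
As a rough illustration of the conditional evaluation described above, the sketch below runs a Mantel-Haenszel test with base R's mantelhaen.test on a hypothetical 2 x 2 x 2 table. The counts, the stratification by two "current" haplotypes (h1, h2), and the comparison against 3.84 are assumptions made for illustration only; this is not seqhap's internal code.

```r
# Hypothetical counts: allele at a candidate locus (A/a) by disease status,
# stratified by the haplotype already carried at the current loci (h1/h2).
tab <- array(
  c(30, 20, 10, 40,    # stratum h1: A/case, a/case, A/control, a/control
    25, 25, 15, 35),   # stratum h2, same cell order
  dim = c(2, 2, 2),
  dimnames = list(allele    = c("A", "a"),
                  status    = c("case", "control"),
                  haplotype = c("h1", "h2"))
)

# MH chi-square statistic without continuity correction (1 df for 2 x 2 x K)
mh <- mantelhaen.test(tab, correct = FALSE)
mh_stat <- unname(mh$statistic)

# Would the candidate locus pass a threshold of 3.84?
adds_information <- mh_stat > 3.84
```

Here adds_information is TRUE when the allele-by-status association persists within haplotype strata, which is the sense in which a locus "contributes additional information".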

Usage

seqhap(y, geno, pos, locus.label=NA, weight=NULL, 
       mh.threshold=3.84, r2.threshold=0.95, haplo.freq.min=0.005, 
       miss.val=c(0, NA), sim.control=score.sim.control(),
       control=haplo.em.control())
## S3 method for class 'seqhap':
print(x, digits=max(options()$digits-2, 5), ...)

Arguments

y

vector of binary response (1=case, 0=control). The length is equal to the number of rows in geno.

geno

matrix of alleles, such that each locus has a pair of adjacent columns of alleles, and the order of columns corresponds to the order of loci on a chromosome. If there are K loci, then ncol(geno)=2*K. Rows represent the alleles for each subject. Currently,

pos

vector of physical positions (or relative physical positions) for loci. If there are K loci, length(pos)=K. The scale (kb, bp, etc.) does not affect the results.
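
A minimal sketch of the input shapes described for y, geno, and pos, using base R only. The toy values, the 1/2 allele coding, and the helper locus_cols are hypothetical and are not part of the package.

```r
# Hypothetical toy data: 4 subjects, K = 3 SNPs, alleles coded 1/2.
K <- 3
geno <- matrix(c(1, 2,  1, 1,  2, 2,
                 1, 1,  1, 2,  1, 2,
                 2, 2,  1, 1,  1, 1,
                 1, 2,  2, 2,  1, 2),
               nrow = 4, byrow = TRUE)
pos <- c(1200, 4500, 9800)   # hypothetical positions, any consistent scale
y   <- c(1, 0, 1, 0)         # 1 = case, 0 = control

# Dimension checks implied by the argument descriptions
stopifnot(ncol(geno) == 2 * K,
          length(pos) == K,
          length(y) == nrow(geno))

# Hypothetical helper: the pair of adjacent columns holding locus k
locus_cols <- function(k) c(2 * k - 1, 2 * k)
alleles_locus2 <- geno[, locus_cols(2)]
```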

locus.label

vector of labels for the set of loci

weight

weights for observations (rows of geno matrix).

mh.threshold

threshold for the Mantel-Haenszel statistic that evaluates whether a locus contributes additional information of haplotype association to disease, conditional on current haplotypes. The default is 3.84, which is the 95th percentile of the chi-square distr
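
The default value can be checked with base R's qchisq. This sketch assumes the reference distribution is chi-square with 1 degree of freedom, which the text above does not state in full; the alternative level of 0.99 is only an example.

```r
# 95th percentile of the chi-square distribution, assuming 1 degree of freedom
default_threshold <- qchisq(0.95, df = 1)
round(default_threshold, 2)

# A stricter, hypothetical choice: the 99th percentile
strict_threshold <- qchisq(0.99, df = 1)
```

A larger mh.threshold makes it harder for a locus to be added to the growing haplotype.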

r2.threshold

threshold for a locus to be skipped. When scanning locus k, loci with correlations r-squared (the square of the Pearson's correlation) greater than r2.threshold with locus k will be ignored, so that the haplotype growing process continues for markers that
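
A small sketch of the r-squared comparison, computed here as the squared Pearson correlation between per-subject allele counts at two SNPs. The data are hypothetical, and using allele counts (rather than whatever coding seqhap uses internally) is an assumption.

```r
# Hypothetical copies of the minor allele (0, 1, or 2) for 8 subjects
snp_k <- c(0, 1, 2, 1, 0, 2, 1, 0)
snp_j <- c(0, 1, 2, 1, 0, 2, 1, 1)

# Squared Pearson correlation between the two loci
r2 <- cor(snp_k, snp_j)^2

# Locus j would be ignored when scanning locus k only if r2 exceeds the threshold
r2.threshold <- 0.95
skip_j <- r2 > r2.threshold
```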

haplo.freq.min

the minimum haplotype frequency for a haplotype to be included in the association tests. The haplotype frequency is based on the EM algorithm that estimates haplotype frequencies independent of trait.

miss.val

vector of values that represent missing alleles.

sim.control

A list of control parameters to determine how simulations are performed for permutation p-values, similar to the strategy in haplo.score. The list is created by the function score.sim.control and the default values of this function can be changed as des

control

A list of parameters that control the EM algorithm for estimating haplotype frequencies when phase is unknown. The list is created by the function haplo.em.control - see this function for more details.

x

a seqhap object to print

digits

Number of significant digits to print for numeric values

...

Additional parameters for the print method

Value

list with components:

converge

indicator of convergence of the EM algorithm (see haplo.em); 1=converged, 0=failed

locus.label

vector of labels for loci

pos

chromosome positions for loci, same as input.

n.sim

number of permutations performed for empirical p-values

inlist

matrix that shows which loci are combined for association analysis in the sequential scan. The non-zero values of the kth row of inlist are the indices of the loci combined when scanning locus k.

chi.stat

chi-square statistics of single-locus analysis.

chi.p.point

permuted pointwise p-values of single-locus analysis.

chi.p.region

permuted regional p-value of single-locus analysis.

hap.stat

chi-square statistics of sequential haplotype analysis.

hap.df

degrees of freedom of sequential haplotype analysis.

hap.p.point

permuted pointwise p-values of sequential haplotype analysis.

hap.p.region

permuted regional p-value of sequential haplotype analysis.

sum.stat

chi-square statistics of sequential summary analysis.

sum.df

degrees of freedom of sequential summary analysis.

sum.p.point

permuted pointwise p-values of sequential summary analysis.

sum.p.region

permuted regional p-value of sequential summary analysis.
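
To make the inlist description concrete, the sketch below reads the combined loci for a scanned locus from a hypothetical inlist matrix. The matrix values and the helper combined_loci are invented for illustration; a real matrix would come from the inlist component of a seqhap result.

```r
# Hypothetical inlist for 3 loci: row k lists the loci combined when
# scanning locus k, padded with zeros.
inlist <- matrix(c(1, 2, 0,
                   2, 0, 0,
                   3, 2, 0),
                 nrow = 3, byrow = TRUE)

# Hypothetical helper: non-zero entries of row k are the combined locus indices
combined_loci <- function(inlist, k) {
  row <- inlist[k, ]
  row[row != 0]
}

loci_for_3 <- combined_loci(inlist, 3)
```

In this toy matrix, scanning locus 3 combines loci 3 and 2, while locus 2 is analysed on its own.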

References

Yu Z, Schaid DJ. (2007) Sequential haplotype scan methods for association analysis. Genet Epidemiol, in print.

Details

No further details

Examples

# load example data with response and genotypes. 
data(seqhap.dat)
mydata.y <- seqhap.dat[,1]
mydata.x <- seqhap.dat[,-1]
# load positions
data(seqhap.pos)
pos <- seqhap.pos$pos
# run seqhap with default settings
# this example takes 5-10 seconds to run
myobj <- seqhap(y=mydata.y, geno=mydata.x, pos=pos)
# print the results through the generic, which dispatches to the seqhap method
print(myobj)
