powerGE(n, power, model, caco, alpha, alpha1, maintain.alpha)
n and power should be specified.
n and power should be specified.
prev
Prevalence of the outcome in the population. Note that for case-only and empirical Bayes estimators to be valid,
the prevalence needs to be low.
pGene
Probability that a binary SNP is 1 (i.e. not the minor allele frequency for a three level SNP).
pEnv
Frequency of the binary environmental variable.
orGE
Odds ratio between the binary SNP and binary environmental variable.
beta.LOR
Vector of length three with the log odds ratios of the genetic, environmental, and GxE interaction effects, respectively.
nSNP
Number of SNPs (genes) being tested.
True: combinations that do not maintain the Type 1 error are not computed. If maintain.alpha is False all combinations are computed.
n was specified.
power was specified.
After screening, the SNPs that pass the screen can be tested using
If screening took place using the correlation or chi-square screening, the Type 1 error won't be maintained if the final GxE testing is carried out using either the case-only or empirical Bayes estimator; see Dai et al. (2012). The cocktail screening maintains the family-wise Type 1 error rate: only those SNPs that pass on to the second stage through marginal screening use the case-only or empirical Bayes estimator, while the SNPs that pass on to the second stage through correlation screening always use the case-control estimator.
When SNP and environment are correlated in the population (i.e. model$orGE does not equal 1) the case-only estimator does not maintain the Type 1 error.
The empirical Bayes estimator may also have a moderately inflated Type 1 error. When the disease is common, the case-only
estimator and the empirical Bayes estimator may also fail to estimate the GxE interaction.
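The two caveats above can be illustrated numerically. The following is a minimal sketch and not part of the package: the helpers joint_ge and case_only_or are hypothetical names, and the intercept is set to a baseline risk rather than calibrated to the population prevalence. It computes the G-E odds ratio among cases, which is the quantity a case-only analysis targets, under a logistic disease model.

```r
# Hypothetical helper: joint distribution of a binary SNP (G) and a binary
# exposure (E) with marginal frequencies pG, pE and G-E odds ratio `or`.
joint_ge <- function(pG, pE, or) {
  if (isTRUE(all.equal(or, 1))) {
    p11 <- pG * pE                      # independence
  } else {
    # Solve or = p11 * p00 / (p10 * p01) for p11 (a quadratic in p11)
    a  <- or - 1
    b  <- -(1 + (or - 1) * (pG + pE))
    cc <- or * pG * pE
    p11 <- (-b - sqrt(b^2 - 4 * a * cc)) / (2 * a)
  }
  c(p00 = 1 - pG - pE + p11, p10 = pG - p11, p01 = pE - p11, p11 = p11)
}

# Hypothetical helper: G-E odds ratio among cases under a logistic model
# with log odds ratios beta = (G, E, GxE) and the given intercept.
case_only_or <- function(pG, pE, orGE, beta, intercept) {
  p <- joint_ge(pG, pE, orGE)
  g <- c(0, 1, 0, 1)                    # matches the order p00, p10, p01, p11
  e <- c(0, 0, 1, 1)
  risk <- plogis(intercept + beta[1] * g + beta[2] * e + beta[3] * g * e)
  w <- p * risk                         # proportional to P(G, E | case)
  unname(w["p11"] * w["p00"] / (w["p10"] * w["p01"]))
}

beta <- log(c(1.0, 1.2, 1.4))           # true GxE odds ratio is 1.4
or_indep  <- case_only_or(0.2, 0.2, 1.0, beta, qlogis(0.01))  # rare, orGE = 1
or_corr   <- case_only_or(0.2, 0.2, 1.2, beta, qlogis(0.01))  # rare, orGE = 1.2
or_common <- case_only_or(0.2, 0.2, 1.0, beta, qlogis(0.30))  # common disease
print(c(or_indep, or_corr, or_common))
```

Under these assumptions the first value should be close to the true interaction odds ratio, the second is inflated by the factor orGE, and the third drifts away from the truth because the rare-disease approximation no longer holds.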
Power calculations are described in Kooperberg, Dai, and Hsu (2014). Briefly, for a given genetic model we compute the expected p-values for all screening statistics. We then use a normal approximation to compute the probability that this SNP passes the screening (e.g., if alpha1 equaled this expected p-value, this probability would be exactly 0.5), and combine this with power calculations for the second stage of GxE testing.
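One way to picture the normal approximation is the sketch below. It is an illustration under stated assumptions, not the package's implementation: pass_prob is a hypothetical helper, the screening test is treated as one-sided, and the screening z-statistic is assumed to be normal with unit variance around its expected value.

```r
# Hypothetical helper: probability that a SNP passes a one-sided screen at
# level alpha1, given the expected p-value of its screening statistic.
pass_prob <- function(expected_p, alpha1) {
  mu     <- qnorm(1 - expected_p)   # z-score matching the expected p-value
  cutoff <- qnorm(1 - alpha1)       # z-score needed to pass the screen
  pnorm(mu - cutoff)                # P(Z > cutoff) when Z ~ N(mu, 1)
}

pass_prob(0.01, 0.01)    # expected p-value equals alpha1
pass_prob(0.001, 0.01)   # stronger expected signal
pass_prob(0.1, 0.01)     # weaker expected signal
```

When the expected p-value equals alpha1 the two z-scores coincide and the passing probability is pnorm(0) = 0.5, matching the remark above. A smaller expected p-value gives a probability above 0.5 and a larger one gives a probability below 0.5.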
Gauderman WJ, Zhang P, Morrison JL, Lewinger JP (2013). Finding novel genes by testing GxE interactions in a genome-wide association study. Genetic Epidemiology, 37, 603-613.
Hsu L, Jiao S, Dai JY, Hutter C, Peters U, Kooperberg C (2012). Powerful cocktail methods for detecting genome-wide gene-environment interaction. Genetic Epidemiology, 36, 183-194.
Kooperberg C, Dai JY, Hsu L (2014). Two-stage procedures for the identification of gene x environment and gene x gene interactions in genome-wide association studies. To appear.
Kooperberg C, LeBlanc ML (2008). Increasing the power of identifying gene x gene interactions in genome-wide association studies. Genetic Epidemiology, 32, 255-263.
Mukherjee B, Chatterjee N (2008). Exploiting gene-environment independence for analysis of case-control studies: an empirical Bayes-type shrinkage estimator to trade-off between bias and efficiency. Biometrics, 64, 685-694.
Murcray CE, Lewinger JP, Gauderman WJ (2009). Gene-environment interaction in genome-wide association studies. American Journal of Epidemiology, 169, 219-226.
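# Fixed sample size, with SNP and environment correlated (orGE = 1.2)
mod1 <- list(prev = 0.01, pGene = 0.2, pEnv = 0.2,
             beta.LOR = log(c(1.0, 1.2, 1.4)), orGE = 1.2, nSNP = 10^6)
results <- powerGE(n = 20000, model = mod1, alpha1 = 0.01)
print(results)

# Target power of 0.8 instead of a sample size, with SNP and environment independent (orGE = 1)
mod2 <- list(prev = 0.01, pGene = 0.2, pEnv = 0.2,
             beta.LOR = log(c(1.0, 1.0, 1.4)), orGE = 1, nSNP = 10^6)
results <- powerGE(power = 0.8, model = mod2, alpha1 = 0.01)
print(results)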