
GWASTools (version 1.18.0)

assocRegression: Association testing with regression

Description

Run association testing with regression

Usage

assocRegression(genoData,
                outcome,
                model.type = c("linear", "logistic", "poisson", "firth"),
                gene.action = c("additive", "dominant", "recessive"),
                covar = NULL,
                ivar = NULL,
                scan.exclude = NULL,
                CI = 0.95,
                robust = FALSE,
                LRtest = FALSE,
                PPLtest = TRUE,
                effectAllele = c("minor", "alleleA"),
                snpStart = NULL,
                snpEnd = NULL,
                block.size = 5000,
                verbose = TRUE)

Arguments

genoData
a GenotypeData object
outcome
the name of the phenotype of interest (a column in the scan annotation of genoData)
model.type
the type of model to be run. "linear" uses lm, "logistic" uses glm with family=binomial(), "poisson" uses glm with family=poisson(), and "firth" uses logistf.
gene.action
"additive" coding sets the marker variable for homozygous minor allele samples = 2, heterozygous samples = 1, and homozygous major allele samples = 0. "dominant" coding sets the marker variable for homozygous minor allele samples = 2, heterozygous samples = 2, and homozygous major allele samples = 0. "recessive" coding sets the marker variable for homozygous minor allele samples = 2, heterozygous samples = 0, and homozygous major allele samples = 0. (If effectAllele="alleleA", the coding reflects alleleA instead of the minor allele.)
covar
a vector of the names of the covariates to adjust for (columns in the scan annotation of genoData)
ivar
the name of the variable in covar to include as an interaction with genotype
scan.exclude
a vector of scanIDs for scans to exclude
CI
a value between 0 and 1 defining the confidence level for the confidence interval calculations
robust
logical for whether to use sandwich-based robust standard errors for the "linear" or "logistic" method. The default value is FALSE, which uses model-based standard errors. The chosen standard error estimates are returned and are also used for the Wald tests of significance.
LRtest
logical for whether to perform Likelihood Ratio Tests in addition to Wald tests (which are always performed). Applies to linear, logistic, or poisson main effects only. NOTE: Performing the LR tests adds a noticeable amount of computation time.
PPLtest
logical for whether to use the profile penalized likelihood to compute p values for the "firth" method (in addition to Wald tests, which are always performed).
effectAllele
whether the effects should be returned in terms of the minor allele for the tested sample (effectAllele="minor") or the allele returned by getAlleleA(genoData) (effectAllele="alleleA"). If the minor allele is alleleB for a given SNP, the difference between these two options will be a sign change for the beta estimate.
snpStart
index of the first SNP to analyze, defaults to first SNP
snpEnd
index of the last SNP to analyze, defaults to last SNP
block.size
number of SNPs to read in at once
verbose
logical for whether to print status updates
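
The gene.action codings listed above can be sketched in a few lines of R. This is only an illustration: recode_gene_action is a hypothetical helper, not a GWASTools function, and it assumes the genotype is already expressed as a count (0, 1, 2, or NA) of the effect allele.

```r
# Hypothetical helper (not part of GWASTools): recode effect-allele counts
# (0, 1, 2, or NA) following the gene.action codings described above.
recode_gene_action <- function(count,
                               gene.action = c("additive", "dominant", "recessive")) {
  gene.action <- match.arg(gene.action)
  switch(gene.action,
         additive  = count,                      # 0 / 1 / 2
         dominant  = ifelse(count >= 1, 2, 0),   # 0 / 2 / 2
         recessive = ifelse(count == 2, 2, 0))   # 0 / 0 / 2
}

counts    <- c(0, 1, 2, NA)   # missing genotypes stay NA
additive  <- recode_gene_action(counts, "additive")
dominant  <- recode_gene_action(counts, "dominant")
recessive <- recode_gene_action(counts, "recessive")
```

Because the dominant and recessive codings use the values 0 and 2 rather than 0 and 1, a beta estimate under those codings corresponds to a two-unit change in the marker variable.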

Value

a data.frame with some or all of the following columns:
snpID
the snpIDs
chr
chromosome SNPs are on
effect.allele
which allele ("A" or "B") is the effect allele
EAF
effect allele frequency
MAF
minor allele frequency
n
number of samples used to analyze each SNP
n0
number of controls (outcome=0) used to analyze each SNP
n1
number of cases (outcome=1) used to analyze each SNP
Est
beta estimate for genotype
SE
standard error of beta estimate for the genotype
LL
lower limit of confidence interval for Est
UL
upper limit of confidence interval for Est
Wald.Stat
Wald chi-squared test statistic for association
Wald.pval
Wald test p-value for association
LR.Stat
likelihood ratio test statistic for association
LR.pval
likelihood ratio test p-value for association
PPL.Stat
profile penalized likelihood test statistic for association
PPL.pval
profile penalized likelihood test p-value for association
GxE.Est
beta estimate for the genotype*ivar interaction parameter (NA if this parameter is a factor with >2 levels)
GxE.SE
standard error of beta estimate for the genotype*ivar interaction parameter
GxE.Stat
Wald test statistic for the genotype*ivar interaction parameter
GxE.pval
Wald test p-value for the genotype*ivar interaction parameter
Joint.Stat
Wald test statistic for jointly testing all genotype parameters
Joint.pval
Wald test p-value for jointly testing all genotype parameters
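
As a rough guide to how the Est, SE, Wald.Stat, Wald.pval, LL, and UL columns relate to one another, the arithmetic can be sketched as follows. The numbers are made up for illustration and are not output from assocRegression. The Wald p-value uses a chi-squared distribution with 1 degree of freedom, as is standard for a single genotype parameter; the normal-quantile confidence limits are an assumption about how the interval is formed, not a statement of the package's internals.

```r
# Illustrative values only; not output from assocRegression.
Est <- 0.5
SE  <- 0.2
CI  <- 0.95

# Wald chi-squared statistic (1 degree of freedom) and its p-value
Wald.Stat <- (Est / SE)^2
Wald.pval <- pchisq(Wald.Stat, df = 1, lower.tail = FALSE)

# Assumed normal-approximation confidence limits at level CI
z  <- qnorm(1 - (1 - CI) / 2)
LL <- Est - z * SE
UL <- Est + z * SE
```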

Details

When using models without interaction terms, the association tests compare the model including the covariates and genotype value to the model including only the covariates (a test of genotype effect). When using a model with an interaction term, tests are performed for the interaction term separately as well as a joint test of all the genotype terms (main effects and interactions) to detect any genotype effect.

Wald tests are always performed, with p-values computed from chi-squared distributions. The robust parameter specifies whether the confidence intervals and Wald tests use sandwich-based robust standard errors (which make no model assumptions) or model-based standard errors. Equivalent likelihood ratio tests can also be performed; this option is specified by the LRtest parameter.

For logistic regression models, if the SNP is monomorphic in either cases or controls, then the slope parameter is not well-defined, and the result will be NA.
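
The genotype test described above, for a model without an interaction term, can be illustrated with base R on simulated data. This is a sketch of the general technique for a single SNP, not the package's implementation: the data frame, the variable names, and the simulated effect size are all assumptions made for the example, and only model-based standard errors are shown.

```r
set.seed(1)
n <- 500
dat <- data.frame(
  geno = rbinom(n, 2, 0.3),                              # simulated additive genotype (0/1/2)
  age  = rnorm(n, mean = 40, sd = 10),
  sex  = factor(sample(c("M", "F"), n, replace = TRUE))
)
# Simulated binary outcome with an assumed genotype log odds ratio of 0.4
dat$status <- rbinom(n, 1, plogis(-1 + 0.4 * dat$geno + 0.01 * (dat$age - 40)))

# Model with covariates and genotype vs. model with covariates only
full    <- glm(status ~ sex + age + geno, family = binomial(), data = dat)
reduced <- glm(status ~ sex + age,        family = binomial(), data = dat)

# Wald test of the genotype term using the model-based standard error
est <- coef(summary(full))["geno", "Estimate"]
se  <- coef(summary(full))["geno", "Std. Error"]
wald.stat <- (est / se)^2
wald.pval <- pchisq(wald.stat, df = 1, lower.tail = FALSE)

# Likelihood ratio test: deviance difference between the nested models
lr.stat <- deviance(reduced) - deviance(full)
lr.pval <- pchisq(lr.stat, df = 1, lower.tail = FALSE)
```

The two tests usually agree closely in large samples; the likelihood ratio test needs a second model fit per SNP, which is consistent with the extra computation time noted for LRtest.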

Note: Y chromosome SNPs must be analyzed separately because they only use males.

See Also

GenotypeData, lm, glm, logistf, vcovHC, lrtest

Examples

library(GWASdata)
data(illuminaScanADF)
scanAnnot <- illuminaScanADF

# exclude duplicated subjects
scan.exclude <- scanAnnot$scanID[scanAnnot$duplicated]

# create some variables for the scans
scanAnnot$sex <- as.factor(scanAnnot$sex)
scanAnnot$age <- rnorm(nrow(scanAnnot), mean=40, sd=10)
scanAnnot$case.cntl.status <- rbinom(nrow(scanAnnot), 1, 0.4)
scanAnnot$blood.pressure[scanAnnot$case.cntl.status==1] <- rnorm(sum(scanAnnot$case.cntl.status==1), mean=100, sd=10)
scanAnnot$blood.pressure[scanAnnot$case.cntl.status==0] <- rnorm(sum(scanAnnot$case.cntl.status==0), mean=90, sd=5)

# create data object
gdsfile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gds <- GdsGenotypeReader(gdsfile)
genoData <-  GenotypeData(gds, scanAnnot=scanAnnot)

## linear regression
res <- assocRegression(genoData,
                       outcome="blood.pressure",
                       model.type="linear",
                       covar=c("sex", "age"),
                       scan.exclude=scan.exclude,
                       snpStart=1, snpEnd=100)

## logistic regression
res <- assocRegression(genoData,
                       outcome="case.cntl.status",
                       model.type="logistic",
                       covar=c("sex", "age"),
                       scan.exclude=scan.exclude,
                       snpStart=1, snpEnd=100)

close(genoData)
