minfi (version 1.18.4)

gaphunter: Find gap signals in 450k data

Description

This function finds probes on the Illumina 450k array whose calculated beta values cluster into distinct groups separated by a defined threshold. For these ‘gap signals’, it identifies the number of groups, the size of each group, and the samples in each group.

Usage

gaphunter(object, threshold=0.05, keepOutliers=FALSE, outCutoff=0.01, verbose=TRUE)

Arguments

object
An object of class (Genomic)RatioSet, (Genomic)MethylSet, or matrix. If one of the first two, getBeta is used to calculate beta values. If a matrix, it must contain beta values.
threshold
The difference between consecutive, ordered beta values that defines the presence of a gap signal. Defaults to 0.05 (5 percent).
keepOutliers
Should outlier-driven gap signals be kept in the results? Defaults to FALSE.
outCutoff
Value used to identify gap signals driven by outliers. Defined as a proportion of the total sample size; the sum of samples in all groups except the largest must exceed this number of samples for the probe to still be considered a gap signal. Defaults to 0.01 (1 percent).
verbose
Logical value. If TRUE, messages indicating progress are printed. If FALSE, nothing is printed.

Value

A list with three elements:
proberesults
A data frame listing, for each identified gap signal, the number of groups and the size of each group.
sampleresults
A matrix of dimensions probes (rows) by samples (columns). Individuals are assigned numbers based on the groups into which they cluster. Lower-numbered groups have lower mean methylation values. For example, individuals coded as ‘1’ will have a lower mean methylation value than those individuals coded as ‘2’.
algorithm
A list detailing the arguments supplied to the function.
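
The group coding described for sampleresults can be illustrated for a single probe in base R. This is a minimal sketch under stated assumptions, not the minfi implementation: the helper code_gap_groups is hypothetical, the beta values are made up, and missing values are assumed to be absent.

```r
# Hypothetical helper (not part of minfi): code the samples of ONE probe
# into groups separated by gaps of at least `threshold`.
code_gap_groups <- function(beta, threshold = 0.05) {
  ord <- order(beta)                     # sample indices from lowest to highest beta
  gaps <- diff(beta[ord]) >= threshold   # TRUE where consecutive ordered values form a gap
  sorted_codes <- cumsum(c(TRUE, gaps))  # group number increases by 1 after each gap
  codes <- integer(length(beta))
  codes[ord] <- sorted_codes             # map codes back to the original sample order
  codes
}

# Made-up beta values for six samples at one probe
beta <- c(s1 = 0.10, s2 = 0.52, s3 = 0.12, s4 = 0.90, s5 = 0.50, s6 = 0.11)
codes <- code_gap_groups(beta, threshold = 0.3)
group_sizes <- as.integer(table(codes))  # number of samples in each group
print(codes)
print(group_sizes)
```

Because the groups are numbered along the sorted beta values, group ‘1’ always contains the samples with the lowest methylation, matching the ordering described above.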

Details

The function can calculate a beta matrix or utilize a user-supplied matrix of beta values.

The function will identify probes with a gap in the beta signal greater than or equal to the defined threshold. These probes constitute an additional, dataset-specific subset of probes that merit special consideration due to their tendency to be driven by an underlying SNP or other genetic variant. In this manner, these probes can serve as surrogates for underlying genetic signal locally and/or in a broader (i.e., haplotype) context. Please see our upcoming manuscript for a detailed description of the utility of these probes. Outlier-driven gap signals are those in which the sum of the smaller group(s) does not exceed a certain percentage of the sample size, defined by the argument outCutoff.
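
The outlier rule can be written out as a short base R check. This is a sketch of the rule as worded in this page, not code taken from minfi: the helper is_outlier_driven and the group sizes are hypothetical, every sample is assumed to belong to a group, and the package's handling of boundary cases may differ.

```r
# Hypothetical helper (not part of minfi): decide whether a gap signal is
# outlier-driven, given the number of samples in each group for one probe.
is_outlier_driven <- function(group_sizes, out_cutoff = 0.01) {
  n_samples <- sum(group_sizes)            # total sample size
  n_minor <- n_samples - max(group_sizes)  # samples outside the largest group
  n_minor <= out_cutoff * n_samples        # TRUE when the minor groups do not exceed the cutoff
}

# 400 samples, cutoff of 1 percent = 4 samples
two_stragglers <- is_outlier_driven(c(398, 2))       # 2 samples outside the largest group
three_clusters <- is_outlier_driven(c(300, 80, 20))  # 100 samples outside the largest group
print(two_stragglers)
print(three_clusters)
```

Under this reading, the first probe would be dropped when keepOutliers=FALSE, while the second would be retained as a gap signal.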

References

SV Andrews, C Ladd-Acosta, KD Hansen, AP Feinberg, MD Fallin. "Gap hunting" to identify multimodal distributions of DNA methylation. Manuscript in preparation.

Examples

if(require(minfiData)) {
  gapres <- gaphunter(MsetEx, threshold=0.3, keepOutliers=TRUE)
  # Note: the threshold argument is increased from the default value in this small example
  # dataset with 6 people to avoid reporting a large number of probes as gap signals.
  # In a typical EWAS setting with hundreds of samples, the default arguments should be
  # sufficient.
}