sdByScanChromWindow uses a sliding window algorithm to
calculate the standard deviation of the BAlleleFreq (or LogRRatio) values for a
user-specified number of bins across each chromosome of each scan.
medianSdOverAutosomes calculates the median of the
BAlleleFreq (or LogRRatio) standard deviation over all autosomes for each scan.
meanSdByChromWindow calculates the mean and standard
deviation of the BAlleleFreq standard deviation in each window in each
chromosome over all scans.
findBAFvariance flags chromosomal areas with high BAlleleFreq
standard deviation using previously calculated means and standard
deviations over scans, typically derived from the output of
sdByScanChromWindow.
sdByScanChromWindow(intenData, genoData=NULL, var="BAlleleFreq", nbins=NULL, snp.exclude=NULL, return.mean=FALSE, incl.miss=TRUE, incl.het=TRUE, incl.hom=FALSE)
medianSdOverAutosomes(sd.by.scan.chrom.window)
meanSdByChromWindow(sd.by.scan.chrom.window, sex)
findBAFvariance(sd.by.chrom.window, sd.by.scan.chrom.window, sex, sd.threshold)
intenData: An IntensityData object. The order of
SNPs is expected to be by chromosome and then by position within chromosome.
genoData: A GenotypeData object. May be omitted
if incl.miss, incl.het, and incl.hom are all
TRUE, as there is no need to distinguish between genotype calls in
that case.
return.mean: If TRUE, return mean as well as
standard deviation.
incl.miss: If TRUE, include SNPs with missing
genotype calls.
incl.het: If TRUE, include SNPs called as heterozygotes.
incl.hom: If TRUE, include SNPs called as
homozygotes. This is typically FALSE (the default) for
BAlleleFreq calculations.
sd.by.scan.chrom.window: Typically the output of sdByScanChromWindow.
sd.by.chrom.window: Typically the output of meanSdByChromWindow.
sdByScanChromWindow returns a list of matrices containing standard deviations.
There is a matrix for each chromosome, with each matrix having
dimensions of number of scans x number of windows. If
return.mean=TRUE, two lists of matrices are returned, one with
standard deviations and one with means.
medianSdOverAutosomes returns a data frame with columns "scanID" and
"med.sd" containing the median standard deviations over all
autosomes for each scan.
meanSdByChromWindow returns a list of matrices, one for
each chromosome. Each matrix contains two columns called "Mean" and
"SD", containing the mean and SD of the BAlleleFreq standard deviations
over scans for each bin. For the X chromosome the matrix has four
columns "Female Mean", "Male Mean", "Female SD" and "Male SD".
findBAFvariance returns a matrix with columns "scanID",
"chromosome", "bin", and "sex" containing those scan by chromosome
combinations with BAlleleFreq standard deviations greater than those
specified by sd.threshold.
sdByScanChromWindow calculates the standard deviation of
BAlleleFreq (or LogRRatio) values across chromosomes 1-22 and chromosome X for a
specified number of 'bins' in each chromosome as passed to the
function in the 'nbins' argument. The standard deviation is
calculated using windows of width equal to 2 bins, and moves along the
chromosome by an offset of 1 bin (or half a window). Thus, there will
be a total of nbins-1 windows per chromosome. If
nbins=NULL (the default), there will be 2 bins (one window) for
each chromosome.
medianSdOverAutosomes calculates the median over autosomes of
BAlleleFreq (or LogRRatio) standard deviations calculated
for sliding windows within each chromosome of each scan. The
standard deviations should be a list with one element for
each chromosome, and each element consisting of a matrix with scans as rows.
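The windowing and median steps described above can be sketched in base R. This is an illustration under stated assumptions, not the GWASTools implementation: windowSd is a hypothetical helper, bins are assumed to hold equal numbers of SNPs, the median is assumed to be taken over all autosomal windows pooled, and the data are made up.

```r
# Hypothetical helper for illustration; not part of GWASTools.
# SD of x in windows of 2 bins that advance by 1 bin,
# so nbins bins yield nbins - 1 windows.
windowSd <- function(x, nbins) {
  n <- length(x)
  bin <- ceiling(seq_len(n) * nbins / n)  # assumption: bins of equal SNP count
  sapply(seq_len(nbins - 1), function(w) {
    sd(x[bin %in% c(w, w + 1)], na.rm = TRUE)
  })
}

set.seed(1)
# Made-up data: 3 scans on two toy "autosomes", 80 SNPs each
toy <- list("21" = matrix(runif(3 * 80), nrow = 3),
            "22" = matrix(runif(3 * 80), nrow = 3))

# One matrix per chromosome, scans as rows and windows as columns
sd.list <- lapply(toy, function(m) t(apply(m, 1, windowSd, nbins = 8)))

# Median over all windows of all autosomes, one value per scan
med.sd <- apply(do.call(cbind, unname(sd.list)), 1, median, na.rm = TRUE)
```

With nbins = 8, each toy matrix in sd.list has 7 columns, matching the nbins-1 windows described above.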
meanSdByChromWindow calculates the mean and standard
deviation over scans of BAlleleFreq standard deviations calculated
for sliding windows within each chromosome of each scan. The
BAlleleFreq standard deviations should be a list with one element for
each chromosome, and each element consisting of a matrix containing
the BAlleleFreq standard deviation for the i'th scan in the j'th
bin. This is typically created using the
sdByScanChromWindow function. For the X chromosome the
calculations are separated out by sex.
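As a rough illustration of this summary step, the sketch below computes the per-window mean and SD over scans for one toy matrix, then repeats the calculation by sex as would be done for the X chromosome. The values, the "F"/"M" sex coding, and the helper byWindow are assumptions made for the example, not output or code from the package.

```r
# Illustrative only; toy values, not GWASTools output.
set.seed(2)
sd.mat <- matrix(runif(4 * 3, 0.05, 0.15), nrow = 4)  # 4 scans x 3 windows
sex <- c("F", "M", "F", "M")                          # assumed coding

# Autosome: one row per window with the mean and SD over scans
auto.res <- cbind(Mean = colMeans(sd.mat, na.rm = TRUE),
                  SD = apply(sd.mat, 2, sd, na.rm = TRUE))

# Hypothetical helper: apply a summary function to each window (column)
byWindow <- function(m, f) apply(m, 2, f, na.rm = TRUE)

# X chromosome: the same summary, computed separately for each sex
fem <- sd.mat[sex == "F", , drop = FALSE]
mal <- sd.mat[sex == "M", , drop = FALSE]
x.res <- cbind("Female Mean" = byWindow(fem, mean),
               "Male Mean"   = byWindow(mal, mean),
               "Female SD"   = byWindow(fem, sd),
               "Male SD"     = byWindow(mal, sd))
```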
findBAFvariance determines which chromosomes of which scans
have regions which are at least a given number of SDs from the mean,
using BAlleleFreq means and standard deviations calculated from
sliding windows over each chromosome by scan.
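A minimal sketch of this kind of thresholding follows. It assumes that only values above the window mean are flagged and uses made-up numbers for a single chromosome; the actual rule applied by findBAFvariance may differ in detail.

```r
# Illustrative only; toy values and a simplified flagging rule.
sd.mat <- cbind(c(0.10, 0.11, 0.09, 0.10, 0.11, 0.09, 0.10, 0.10),
                c(0.10, 0.11, 0.09, 0.10, 0.30, 0.09, 0.10, 0.11))
# 8 scans (rows) x 2 windows (columns); scan 5 is unusual in window 2

win.mean <- colMeans(sd.mat)       # mean over scans, per window
win.sd <- apply(sd.mat, 2, sd)     # SD over scans, per window

sd.threshold <- 2
# Flag scan/window pairs more than sd.threshold SDs above the window mean
cutoff <- win.mean + sd.threshold * win.sd
flagged <- which(sweep(sd.mat, 2, cutoff, ">"), arr.ind = TRUE)
```

Here flagged has one row per scan/window pair that exceeds the cutoff, with the scan index in column "row" and the window index in column "col".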
IntensityData, GenotypeData,
BAFfromClusterMeans, BAFfromGenotypes
library(GWASdata)
data(illuminaScanADF)
blfile <- system.file("extdata", "illumina_bl.gds", package="GWASdata")
bl <- GdsIntensityReader(blfile)
blData <- IntensityData(bl, scanAnnot=illuminaScanADF)
genofile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
geno <- GdsGenotypeReader(genofile)
genoData <- GenotypeData(geno, scanAnnot=illuminaScanADF)
nbins <- rep(8, 3) # need bins for chromosomes 21,22,23
baf.sd <- sdByScanChromWindow(blData, genoData, nbins=nbins)
close(blData)
close(genoData)
med.res <- medianSdOverAutosomes(baf.sd)
sex <- illuminaScanADF$sex
sd.res <- meanSdByChromWindow(baf.sd, sex)
var.res <- findBAFvariance(sd.res, baf.sd, sex, sd.threshold=2)