sdByScanChromWindow uses a sliding window algorithm to calculate the standard deviation of the BAlleleFreq (or LogRRatio) values for a user-specified number of bins across each chromosome of each scan.
medianSdOverAutosomes calculates the median of the BAlleleFreq (or LogRRatio) standard deviation over all autosomes for each scan.
meanSdByChromWindow calculates the mean and standard deviation of the BAlleleFreq standard deviation in each window in each chromosome over all scans.
findBAFvariance flags chromosomal areas with high BAlleleFreq standard deviation using previously calculated means and standard deviations over scans, typically the results of meanSdByChromWindow (itself computed from the output of sdByScanChromWindow).
sdByScanChromWindow(intenData, genoData=NULL, var="BAlleleFreq", nbins=NULL, snp.exclude=NULL, return.mean=FALSE, incl.miss=TRUE, incl.het=TRUE, incl.hom=FALSE)
medianSdOverAutosomes(sd.by.scan.chrom.window)
meanSdByChromWindow(sd.by.scan.chrom.window, sex)
findBAFvariance(sd.by.chrom.window, sd.by.scan.chrom.window, sex, sd.threshold)
intenData: An IntensityData object. The order of SNPs is expected to be by chromosome and then by position within chromosome.
genoData: A GenotypeData object. May be omitted if incl.miss, incl.het, and incl.hom are all TRUE, as there is no need to distinguish between genotype calls in that case.
return.mean: If TRUE, return mean as well as standard deviation.
incl.miss: If TRUE, include SNPs with missing genotype calls.
incl.het: If TRUE, include SNPs called as heterozygotes.
incl.hom: If TRUE, include SNPs called as homozygotes. This is typically FALSE (the default) for BAlleleFreq calculations.
sd.by.scan.chrom.window: Typically the output of sdByScanChromWindow.
sd.by.chrom.window: Typically the output of meanSdByChromWindow.
sdByScanChromWindow returns a list of matrices containing standard deviations. There is a matrix for each chromosome, with each matrix having dimensions of number of scans x number of windows. If return.mean=TRUE, two lists of matrices are returned, one with standard deviations and one with means.
medianSdOverAutosomes returns a data frame with columns "scanID" and "med.sd" containing the median standard deviations over all autosomes for each scan.
meanSdByChromWindow returns a list of matrices, one for each chromosome. Each matrix contains two columns called "Mean" and "SD", containing the mean and SD of the BAlleleFreq standard deviations over scans for each bin. For the X chromosome the matrix has four columns "Female Mean", "Male Mean", "Female SD" and "Male SD".
findBAFvariance returns a matrix with columns "scanID", "chromosome", "bin", and "sex" containing those scan by chromosome combinations with BAlleleFreq standard deviations greater than those specified by sd.threshold.
sdByScanChromWindow calculates the standard deviation of BAlleleFreq (or LogRRatio) values across chromosomes 1-22 and chromosome X for a specified number of 'bins' in each chromosome, as passed to the function in the 'nbins' argument. The standard deviation is calculated using windows of width equal to 2 bins, and the window moves along the chromosome by an offset of 1 bin (or half a window). Thus, there will be a total of nbins-1 windows per chromosome. If nbins=NULL (the default), there will be 2 bins (one window) for each chromosome.
medianSdOverAutosomes calculates the median over autosomes of BAlleleFreq (or LogRRatio) standard deviations calculated for sliding windows within each chromosome of each scan. The standard deviations should be a list with one element for each chromosome, and each element consisting of a matrix with scans as rows.
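The windowing scheme and the per-scan median described above can be sketched in base R. This is only an illustration on simulated values, not the GWASTools implementation: windowSd is a hypothetical helper, the equal-count binning of SNPs is an assumption, and taking the median over all windows of all autosomes is one plausible reading of the description.

```r
# Hypothetical helper (not part of GWASTools): SD in sliding windows over
# the values of one chromosome for one scan, SNPs assumed in position order.
windowSd <- function(x, nbins = 2) {
  # Assumption: bins hold roughly equal numbers of SNPs.
  bin <- ceiling(seq_along(x) * nbins / length(x))
  # Window w covers bins w and w + 1, so there are nbins - 1 windows.
  sapply(seq_len(nbins - 1), function(w) {
    sd(x[bin %in% c(w, w + 1)], na.rm = TRUE)
  })
}

set.seed(1)
n.scans <- 3
n.snps <- 80
nbins <- 8

# Simulated BAF-like values: one scans x SNPs matrix per chromosome.
vals <- list("21" = matrix(runif(n.scans * n.snps), nrow = n.scans),
             "22" = matrix(runif(n.scans * n.snps), nrow = n.scans))

# One scans x windows matrix of standard deviations per chromosome.
sd.list <- lapply(vals, function(m) {
  do.call(rbind, lapply(seq_len(nrow(m)), function(i) windowSd(m[i, ], nbins)))
})

# Median over every window of every autosome, one value per scan.
med.sd <- apply(do.call(cbind, sd.list), 1, median, na.rm = TRUE)
med.res <- data.frame(scanID = seq_len(n.scans), med.sd = med.sd)
```

With nbins = 8 each matrix in sd.list has 7 columns, matching the nbins-1 windows stated above.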
meanSdByChromWindow calculates the mean and standard deviation over scans of BAlleleFreq standard deviations calculated for sliding windows within each chromosome of each scan. The BAlleleFreq standard deviations should be a list with one element for each chromosome, and each element consisting of a matrix containing the BAlleleFreq standard deviation for the i'th scan in the j'th bin. This is typically created using the sdByScanChromWindow function. For the X chromosome the calculations are separated out by sex.
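A minimal base R sketch of this summary step follows, using simulated standard deviations rather than real data. The helper colMeanSd and the "F"/"M" sex coding are assumptions made for illustration; the column names follow the return value described for meanSdByChromWindow.

```r
set.seed(2)

# Simulated input: 6 scans x 3 windows for one autosome and for X.
sd.list <- list("21" = matrix(runif(18, 0.02, 0.06), nrow = 6),
                "X"  = matrix(runif(18, 0.02, 0.06), nrow = 6))
sex <- c("F", "M", "F", "M", "F", "M")  # assumed coding

# Hypothetical helper: per-window mean and SD over the scans (rows) of m.
colMeanSd <- function(m) {
  cbind(Mean = colMeans(m, na.rm = TRUE),
        SD = apply(m, 2, sd, na.rm = TRUE))
}

# Autosomes: all scans are pooled.
autosome.res <- colMeanSd(sd.list[["21"]])

# X chromosome: females and males are summarized separately.
f <- colMeanSd(sd.list[["X"]][sex == "F", , drop = FALSE])
m <- colMeanSd(sd.list[["X"]][sex == "M", , drop = FALSE])
x.res <- cbind("Female Mean" = f[, "Mean"], "Male Mean" = m[, "Mean"],
               "Female SD" = f[, "SD"], "Male SD" = m[, "SD"])
```

Each result has one row per window, so it can be compared directly against the columns of the per-scan matrices.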
findBAFvariance determines which chromosomes of which scans have regions which are at least a given number of SDs from the mean, using BAlleleFreq means and standard deviations calculated from sliding windows over each chromosome by scan.
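The flagging rule can be illustrated for a single chromosome with simulated values. This sketch assumes a one-sided test (only unusually high standard deviations are flagged) and uses made-up scan IDs; it is not the GWASTools code and omits the sex-specific handling of the X chromosome.

```r
set.seed(3)

# Simulated per-scan SDs for one chromosome: 20 scans x 3 windows.
sd.mat <- matrix(runif(60, 0.03, 0.05), nrow = 20)
sd.mat[4, 2] <- 0.20          # plant one unusually noisy window
scanID <- 101:120             # hypothetical scan identifiers
sd.threshold <- 2

# Per-window mean and SD over scans.
win.mean <- colMeans(sd.mat)
win.sd <- apply(sd.mat, 2, sd)

# Number of SDs each scan/window value lies above its window mean.
z <- sweep(sweep(sd.mat, 2, win.mean, "-"), 2, win.sd, "/")

# Keep the scan/window pairs that exceed the threshold.
hits <- which(z > sd.threshold, arr.ind = TRUE)
flagged <- data.frame(scanID = scanID[hits[, "row"]],
                      bin = hits[, "col"])
```

The planted value in row 4, window 2 is among the flagged rows; a smaller sd.threshold flags more scan/window pairs.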
IntensityData, GenotypeData, BAFfromClusterMeans, BAFfromGenotypes
library(GWASdata)
data(illuminaScanADF)

# Open the BAF/LRR intensity file and attach the scan annotation
blfile <- system.file("extdata", "illumina_bl.gds", package="GWASdata")
bl <- GdsIntensityReader(blfile)
blData <- IntensityData(bl, scanAnnot=illuminaScanADF)

# Open the genotype file for the same scans
genofile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
geno <- GdsGenotypeReader(genofile)
genoData <- GenotypeData(geno, scanAnnot=illuminaScanADF)

# BAlleleFreq standard deviation in sliding windows, per scan and chromosome
nbins <- rep(8, 3) # need bins for chromosomes 21,22,23
baf.sd <- sdByScanChromWindow(blData, genoData, nbins=nbins)

close(blData)
close(genoData)

# Median standard deviation over autosomes for each scan
med.res <- medianSdOverAutosomes(baf.sd)

# Mean and SD over scans for each window, then flag high-variance windows
sex <- illuminaScanADF$sex
sd.res <- meanSdByChromWindow(baf.sd, sex)
var.res <- findBAFvariance(sd.res, baf.sd, sex, sd.threshold=2)