estimateCellCounts(rgSet, compositeCellType = "Blood", processMethod = "auto", probeSelect = "auto", cellTypes = c("CD8T","CD4T", "NK","Bcell","Mono","Gran"), returnAll = FALSE, meanPlot = FALSE, verbose = TRUE, ...)
rgSet: The input RGChannelSet for the procedure.
processMethod: The default, "auto", uses preprocessQuantile for Blood and DLPFC and preprocessNoob otherwise, in line with the existing literature. To override it, set it to the name of a preprocessing function as a character string, such as "preprocessFunnorm".
...: Passed to preprocessQuantile.
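By default, the function returns a matrix of composition estimates across all samples and cell types. If returnAll=TRUE, it returns a list containing that matrix of estimates, a composition table, and the normalized user data in the form of a GenomicMethylSet.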
This is an implementation of the Houseman et al. (2012) regression calibration algorithm for the Illumina 450k microarray, used to deconvolute heterogeneous tissue sources like blood.
For example, this function will take an RGChannelSet from a DNA methylation (DNAm) study of blood and return the relative proportions of CD4+ and CD8+ T cells, natural killer cells, monocytes, granulocytes, and B cells in each sample.
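To make the idea of deconvolution concrete, the following is a minimal base R sketch, not minfi's internal implementation. It assumes a hypothetical reference matrix of sorted cell-type profiles and a simulated mixed sample, and it recovers the mixing proportions by box-constrained least squares with `optim`. All data and variable names here are made up for illustration.

```r
# Toy illustration of reference-based deconvolution (hypothetical data,
# not minfi's internal implementation).
set.seed(1)

# Hypothetical reference profiles: 50 probes x 3 cell types (beta values).
ref <- matrix(runif(50 * 3), nrow = 50,
              dimnames = list(NULL, c("CD4T", "Mono", "Gran")))

# Simulate one mixed sample with known proportions plus a little noise.
true_prop <- c(CD4T = 0.2, Mono = 0.1, Gran = 0.7)
mixed <- as.vector(ref %*% true_prop) + rnorm(50, sd = 0.01)

# Squared error between the observed profile and a weighted mix of references.
sse <- function(w) sum((mixed - ref %*% w)^2)

# Box-constrained least squares: each proportion must lie in [0, 1].
fit <- optim(par = rep(1 / 3, 3), fn = sse, method = "L-BFGS-B",
             lower = 0, upper = 1)
est_prop <- setNames(fit$par, colnames(ref))
round(est_prop, 2)
```

The estimated proportions should land close to `true_prop`. With real data, the reference profiles would come from the sorted samples in a reference package rather than from random numbers.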
The function currently supports cell composition estimation for blood, cord blood, and the frontal cortex, through compositeCellType values of "Blood", "CordBlood", and "DLPFC", respectively.
Packages containing the appropriate reference data should be installed before running the function for the first time ("FlowSorted.Blood.450k", "FlowSorted.DLPFC.450k", "FlowSorted.CordBlood.450k").
Each tissue supports the estimation of different cell types, specified via the cellTypes argument.
For blood, these are "Bcell", "CD4T", "CD8T", "Eos", "Gran", "Mono", "Neu", and "NK" (though the default value for cellTypes is often sufficient).
For cord blood, these are "Bcell", "CD4T", "CD8T", "Gran", "Mono", "Neu", and "nRBC".
For frontal cortex, these are "NeuN_neg" and "NeuN_pos".
See the documentation of the individual reference packages for more details.
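As a quick sanity check before a long run, the tissue-to-cell-type mapping listed above can be written down as a lookup table. The sketch below is base R only. The helper `unsupported_cell_types` is hypothetical and not part of minfi; it simply reports requested cell types that are not listed for the chosen tissue.

```r
# Supported cell types per tissue, copied from the text above.
supported_cell_types <- list(
  Blood     = c("Bcell", "CD4T", "CD8T", "Eos", "Gran", "Mono", "Neu", "NK"),
  CordBlood = c("Bcell", "CD4T", "CD8T", "Gran", "Mono", "Neu", "nRBC"),
  DLPFC     = c("NeuN_neg", "NeuN_pos")
)

# Hypothetical helper (not part of minfi): returns the requested cell types
# that are NOT listed for the given tissue; character(0) means all are valid.
unsupported_cell_types <- function(compositeCellType, cellTypes) {
  stopifnot(compositeCellType %in% names(supported_cell_types))
  setdiff(cellTypes, supported_cell_types[[compositeCellType]])
}

unsupported_cell_types("DLPFC", c("NeuN_neg", "NeuN_pos"))
unsupported_cell_types("CordBlood", c("CD4T", "NK"))
```

If the check comes back empty, the matching call would look like `estimateCellCounts(rgSet, compositeCellType = "DLPFC", cellTypes = c("NeuN_neg", "NeuN_pos"))`, assuming the corresponding reference package is installed.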
The meanPlot argument should be used to check for large batch effects in the data, which reduce the confidence placed in the composition estimates.
This plot depicts the average DNA methylation across the cell-type discriminating probes in both the provided and sorted data.
The means from the provided heterogeneous samples should be within the range of the sorted samples.
If the sample means fall outside the range of the sorted means, the cell type estimates will be inflated toward the closest cell type.
Note that we quantile normalize the sorted data with the provided data to reduce these batch effects.
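The range check described above can also be expressed numerically. The following base R sketch uses made-up mean methylation values; it does not reproduce the plot drawn by meanPlot, only the comparison a reader would make when looking at it. All names and numbers are hypothetical.

```r
# Hypothetical mean methylation across the discriminating probes,
# one value per sorted cell type and one per user-provided sample.
sorted_means <- c(CD4T = 0.42, CD8T = 0.44, Bcell = 0.47,
                  Mono = 0.55, Gran = 0.58)
sample_means <- c(s1 = 0.50, s2 = 0.61, s3 = 0.45)

# Range spanned by the sorted (reference) samples.
sorted_range <- range(sorted_means)

# Flag provided samples whose mean falls outside that range; their
# composition estimates deserve less confidence.
outside <- sample_means < sorted_range[1] | sample_means > sorted_range[2]
flagged <- names(sample_means)[outside]
flagged
```

In this made-up example only `s2` lies above the largest sorted mean, so it is the one sample that would warrant a closer look for batch effects.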
See also: preprocessQuantile
## Not run:
if (require(FlowSorted.Blood.450k)) {
  wh.WBC <- which(FlowSorted.Blood.450k$CellType == "WBC")
  wh.PBMC <- which(FlowSorted.Blood.450k$CellType == "PBMC")
  RGset <- FlowSorted.Blood.450k[, c(wh.WBC, wh.PBMC)]
  ## The following line is purely to work around an issue with repeated
  ## sampleNames and Biobase::combine()
  sampleNames(RGset) <- paste(RGset$CellType,
                              c(seq_along(wh.WBC), seq_along(wh.PBMC)), sep = "_")
  counts <- estimateCellCounts(RGset, meanPlot = FALSE)
  round(counts, 2)
}
## End(Not run)