The percentile of the empirical distribution of PVAC
scores in a set of "non-expressed" genes, used to select the
filtering threshold. The default value is 0.99.
Value
A list with the following components:
aset
Names of the probesets that have passed the filter
nullset
Names of the presumably "non-expressed" probesets
(those with absent calls across all the study samples)
pvac
A named vector containing the PVAC scores of all probesets
cutoff
The PVAC cutoff value used for filtering. It is capped at 0.5 (which
corresponds to 50% of the total variation in a probeset)
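
The relationship between the returned components can be illustrated with a minimal sketch, written in Python for illustration only; it is not the package's R code. The names pvac_cutoff and apply_filter are hypothetical. Two details are assumptions rather than documented behaviour: the percentile uses linear interpolation between order statistics, and a probeset passes when its score is strictly greater than the cutoff.

```python
import math


def pvac_cutoff(null_scores, pct=0.99, cap=0.5):
    """Percentile of the null PVAC scores, capped at `cap`.

    Assumption: linear interpolation between order statistics.
    """
    if not null_scores:
        raise ValueError("the null set is empty; no cutoff can be derived")
    ordered = sorted(null_scores)
    pos = pct * (len(ordered) - 1)      # fractional index into the sorted scores
    lo, hi = math.floor(pos), math.ceil(pos)
    q = ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])
    return min(q, cap)                  # the cutoff never exceeds the cap


def apply_filter(pvac, nullset, pct=0.99):
    """Build a result shaped like the list described above.

    pvac: dict mapping probeset name -> PVAC score.
    nullset: names of probesets with absent calls in all samples.
    Assumption: a probeset passes when its score is strictly above the cutoff.
    """
    cutoff = pvac_cutoff([pvac[name] for name in nullset], pct)
    aset = [name for name, score in pvac.items() if score > cutoff]
    return {"aset": aset, "nullset": list(nullset), "pvac": pvac, "cutoff": cutoff}
```

With a null set whose scores are all high, the cap takes effect and the cutoff is 0.5 regardless of the requested percentile.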
Details
This function implements a new filtering method for Affymetrix GeneChips, based
on principal component analysis (PCA) of the probe-level expression
data. Because all the probes in a probeset are designed to target a single transcript
or a common cluster of transcripts, the measurements of the probes in a probeset
should be correlated. The degree of concordance of gene expression
among probes can be approximated by the proportion of variation
accounted for by the first principal component (PVAC). Using
a wholly defined spike-in dataset, we have shown that
filtering by PVAC provides increased sensitivity in detecting
truly differentially expressed genes while controlling false
discoveries. The filtering threshold is chosen from the PVAC
score distribution in a set of "non-expressed" genes (those with absent calls in all samples).
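
As an illustration of the PVAC score itself, the following minimal Python sketch computes it for a single probeset. It is not the package's R implementation. It assumes that each probe is treated as a variable and each sample as an observation, that probes are centered but not rescaled, and that the score is the largest eigenvalue of the probe covariance matrix divided by its trace. The function names are hypothetical, and the eigenvalue is found by plain power iteration to keep the example free of dependencies.

```python
import math


def covariance(rows):
    """Covariance matrix between probes; each row holds one probe across samples."""
    n = len(rows[0])
    centered = [[x - sum(r) / n for x in r] for r in rows]
    return [[sum(a * b for a, b in zip(ri, rj)) / (n - 1) for rj in centered]
            for ri in centered]


def largest_eigenvalue(mat, iters=200):
    """Largest eigenvalue of a symmetric positive semi-definite matrix.

    Power iteration is started from every unit vector and the best
    Rayleigh quotient is kept, so a start vector orthogonal to the
    leading eigenvector cannot hide the answer.
    """
    p = len(mat)
    best = 0.0
    for start in range(p):
        v = [1.0 if i == start else 0.0 for i in range(p)]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        best = max(best, sum(a * b for a, b in zip(v, w)))
    return best


def pvac_score(probes):
    """Proportion of total probe variation captured by the first principal component."""
    cov = covariance(probes)
    total = sum(cov[i][i] for i in range(len(cov)))  # trace = total variance
    if total == 0.0:
        return 0.0  # constant probes carry no variation to explain
    return largest_eigenvalue(cov) / total
```

Under these assumptions, probes that rise and fall together across samples score close to 1, while two uncorrelated probes with equal variance score 0.5, which is the level at which the cutoff is capped.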