Usage
dexss(X, nclasses = 2, G = 1, alphaInit, cyc = 20,
labels, normalization = "RLE", kmeansIter = 10,
ignoreIfAllCountsSmaller = 1, theta = 2.5, minMu = 0.5,
rmax = 13, initialization = "kmeans",
multiclassPhiPoolingFunction = NULL, quiet = FALSE,
resultObject = "S4")
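As a rough illustration of the signature above, the sketch below simulates a small count matrix and calls dexss with partially known labels. The simulated data, the seed, and the choice of which samples are labelled are assumptions made for illustration only. The call is guarded so that the script also runs when the dexus package is not installed.

```r
# Simulate a small count matrix: 50 regions (rows) x 8 samples (columns).
# The data are made up purely for illustration.
set.seed(1)
X <- matrix(rnbinom(50 * 8, size = 5, mu = 100), nrow = 50, ncol = 8)
colnames(X) <- paste0("sample", 1:8)

# Assumption: the first four samples are known to belong to the major
# condition (label 1); the other four have an unknown condition (NA).
labels <- c(rep(1, 4), rep(NA, 4))

# Only attempt the call if the dexus package is available.
if (requireNamespace("dexus", quietly = TRUE)) {
  result <- dexus::dexss(X, nclasses = 2, labels = labels,
                         normalization = "RLE", quiet = TRUE)
}
```

Columns of X are samples and rows are genomic regions, so labels needs one entry per column.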
Arguments
X
either a vector of counts or a raw data matrix,
where columns are interpreted as samples and rows as
genomic regions. An instance of "countDataSet" is also
accepted.
nclasses
The number of conditions, i.e. mixture
components. (Default = 2)
G
The weight of the prior distribution of the
mixture weights. Not used in the supervised case.
(Default = 1).
cyc
Positive integer that sets the number of
cycles of the EM algorithm. (Default = 20).
alphaInit
The initial estimates of the condition
sizes, i.e., mixture weights. Not used in the supervised
case. (Default = c(0.5, 0.5)).
labels
The labels for the classes; they are coerced
to integers. For this semi-supervised version the
known labels/conditions must be coded as integers
starting with 1. Samples with the label 1 are
considered to be in the "major condition". Samples
with unknown labels/conditions must be set to NA.
normalization
Method used for normalizing the
read counts. "RLE" is the method used by (Anders and Huber,
2010), "upperquartile" is the Upper-Quartile method by
(Bullard et al., 2010), and "none" deactivates
normalization. (Default = "RLE").
kmeansIter
number of times the K-Means algorithm
is run. (Default = 10).
ignoreIfAllCountsSmaller
Ignores transcripts for
which all read counts are smaller than this value. These
transcripts are considered "not expressed". (Default =
1).
theta
The weight of the prior on the size
parameter or inverse dispersion parameter. Theta is
adjusted to each transcript by dividing by the mean read
count of the transcript. The higher theta, the lower the
size parameter r and the higher the overdispersion will be.
(Default = 2.5).
minMu
Minimal mean for all negative binomial
distributions. (Default = 0.5).
rmax
Maximal value for the size parameter. The
inverse of this parameter is the lower bound on the
dispersion. In analogy to (Anders and Huber, 2010) we use
13 as default. (Default = 13).
initialization
Method used to find the initial
clusters. Dexus can either use the quantiles of the
read counts of each gene or run k-means on the counts.
(Default = "kmeans").
multiclassPhiPoolingFunction
In "multiClass" mode
the dispersion is either estimated across all classes at
once (NULL), or separately for each condition, i.e.,
class. The size parameters or dispersion per class are
then joined to one estimate by the mean ("mean"), minimum
("min") or maximum ("max"). In our investigations
estimation across all classes at once performed best.
(Default = NULL).
quiet
Logical that indicates whether dexus should
report the steps of the algorithm. Suppresses messages
from the program if set to TRUE. (Default = FALSE).
resultObject
Type of the result object; can either
be a list ("list") or an instance of "DEXUSResult"
("S4"). (Default = "S4").
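
The normalization, theta, and rmax entries above can be made concrete with a little arithmetic. The following base-R sketch computes median-of-ratios size factors in the style of the "RLE" method, divides theta by each transcript's mean read count, and converts rmax into the implied lower bound on the dispersion. It illustrates the descriptions above under stated assumptions and is not the package's internal code. The toy matrix and all variable names are hypothetical, and using raw rather than normalized counts for the mean is an assumption.

```r
# Toy count matrix: 4 transcripts (rows) x 3 samples (columns).
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                    50, 100, 200),
                 nrow = 4, byrow = TRUE)

# RLE-style size factors: for each sample, the median over transcripts of
# the ratio between its count and the transcript's geometric mean.
logGeoMean <- rowMeans(log(counts))
sizeFactors <- apply(counts, 2, function(s) exp(median(log(s) - logGeoMean)))

# Per-transcript prior weight: theta divided by the mean read count
# (assumption: the mean is taken over the raw counts).
theta <- 2.5
thetaPerTranscript <- theta / rowMeans(counts)

# rmax bounds the size parameter from above, so 1 / rmax is the
# lower bound on the dispersion.
rmax <- 13
minDispersion <- 1 / rmax
```

In this toy matrix every transcript has the same 1:2:4 ratio across samples, so the size factors come out as 0.5, 1, and 2. Transcripts containing zero counts would have to be excluded before taking logarithms.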