Usage
dcRWRpredict(data, g, output.file = NULL,
    ontology = c(NA, "GOBP", "GOMF", "GOCC", "DO", "HPPA", "HPMI",
    "HPON", "MP", "EC", "KW", "UP"),
    method = c("indirect", "direct"),
    normalise = c("laplacian", "row", "column", "none"), restart = 0.75,
    normalise.affinity.matrix = c("none", "quantile"),
    leave.one.out = T, propagation = c("max", "sum"),
    scale.method = c("log", "linear", "none"), parallel = TRUE,
    multicores = NULL, verbose = T, RData.ontology.customised = NULL,
    RData.location =
    "https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR")
Arguments
data
an input gene-term data matrix containing known annotations
used as seeds. Values in the input matrix do not have to be binary:
non-zero values are used as weights, and should be non-negative for
easy interpretation. Alternatively, data can be a list, with each
element containing the known annotated genes
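A minimal sketch of the two input shapes described above. The gene and
term identifiers are made up for illustration, and both the matrix
orientation (genes in rows, terms in columns) and the list layout (one
element per term) are assumptions rather than documented requirements:

```r
# Hypothetical identifiers, for illustration only
genes <- c("gene1", "gene2", "gene3")
terms <- c("termA", "termB")

# Matrix form (assumed orientation: genes in rows, terms in columns).
# Non-zero entries mark known annotations and may carry
# non-negative weights.
data_matrix <- matrix(0, nrow = length(genes), ncol = length(terms),
                      dimnames = list(genes, terms))
data_matrix["gene1", "termA"] <- 1
data_matrix["gene2", "termA"] <- 0.5
data_matrix["gene3", "termB"] <- 1

# List form (assumed layout): one element per term, each holding the
# genes known to be annotated with that term
data_list <- list(termA = c("gene1", "gene2"), termB = "gene3")
```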
output.file
an output file containing predicted results. If not
NULL, a tab-delimited text file will also be written out; otherwise,
there is no output file (by default)
ontology
the ontology identity. It can be "GOBP" for Gene
Ontology Biological Process, "GOMF" for Gene Ontology Molecular
Function, "GOCC" for Gene Ontology Cellular Component, "DO" for Disease
Ontology, "HPPA" for Human Phenotype Phenotypic Abnormality, "HPMI" for
Human Phenotype Mode of Inheritance, "HPON" for Human Phenotype ONset
and clinical course, "MP" for Mammalian Phenotype, "EC" for Enzyme
Commission, "KW" for UniProtKB KeyWords, "UP" for UniProtKB UniPathway.
For details on the eligibility for pairs of input domain and ontology,
please refer to the online documentation at
http://supfam.org/dcGOR/docs.html. If NA, then the user has to
input a customised RData-formatted file (see
RData.ontology.customised below)
method
the method used to calculate RWR. It can be 'direct' for
directly applying RWR, or 'indirect' for indirectly applying RWR (first
pre-compute the affinity matrix and then derive the affinity score)
normalise
the way to normalise the adjacency matrix of the input
graph. It can be 'laplacian' for Laplacian normalisation, 'row' for
row-wise normalisation, 'column' for column-wise normalisation, or
'none' for no normalisation
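The three normalisations can be sketched in base R as follows. This
assumes the commonly used definitions (row-stochastic, column-stochastic
and the symmetric form D^{-1/2} A D^{-1/2}); whether the function uses
exactly these formulas internally is an assumption, and the toy graph is
hypothetical:

```r
# Toy undirected graph on 3 nodes (hypothetical data)
A <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE)
deg <- rowSums(A)  # node degrees

# 'row': divide each row by its sum, so every row sums to 1
A_row <- diag(1 / deg) %*% A

# 'column': divide each column by its sum, so every column sums to 1
A_col <- A %*% diag(1 / colSums(A))

# 'laplacian' (assumed symmetric form): D^{-1/2} A D^{-1/2}
A_lap <- diag(1 / sqrt(deg)) %*% A %*% diag(1 / sqrt(deg))
```

Unlike the row-wise and column-wise forms, the symmetric form keeps the
normalised matrix symmetric for an undirected graph.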
restart
the restart probability used for RWR. The restart
probability takes a value between 0 and 1, controlling how far from the
starting nodes/seeds the walker will explore. The higher the
value, the more likely the walker is to visit the nodes close to the
starting nodes. At the extreme when the restart probability is zero,
the walker moves freely to the neighbors at each step without
restarting from seeds, i.e., following a random walk (RW)
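The effect of the restart probability can be illustrated with the
standard RWR iteration p_{t+1} = (1 - r) W p_t + r p_0, where W is a
column-normalised adjacency matrix and p_0 the seed distribution. The
helper rwr() and the toy path graph below are hypothetical and are not
part of the package; this is a sketch of the general technique, not the
function's own implementation:

```r
# Hypothetical helper: iterate RWR until the scores stop changing
rwr <- function(W, p0, restart = 0.75, tol = 1e-10, max_iter = 1000) {
  p <- p0
  for (i in seq_len(max_iter)) {
    p_next <- (1 - restart) * as.vector(W %*% p) + restart * p0
    if (sum(abs(p_next - p)) < tol) break
    p <- p_next
  }
  p_next
}

# Toy path graph 1 - 2 - 3 - 4, with node 1 as the only seed
A <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)
W <- sweep(A, 2, colSums(A), "/")  # column-normalised transition matrix
p0 <- c(1, 0, 0, 0)

p_high <- rwr(W, p0, restart = 0.75)  # walker stays close to the seed
p_low  <- rwr(W, p0, restart = 0.25)  # walker explores further away
```

With the higher restart probability, more of the score stays on the seed
node and less reaches the far end of the path.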
normalise.affinity.matrix
the way to normalise the output
affinity matrix. It can be 'none' for no normalisation, 'quantile' for
quantile normalisation to ensure that columns (if multiple) of the
output affinity matrix have the same quantiles
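Quantile normalisation can be sketched in base R as below. The helper
quantile_normalise() is hypothetical and only shows the general
technique (each column is mapped onto the mean of the sorted columns);
it is not necessarily how the function implements this option:

```r
# Hypothetical helper: give every column the same set of quantiles
quantile_normalise <- function(x) {
  # Rank of each value within its column (ties broken by position)
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of each quantile across columns
  ref <- rowMeans(apply(x, 2, sort))
  out <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(x)
  out
}

# Toy affinity matrix with 4 nodes and 3 columns (hypothetical values)
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8), ncol = 3)
q <- quantile_normalise(x)
```

After normalisation each column contains the same values, arranged
according to the original within-column ordering.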
leave.one.out
logical to indicate whether the leave-one-out test
is used for predictions. By default, it is set to TRUE, so that the
leave-one-out test is performed (that is, known seeds are removed)
propagation
how to propagate the score. It can be "max" for
retaining the maximum score (by default), "sum" for additively
accumulating the score
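The difference between the two options can be sketched as follows. It
is assumed here, for illustration only, that a gene's scores for child
terms are combined into a score for their shared parent term; the term
names and values are hypothetical:

```r
# Hypothetical scores of one gene for two child terms of one parent
child_scores <- c(childA = 0.2, childB = 0.7)

# "max": the parent retains the largest score among its children
parent_max <- max(child_scores)

# "sum": the parent accumulates the scores of all its children
parent_sum <- sum(child_scores)
```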
scale.method
the method used to scale the predictive scores. It
can be: "none" for no scaling, "linear" for linear scaling into
the range between 0 and 1, or "log" for the same as "linear" but with
the scores log-transformed before being scaled. The scaling between 0
and 1 is done via: $\frac{S - S_{min}}{S_{max} - S_{min}}$, where
$S_{min}$ and $S_{max}$ are the minimum and maximum values for
$S$
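The formula above can be applied in base R as in this sketch. The
helper scale01() and the scores are hypothetical, the scores are
assumed to be strictly positive so that the logarithm is finite, and the
natural logarithm is used (the base does not affect the result, since a
change of base cancels in the min-max formula):

```r
# Hypothetical helper implementing (S - S_min) / (S_max - S_min)
scale01 <- function(s) (s - min(s)) / (max(s) - min(s))

scores <- c(0.001, 0.01, 0.1, 1)  # hypothetical positive scores

linear_scaled <- scale01(scores)       # "linear"
log_scaled    <- scale01(log(scores))  # "log": transform, then scale
```

On scores spanning several orders of magnitude, the "log" variant
spreads small values more evenly over the range between 0 and 1.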
parallel
logical to indicate whether parallel computation with
multicores is used. By default, it is set to TRUE, but parallel
computation is not necessarily used: the available parallel backends
are system-specific (currently only Linux or Mac OS), and the two
packages "foreach" and "doMC" must be installed. They can be installed
via:
source("http://bioconductor.org/biocLite.R");
biocLite(c("foreach","doMC"))
If they are not yet installed, this option will be disabled
multicores
an integer to specify how many cores will be
registered as the multicore parallel backend to the 'foreach' package.
If NULL, it will use half of the cores available on the user's computer.
This option only works when parallel computation is enabled
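A rough sketch of what "half of the cores" could mean in practice,
using detectCores() from the base 'parallel' package. The rounding and
the fallback to a single core are assumptions made for this example,
not documented behaviour of the function:

```r
# Number of cores reported by the system (may be NA on some platforms)
n_all <- parallel::detectCores()

# Assumed rule: half of the cores, rounded down, but at least one
n_use <- if (is.na(n_all)) 1L else max(1L, as.integer(floor(n_all / 2)))
```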
verbose
logical to indicate whether the messages will be
displayed on the screen. By default, it is set to TRUE for display
RData.ontology.customised
a file name for an RData-formatted file
containing an object of S4 class 'Onto' (i.e. ontology). By default, it
is NULL. It is only needed when the user wants to perform customised
analysis using their own ontology. See dcBuildOnto for how to create
this object
RData.location
a character string specifying the location of built-in
RData files. See dcRDataLoader for details