targetG: Computation of target G ('knowledge-based constant correlation model').
Description
The \(p \times p\) target G is computed from the \(n \times p\) data matrix. It is defined as follows (\(i,j = 1,...,p\)):
$$t_{ij} =
\begin{cases}
s_{ii} & \text{ if } i=j\\
\bar{r}\sqrt{s_{ii}s_{jj}} & \text{ if } i\neq j, i\sim j\\
0 & \text{ otherwise}
\end{cases}$$
where \(\bar{r}\) is the average of the sample correlations and \(s_{ij}\)
denotes the entry of the unbiased covariance matrix in row \(i\), column
\(j\). The notation \(i\sim j\) means that genes \(i\) and \(j\) are
connected, i.e. genes \(i\) and \(j\) are in the same gene functional group.
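The definition can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: `target_g_sketch` and the logical matrix `connected` are hypothetical names, \(\bar{r}\) is assumed to be the mean over all pairs \(i < j\), and unconnected pairs are assumed to be set to zero.

```r
# Hypothetical helper, not part of the SHIP package: builds the target from a
# data matrix x and a logical p x p matrix marking connected gene pairs.
target_g_sketch <- function(x, connected) {
  s <- cov(x)                       # unbiased sample covariance (divisor n - 1)
  r <- cor(x)                       # sample correlations
  rbar <- mean(r[upper.tri(r)])     # assumption: average over all pairs i < j
  sdev <- sqrt(diag(s))
  tar <- rbar * outer(sdev, sdev)   # rbar * sqrt(s_ii * s_jj) for every pair
  tar[!connected] <- 0              # assumption: unconnected pairs are zero
  diag(tar) <- diag(s)              # sample variances on the diagonal
  tar
}

# Toy data: 10 observations of 4 genes, where only genes 1 and 2 share a group.
set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)
connected <- matrix(FALSE, 4, 4)
connected[1, 2] <- connected[2, 1] <- TRUE
tar <- target_g_sketch(x, connected)
```

In this toy case the only non-zero off-diagonal entries of `tar` are those for the pair (1, 2).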
Usage
targetG(x, genegroups)
Value
A \(p \times p\) matrix.
Arguments
x
A \(n \times p\) data matrix.
genegroups
A list of genes obtained from the KEGG database, where
each entry is itself a list of the names of the pathways the gene belongs to.
If a gene does not belong to any gene functional group, its entry is NA.
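As an illustration of this structure, the sketch below builds a toy list of the same shape and derives which gene pairs are connected. The gene and pathway names are made up, and treating two genes as connected when they share at least one pathway name is an assumed reading of "in the same gene functional group".

```r
# Toy stand-in for a KEGG-derived list; all names here are invented.
genegroups <- list(
  geneA = list("pathway1", "pathway2"),
  geneB = list("pathway2"),
  geneC = NA,                       # gene without any functional group
  geneD = list("pathway3")
)

# Mark a pair as connected when the two genes share at least one pathway.
p <- length(genegroups)
connected <- matrix(FALSE, p, p,
                    dimnames = list(names(genegroups), names(genegroups)))
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    gi <- unlist(genegroups[[i]])
    gj <- unlist(genegroups[[j]])
    shared <- intersect(gi[!is.na(gi)], gj[!is.na(gj)])  # NA entries drop out
    connected[i, j] <- connected[j, i] <- length(shared) > 0
  }
}
```

Here only `geneA` and `geneB` end up connected, through the shared `pathway2`.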
Author
Monika Jelizarow and Vincent Guillemot
References
J. Schaefer and K. Strimmer, 2005. A shrinkage
approach to large-scale covariance matrix estimation and implications for
functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32.
M. Jelizarow, V. Guillemot, A. Tenenhaus, K. Strimmer, A.-L. Boulesteix, 2010.
Over-optimism in bioinformatics: an illustration. Bioinformatics. Accepted.
Examples
# A short example on a toy dataset
library(SHIP)
data(expl)
attach(expl)
tar <- targetG(x, genegroups)
which(tar[upper.tri(tar)] != 0) # not many non-zero coefficients!