targetG:
Computation of target G ('knowledge-based constant correlation model').
Description
The $p \times p$ target G is computed from the
$n \times p$ data matrix.
It is defined as follows ($i,j = 1,\ldots,p$):
$$t_{ij}=\left\{
\begin {array} {ll}
s_{ii}\;&\mbox{if}\;i=j\\
\bar{r}\sqrt{s_{ii}s_{jj}}\;&\mbox{if}\;i\neq j, i\sim j\\
0\;&\mbox{otherwise}
\end{array}
\right.$$ where $\bar{r}$
is the average of the sample correlations and $s_{ij}$ denotes the entry of
the unbiased covariance matrix in row $i$, column $j$. The
notation $i \sim j$ means that genes
$i$ and $j$ are connected, i.e. genes $i$ and $j$
are in the same gene functional group.
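The definition above can be sketched in base R. This is only an illustration of the formula, not the code used by the package: `target_g_sketch` is a hypothetical name, $\bar{r}$ is assumed to be the mean over all pairs $i < j$ (averaging only over connected pairs is another plausible reading), and two genes are assumed to be connected when they share at least one group name.

```r
# Hypothetical helper, NOT part of SHIP: builds target G from the definition.
target_g_sketch <- function(x, genegroups) {
  p <- ncol(x)
  s <- cov(x)                     # unbiased sample covariance (divides by n - 1)
  r <- cor(x)                     # sample correlations
  rbar <- mean(r[upper.tri(r)])   # assumption: average over all pairs i < j

  # connected[i, j] is TRUE when genes i and j share at least one group
  connected <- matrix(FALSE, p, p)
  for (i in seq_len(p - 1)) {
    gi <- unlist(genegroups[[i]])
    gi <- gi[!is.na(gi)]
    if (length(gi) == 0) next     # gene i belongs to no group
    for (j in (i + 1):p) {
      gj <- unlist(genegroups[[j]])
      if (any(gi %in% gj)) connected[i, j] <- connected[j, i] <- TRUE
    }
  }

  sd_prod <- sqrt(outer(diag(s), diag(s)))   # sqrt(s_ii * s_jj)
  tar <- ifelse(connected, rbar * sd_prod, 0)
  diag(tar) <- diag(s)                       # t_ii = s_ii
  tar
}

# Made-up toy input: 10 samples, 4 genes, the last gene in no group
set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)
genegroups <- list(list("pathA"), list("pathA", "pathB"), list("pathB"), NA)
tar <- target_g_sketch(x, genegroups)
```

In this toy input genes 1 and 3 share no group, so `tar[1, 3]` is zero, and every off-diagonal entry in row 4 is zero because gene 4 has no group.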
Usage
targetG(x, genegroups)
Arguments
x
An $n \times p$ data matrix.
genegroups
A list of genes obtained using the KEGG database, where each entry is itself a list of the names of the pathways
this gene belongs to. If a gene does not belong to any gene functional group, the entry is NA.
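As an illustration of the expected shape of `genegroups`, the sketch below builds such a list from a pathway-to-genes mapping. The gene and pathway names are made up (not real KEGG data), and it is an assumption here that the entries follow the column order of `x`.

```r
# Hypothetical pathway -> genes mapping (made-up names, not real KEGG data)
pathways <- list(pathA = c("g1", "g2"), pathB = c("g2", "g3"))
genes <- c("g1", "g2", "g3", "g4")   # assumed to match the columns of x

# For each gene, collect the names of the pathways that contain it;
# genes that appear in no pathway get NA.
genegroups <- lapply(genes, function(g) {
  hits <- names(pathways)[vapply(pathways, function(m) g %in% m, logical(1))]
  if (length(hits) == 0) NA else as.list(hits)
})
names(genegroups) <- genes
```

Here `genegroups$g2` lists both pathways, while `genegroups$g4` is `NA`.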
Value
A $p \times p$ matrix.
References
J. Schaefer and K. Strimmer, 2005. A shrinkage approach to large-scale
covariance matrix estimation and implications for functional genomics.
Statist. Appl. Genet. Mol. Biol. 4:32.
M. Jelizarow, V. Guillemot, A. Tenenhaus, K. Strimmer, A.-L. Boulesteix, 2010.
Over-optimism in bioinformatics: an illustration. Bioinformatics. Accepted.
Examples
# A short example on a toy dataset
# require(SHIP)
data(expl)
attach(expl)
tar <- targetG(x,genegroups)
which(tar[upper.tri(tar)]!=0) # not many non zero coefficients !