network: Relevance Network for (r)CCA and (s)PLS regression

Description

Displays a relevance association network for (regularized) canonical correlation analysis and (sparse) PLS regression.

Usage

network(mat,
comp = NULL,
blocks = c(1,2),
threshold = NULL,
row.names = TRUE,
col.names = TRUE,
block.var.names = TRUE,
color.node = NULL,
shape.node = NULL,
cex.node.name = 1,
color.edge = color.GreenRed(100),
lty.edge = "solid",
lwd.edge = 1,
show.edge.labels = FALSE,
cex.edge.label = 1,
show.color.key = TRUE,
symkey = TRUE,
keysize = c(1, 1),
breaks,
interactive = FALSE,
layout.fun = NULL,
save = NULL,
name.save = NULL)

Arguments

mat

numeric matrix of values to be represented.

comp

a single positive integer or a vector of positive integers. The components used to account for the data association. Defaults to comp = 1.

threshold

numeric value between 0 and 1. The tuning threshold for the relevant associations network (see Details).

row.names, col.names

character vector containing the names of $X$- and $Y$-variables.

color.node

vector of length two, the colors of the $X$ and $Y$ nodes (see Details).

shape.node

character vector of length two, the shape of the $X$ and $Y$ nodes (see Details).

color.edge

vector of colors, or character string specifying the color function to use to color the edges. Defaults to color.GreenRed(100), but other palettes can be chosen (see Details and Examples).

lty.edge

character vector of length two, the line type for the edges (see Details).

lwd.edge

vector of length two, the line width of the edges (see Details).

show.edge.labels

logical. If TRUE, plot association values as edge labels (defaults to FALSE).

show.color.key

boolean. If TRUE a color key should be plotted.

symkey

boolean indicating whether the color key should be made symmetric about 0. Defaults to TRUE.

keysize

numeric value indicating the size of the color key.

breaks

(optional) either a numeric vector indicating the splitting points for binning mat into colors, or an integer number of break points to be used, in which case the break points will be spaced equally between min(mat) and max(mat).

interactive

logical. If TRUE, a scrollbar is created to change the threshold value interactively (defaults to FALSE). See Details.

save

should the plot be saved? If so, set this argument to one of 'jpeg', 'tiff', 'png' or 'pdf'.

name.save

character string giving the name of the saved file.

cex.edge.label

the font size for the edge labels.

cex.node.name

the font size for the node labels.

blocks

a vector indicating the block variables to display.

block.var.names

either a list of vector components for variable names in each block or FALSE for no names. If TRUE, the columns names of the blocks are used as names.

layout.fun

a function. It specifies how the vertices will be placed on the graph. See help(layout) in the igraph package. Defaults to layout.fruchterman.reingold.

Value

network returns a list containing the following components:

M

the correlation matrix used by network.

gR

a graph object (see the igraph package).

Warning

If the number of variables is high, generating the network can take several seconds.

Details

network makes it possible to infer large-scale association networks between the $X$ and $Y$ datasets in rcc or spls. The output is a graph where each $X$- and $Y$-variable corresponds to a node, and the edges included in the graph portray associations between them.

In rcc, to identify $X$-$Y$ pairs showing relevant associations, network calculates a similarity measure between $X$ and $Y$ variables in a pairwise manner: the scalar product between every pair of vectors of dimension length(comp) representing the variables $X$ and $Y$ on the axes defined by $Z_i$ with $i$ in comp, where $Z_i$ is the equiangular vector between the $i$-th $X$ and $Y$ canonical variates.
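As a rough illustration of this similarity measure, the following sketch uses base R's cancor (classical, unregularized CCA) on simulated data instead of rcc. It assumes that each variable is represented by its correlations with the vectors $Z_i$, and that $Z_i$ can be taken as the sum of the $i$-th $X$ and $Y$ canonical variates. These choices, the simulated data and all object names are assumptions made for illustration; they are not taken from the network source code.

```r
# Illustrative sketch only: classical CCA via stats::cancor on simulated data
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
Y <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("y", 1:3)))
Y[, 1] <- X[, 1] + rnorm(n, sd = 0.3)   # build in one strong X-Y association

cc   <- cancor(X, Y)
comp <- 1:2                              # components used for the representation

# Canonical variates for the selected components
U <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef[, comp, drop = FALSE]
V <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef[, comp, drop = FALSE]

# Assumption: the variates have equal norm, so their sum bisects the angle
# between them and can stand in for the equiangular vector Z_i
Z <- U + V

# Assumption: variables are represented by their correlations with each Z_i
coord.X <- cor(X, Z)                     # 4 x length(comp)
coord.Y <- cor(Y, Z)                     # 3 x length(comp)

# Pairwise scalar products between X- and Y-variable vectors
M <- coord.X %*% t(coord.Y)              # 4 x 3 similarity matrix
round(M, 2)
```

Each entry of M is the scalar product of two vectors of dimension length(comp), one per variable, which is the quantity the paragraph above describes.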

In spls, if object$mode is regression, the similarity measure between $X$ and $Y$ variables is given by the scalar product between every pair of vectors of dimension length(comp) representing the variables $X$ and $Y$ on the axes defined by $U_i$ with $i$ in comp, where $U_i$ is the $i$-th $X$ variate. If object$mode is canonical, then $X$ and $Y$ are represented on the axes defined by $U_i$ and $V_i$ respectively.

Variable pairs with a high similarity measure (in absolute value) are considered as relevant. By changing the threshold, one can tune the relevance of the associations to include or exclude relationships in the network.
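The effect of the threshold can be sketched in base R on a small, made-up similarity matrix. The matrix values and object names below are hypothetical, and the sketch assumes that a pair is kept when its absolute similarity is at or above the threshold; the exact comparison used by network may differ.

```r
# Hypothetical similarity matrix between 2 X-variables and 3 Y-variables
M <- matrix(c( 0.9, -0.2, 0.5,
              -0.7,  0.1, 0.65),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("x1", "x2"), c("y1", "y2", "y3")))

threshold <- 0.6

# Assumption: keep pairs whose absolute similarity reaches the threshold
keep <- which(abs(M) >= threshold, arr.ind = TRUE)

# One row per retained edge, with its signed weight
edges <- data.frame(X      = rownames(M)[keep[, "row"]],
                    Y      = colnames(M)[keep[, "col"]],
                    weight = M[keep],
                    stringsAsFactors = FALSE)
edges
```

Raising the threshold removes weaker pairs from the edge list, and lowering it adds them back.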

Setting interactive = TRUE opens two devices, one for the association network and one for the scrollbar, and starts an interactive process: the threshold is changed by clicking either end (`$-$' or `$+$') of the scrollbar or its middle portion. The position of the slider indicates the `threshold' value associated with the displayed network.

The interactive process is terminated by clicking the second button and selecting `Stop' from the menu, or from the `Stop' menu on the graphics window.

color.node is a vector of length two, using any of the three kinds of R colors, i.e., either a color name (an element of colors()), a hexadecimal string of the form "#rrggbb", or an integer i meaning palette()[i]. color.node[1] and color.node[2] give the fill color for the nodes of the $X$- and $Y$-variables respectively. Defaults to c("white", "white").

color.edge gives the colors of the edges, corresponding to the values in mat. Defaults to color.GreenRed(100) for negative (green) and positive (red) correlations. Other color palettes are also provided, such as color.jet and color.spectral; see the help for those functions and the examples below. Other color palettes, such as those in the grDevices package, can be used too.
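Since color.edge accepts a vector of colors, a custom palette can be built with base R. The sketch below uses grDevices::colorRampPalette; the blue-white-red choice and the name edge.palette are only illustrative, and the commented call assumes an rcc result named nutri.res as in the Examples.

```r
# Build a 100-color diverging palette: blue (negative) to red (positive)
edge.palette <- grDevices::colorRampPalette(c("blue", "white", "red"))(100)
head(edge.palette, 3)

# Hypothetical use with a fitted object (requires mixOmics and nutri.res):
# network(nutri.res, comp = 1:3, threshold = 0.6, color.edge = edge.palette)
```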

shape.node[1] and shape.node[2] give the shape of the nodes associated with the $X$- and $Y$-variables respectively. Currently accepted values are "circle" and "rectangle". Defaults to c("circle", "rectangle").

lty.edge[1] and lty.edge[2] give the line type of edges with positive and negative weight respectively. Each can be one of "solid", "dashed", "dotted", "dotdash", "longdash" and "twodash". Defaults to c("solid", "solid").

lwd.edge[1] and lwd.edge[2] give the line width of edges with positive and negative weight respectively. This attribute is of type double, with a default of c(1, 1).

References

Gonzalez, I., Le Cao, K-A., Davis, M.J. and Dejean, S. (2012). Visualising associations between paired 'omics' data sets. BioData Mining 5:19. http://www.biodatamining.org/content/5/1/19/abstract

Butte, A. J., Tamayo, P., Slonim, D., Golub, T. R. and Kohane, I. S. (2000). Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proceedings of the National Academy of Sciences of the USA 97, 12182-12186.

Moriyama, M., Hoshida, Y., Otsuka, M., Nishimura, S., Kato, N., Goto, T., Taniguchi, H., Shiratori, Y., Seki, N. and Omata, M. (2003). Relevance Network between Chemosensitivity and Transcriptome in Human Hepatoma Cells. Molecular Cancer Therapeutics 2, 199-205.

Examples

## network representation for objects of class 'rcc'
library(mixOmics)
data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene
nutri.res <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)

# may not work on the Linux version, use Windows instead
# sometimes with Rstudio might not work because of margin issues,
# in that case save it as an image
jpeg('example1-network.jpeg', res = 600, width = 4000, height = 4000)
network(nutri.res, comp = 1:3, threshold = 0.6)
dev.off()

## Changing the attributes of the network
# sometimes with Rstudio might not work because of margin issues,
# in that case save it as an image
jpeg('example2-network.jpeg')
network(nutri.res, comp = 1:3, threshold = 0.45,
        color.node = c("mistyrose", "lightcyan"),
        shape.node = c("circle", "rectangle"),
        color.edge = color.jet(100),
        lty.edge = "solid", lwd.edge = 2,
        show.edge.labels = FALSE)
dev.off()

## interactive 'threshold'
network(nutri.res, comp = 1:3, threshold = 0.55, interactive = TRUE)
## select the 'threshold' and "see" the new network

## network representation for objects of class 'spls'
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
toxicity.spls <- spls(X, Y, ncomp = 3, keepX = c(50, 50, 50),
                      keepY = c(10, 10, 10))
# sometimes with Rstudio might not work because of margin issues,
# in that case save it as an image
jpeg('example3-network.jpeg')
network(toxicity.spls, comp = 1:3, threshold = 0.8,
        color.node = c("mistyrose", "lightcyan"),
        shape.node = c("rectangle", "circle"),
        color.edge = color.spectral(100),
        lty.edge = "solid", lwd.edge = 1,
        show.edge.labels = FALSE, interactive = FALSE)
dev.off()
