How to measure the correlation between two clusters or groups of variables, x and y, from the same data set is a recurring problem. Perhaps the most obvious measure is simply the unit weighted correlation, Ru.
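Consider the matrix M composed of four submatrices:
\(M = \begin{pmatrix} r_x & r_{xy} \\ r_{xy}' & r_y \end{pmatrix}\)
where r_x and r_y hold the correlations within each set and r_xy holds the correlations between the two sets.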
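The unit weighted correlation, Ru, is merely the sum of the between-set correlations divided by the square root of the product of the sums of the two within-set matrices: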
\(Ru =\frac{\Sigma{r_{xy}}}{\sqrt{\Sigma{r_x}\Sigma{r_y}} }\)
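A minimal base R sketch of this formula follows. The correlation matrix, the index vectors `x` and `y`, and the helper name `unit_weighted_r` are invented for illustration and are not part of any package.

```r
# Toy correlation matrix for four variables (values invented for illustration):
# the x set is variables 1-2, the y set is variables 3-4.
R <- matrix(c(1.0, 0.5, 0.3, 0.2,
              0.5, 1.0, 0.4, 0.3,
              0.3, 0.4, 1.0, 0.6,
              0.2, 0.3, 0.6, 1.0), nrow = 4, byrow = TRUE)
x <- 1:2
y <- 3:4

# Hypothetical helper: sum every element of each block, then apply the formula.
unit_weighted_r <- function(R, x, y) {
  sum(R[x, y]) / sqrt(sum(R[x, x]) * sum(R[y, y]))
}

Ru <- unit_weighted_r(R, x, y)
print(Ru)
```

Because every variable gets a weight of 1, this is the correlation between the two sum scores of the standardized variables.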
A second is the set correlation of Cohen (1982) (also found in lmCor), which is
\(Rset = 1- \frac{det(m)}{det(x)* det(y)}\)
where m is the full (x+y) by (x+y) matrix, x and y are the within-set submatrices, and det is the determinant.
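The determinant ratio can be sketched directly with base R's `det()`. The matrix values and the helper name `set_correlation` are assumptions made for this example only.

```r
# Toy correlation matrix (invented values): x = variables 1-2, y = variables 3-4.
R <- matrix(c(1.0, 0.5, 0.3, 0.2,
              0.5, 1.0, 0.4, 0.3,
              0.3, 0.4, 1.0, 0.6,
              0.2, 0.3, 0.6, 1.0), nrow = 4, byrow = TRUE)
x <- 1:2
y <- 3:4

# Hypothetical helper: 1 - det(full) / (det(x block) * det(y block)).
set_correlation <- function(R, x, y) {
  m <- R[c(x, y), c(x, y), drop = FALSE]
  1 - det(m) / (det(R[x, x, drop = FALSE]) * det(R[y, y, drop = FALSE]))
}

Rset <- set_correlation(R, x, y)

# With no correlation between the sets, det(m) equals det(x) * det(y),
# so the set correlation is 0.
R0 <- R
R0[x, y] <- 0
R0[y, x] <- 0
Rset0 <- set_correlation(R0, x, y)
print(c(Rset = Rset, Rset0 = Rset0))
```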
A third approach (the RV coefficient) was introduced by Escoufier (1970) and Robert and Escoufier (1976).
\(RV = \frac{tr(xy (xy)')}{\sqrt{tr(x x') * tr(y y')}}\)
where tr is the trace operator (the sum of the diagonal elements).
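As an illustration, the RV coefficient can be computed from the blocks of a correlation matrix in a few lines of base R. The matrix values and the helper names `tr` and `rv_coefficient` are hypothetical and chosen only for this sketch.

```r
# Toy correlation matrix (invented values): x = variables 1-2, y = variables 3-4.
R <- matrix(c(1.0, 0.5, 0.3, 0.2,
              0.5, 1.0, 0.4, 0.3,
              0.3, 0.4, 1.0, 0.6,
              0.2, 0.3, 0.6, 1.0), nrow = 4, byrow = TRUE)
x <- 1:2
y <- 3:4

# Trace: the sum of the diagonal elements.
tr <- function(A) sum(diag(A))

# Hypothetical helper implementing tr(xy xy') / sqrt(tr(x x') * tr(y y')).
rv_coefficient <- function(R, x, y) {
  Rxy <- R[x, y, drop = FALSE]
  Rx  <- R[x, x, drop = FALSE]
  Ry  <- R[y, y, drop = FALSE]
  tr(Rxy %*% t(Rxy)) / sqrt(tr(Rx %*% t(Rx)) * tr(Ry %*% t(Ry)))
}

RV <- rv_coefficient(R, x, y)
print(RV)
```

Since tr(A A') is the sum of the squared elements of A, the RV coefficient of a set with itself is 1.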
The analysis can be done from the raw data or from correlation or covariance matrices. If using raw data, just specify the x and y columns and the data file. If using correlation or covariance matrices, the xy matrix must be specified as well.
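The two routes give the same answer. The base R sketch below simulates raw data and computes the unit weighted correlation both from the correlation matrix and from sum scores of the standardized raw variables. The column names, sample size, and seed are arbitrary assumptions, and no package function is called.

```r
# Simulated raw data: four variables sharing a common factor g.
set.seed(42)
n <- 200
g <- rnorm(n)
dat <- data.frame(x1 = g + rnorm(n), x2 = g + rnorm(n),
                  y1 = g + rnorm(n), y2 = g + rnorm(n))
x <- c("x1", "x2")
y <- c("y1", "y2")

# Route 1: work from the correlation matrix and its x, y, and xy blocks.
R <- cor(dat)
Ru_from_matrix <- sum(R[x, y]) / sqrt(sum(R[x, x]) * sum(R[y, y]))

# Route 2: work from the raw data by standardizing each variable,
# summing within each set, and correlating the two sum scores.
z <- scale(dat)
Ru_from_raw <- cor(rowSums(z[, x]), rowSums(z[, y]))

print(all.equal(Ru_from_matrix, Ru_from_raw))
```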