ccaProj: (Robust) CCA via projections through the data points

Description

Perform canonical correlation analysis via projection pursuit based on projections through the data points, with a focus on robust and nonparametric methods.

Usage

ccaProj(
  x,
  y,
  k = 1,
  method = c("spearman", "kendall", "quadrant", "M", "pearson"),
  control = list(...),
  standardize = TRUE,
  useL1Median = TRUE,
  fallback = FALSE,
  ...
)

CCAproj(
  x,
  y,
  k = 1,
  method = c("spearman", "kendall", "quadrant", "M", "pearson"),
  standardize = TRUE,
  useL1Median = TRUE,
  fallback = FALSE,
  ...
)

Value

An object of class "cca" with the following components:

cor: a numeric vector giving the canonical correlation measures.
A: a numeric matrix in which the columns contain the canonical vectors for x.
B: a numeric matrix in which the columns contain the canonical vectors for y.
centerX: a numeric vector giving the center estimates used in standardization of x.
centerY: a numeric vector giving the center estimates used in standardization of y.
scaleX: a numeric vector giving the scale estimates used in standardization of x.
scaleY: a numeric vector giving the scale estimates used in standardization of y.
call: the matched function call.

Arguments

x, y: each can be a numeric vector, matrix or data frame.
k: an integer giving the number of canonical variables to compute.
method: a character string specifying the correlation functional to maximize. Possible values are "spearman" for the Spearman correlation, "kendall" for the Kendall correlation, "quadrant" for the quadrant correlation, "M" for the correlation based on a bivariate M-estimator of location and scatter with a Huber loss function, or "pearson" for the classical Pearson correlation (see corFunctions).
control: a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument.
standardize: a logical indicating whether the data should be (robustly) standardized.
useL1Median: a logical indicating whether the \(L_{1}\) medians should be used as the centers of the data sets in standardization (defaults to TRUE). If FALSE, the columnwise centers are used instead (columnwise means if method is "pearson" and columnwise medians otherwise).
fallback: a logical indicating whether a fallback mode for robust standardization should be used. If a correlation functional other than the Pearson correlation is maximized, the first attempt at standardizing the data is via median and MAD. In the fallback mode, variables whose MADs are zero (e.g., dummy variables) are instead standardized via mean and standard deviation. Note that if the Pearson correlation is maximized, standardization is always done via mean and standard deviation.
...: additional arguments to be passed to the specified correlation functional. Currently, this is only relevant for the M-estimator. For Spearman, Kendall and quadrant correlation, consistency at the normal model is always forced.
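
The standardization described for standardize and fallback can be illustrated with a short base R sketch. It is not the package's implementation: the helper name standardize_fallback is hypothetical, and it uses columnwise medians as centers (the useL1Median = FALSE case) rather than the L1 median.

```r
# Hypothetical helper, for illustration only: median/MAD standardization
# with a mean/SD fallback for columns whose MAD is zero.
standardize_fallback <- function(x) {
  x <- as.matrix(x)
  center <- apply(x, 2, median)   # columnwise medians (assumption)
  scale <- apply(x, 2, mad)       # columnwise MADs
  zero <- scale == 0              # e.g., dummy variables
  if (any(zero)) {
    center[zero] <- colMeans(x[, zero, drop = FALSE])
    scale[zero] <- apply(x[, zero, drop = FALSE], 2, sd)
  }
  xs <- sweep(sweep(x, 2, center, "-"), 2, scale, "/")
  list(x = xs, center = center, scale = scale)
}

# Toy data: one numeric column with an outlier and one dummy variable
x <- cbind(num = c(1, 2, 3, 4, 100, 6, 7, 8, 9, 10),
           dummy = c(rep(0, 8), 1, 1))
std <- standardize_fallback(x)
std$center
std$scale
```

Here the dummy column has a MAD of zero, so it is centered by its mean and scaled by its standard deviation, while the numeric column keeps the robust median and MAD.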

Author

Andreas Alfons

Details

First, the candidate projection directions are defined for each data set as the directions from the respective center through each data point. The algorithm then scans all \(n^2\) possible combinations of candidate directions for the maximum correlation, where \(n\) is the number of observations.
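
The search over candidate directions can be sketched in base R as follows. This is a minimal brute-force illustration under stated assumptions, not the package's code: the data are simulated, columnwise medians stand in for the centers, and the Spearman correlation is the functional being maximized.

```r
set.seed(1)
n <- 30
x <- matrix(rnorm(n * 2), n, 2)
y <- cbind(x[, 1] + rnorm(n, sd = 0.5), rnorm(n))

# Center each data set (columnwise medians are an assumption here)
xc <- sweep(x, 2, apply(x, 2, median), "-")
yc <- sweep(y, 2, apply(y, 2, median), "-")

# Candidate directions: unit vectors from the center through each point
unit_rows <- function(m) {
  norms <- sqrt(rowSums(m^2))
  keep <- norms > 0
  m[keep, , drop = FALSE] / norms[keep]
}
dirX <- unit_rows(xc)
dirY <- unit_rows(yc)

# Project the data onto every candidate direction (one column each)
px <- xc %*% t(dirX)
py <- yc %*% t(dirY)

# Correlations for all pairs of candidate directions (up to n^2 pairs)
r <- cor(px, py, method = "spearman")
idx <- arrayInd(which.max(abs(r)), dim(r))
a <- dirX[idx[1], ]
b <- dirY[idx[2], ]
if (r[idx[1], idx[2]] < 0) b <- -b   # make the correlation positive
maxCor <- abs(r[idx[1], idx[2]])
maxCor
```

The vectors a and b play the role of the first pair of canonical vectors in this sketch, and maxCor the corresponding correlation measure.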

For higher-order canonical correlations, the data are first transformed into suitable subspaces. The projection-based search is then applied to the reduced data, and the results are back-transformed to the original space.
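
One way to picture the subspace step is to restrict the data to the orthogonal complement of a direction that has already been found. The sketch below assumes that construction and builds the basis with a QR decomposition; the vectors a and ar are hypothetical, and the package may construct its subspaces differently.

```r
set.seed(2)
n <- 20
p <- 3
x <- matrix(rnorm(n * p), n, p)

# Hypothetical first direction of unit length
a <- c(3, 4, 0) / 5

# Orthonormal basis of R^p whose first column spans the same line as a;
# the remaining p - 1 columns span the orthogonal complement of a
Q <- qr.Q(qr(matrix(a, ncol = 1)), complete = TRUE)
Q2 <- Q[, -1, drop = FALSE]

# Reduced data: coordinates of the observations in the complement
xr <- x %*% Q2            # n x (p - 1)

# A unit-length direction found in the reduced space (hypothetical)
ar <- c(1, 1) / sqrt(2)

# Back-transform to the original space
a2 <- drop(Q2 %*% ar)
sum(a * a2)               # numerically zero: a2 is orthogonal to a
```

Because Q2 has orthonormal columns, the back-transformed direction keeps unit length and is orthogonal to the earlier direction by construction.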

Examples

library("ccaPP")
data("diabetes")
x <- diabetes$x
y <- diabetes$y

## Spearman correlation
ccaProj(x, y, method = "spearman")

## Pearson correlation
ccaProj(x, y, method = "pearson")
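
As a further illustration, the following sketch requests two canonical variables and inspects the components listed under Value. It assumes the ccaPP package is installed and is skipped otherwise; the object name fit is arbitrary.

```r
if (requireNamespace("ccaPP", quietly = TRUE)) {
  library("ccaPP")
  data("diabetes")
  x <- diabetes$x
  y <- diabetes$y

  ## Two canonical variables based on the Spearman correlation
  fit <- ccaProj(x, y, k = 2, method = "spearman")

  fit$cor       # canonical correlation measures
  fit$A         # canonical vectors for x (one per column)
  fit$B         # canonical vectors for y (one per column)
  fit$centerX   # center estimates used in standardizing x
}
```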