MatrixCorrelation (version 0.10.0)

SMI: Similarity of Matrices Index

Description

A similarity index for comparing coupled data matrices.

Usage

SMI(
  X1,
  X2,
  ncomp1 = Rank(X1) - 1,
  ncomp2 = Rank(X2) - 1,
  projection = "Orthogonal",
  Scores1 = NULL,
  Scores2 = NULL,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

Arguments

X1

first matrix to be compared (data.frames are also accepted).

X2

second matrix to be compared (data.frames are also accepted).

ncomp1

maximum number of subspace components from the first matrix.

ncomp2

maximum number of subspace components from the second matrix.

projection

type of projection to apply: "Orthogonal" (default) or "Procrustes".

Scores1

user supplied score-matrix to replace singular value decomposition of first matrix.

Scores2

user supplied score-matrix to replace singular value decomposition of second matrix.

impute

logical; if TRUE, PCA-based imputation is applied to X1/X2 (default FALSE).

impute_par

named list of imputation parameters (max_iter, tol), used when X1/X2 contain NAs.

Value

A matrix containing all combinations of components. It has class "SMI", with associated print, plot and summary methods.

Details

SMI is computed in two steps. First, stable subspaces are extracted from each matrix using Principal Component Analysis (or some other method), yielding two orthonormal bases. Second, these bases are compared using either Orthogonal Projection (OP, i.e. ordinary least squares) or Procrustes Rotation (PR). The resulting similarity measure can be adapted to various data sets and contexts, and it supports exploratory plotting and permutation-based testing of matrix subspace equality.
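
The two steps can be illustrated with a minimal base-R sketch. This is not the package's implementation: smi_sketch is a hypothetical helper, it handles only one pair of subspace dimensions (p, q), and the two formulas (squared Frobenius norm of the basis cross-product divided by min(p, q) for OP, squared mean singular value for PR) are an assumption based on the cited paper.

```r
# Hypothetical helper, not part of MatrixCorrelation: a sketch of the
# two-step idea for a single pair of subspace dimensions (p, q).
smi_sketch <- function(X1, X2, p, q,
                       projection = c("Orthogonal", "Procrustes")) {
  projection <- match.arg(projection)

  # Step 1: orthonormal bases, taken here as the leading left singular
  # vectors of the column-centred matrices (normalised PCA scores).
  T1 <- svd(scale(X1, scale = FALSE))$u[, seq_len(p), drop = FALSE]
  T2 <- svd(scale(X2, scale = FALSE))$u[, seq_len(q), drop = FALSE]

  # Step 2: compare the bases through their p x q cross-product.
  C <- crossprod(T1, T2)
  if (projection == "Orthogonal") {
    sum(C^2) / min(p, q)   # assumed OP form: ||T1'T2||_F^2 / min(p, q)
  } else {
    mean(svd(C)$d)^2       # assumed PR form: squared mean singular value
  }
}

set.seed(1)
X1 <- matrix(rnorm(100 * 30), 100, 30)
X2 <- X1 + matrix(rnorm(100 * 30, sd = 0.1), 100, 30)

smi_sketch(X1, X1, 5, 5)                # identical subspaces
smi_sketch(X1, X2, 5, 5)                # slightly perturbed copy
smi_sketch(X1, X2, 5, 5, "Procrustes")
```

Under these assumed formulas, both variants lie between 0 and 1, and the PR value can never exceed the OP value because the squared mean of the singular values is at most the mean of their squares.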

References

Ulf Geir Indahl, Tormod Næs, Kristian Hovde Liland; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

See Also

plot.SMI (print.SMI/summary.SMI), RV (RV2/RVadj), r1 (r2/r3/r4/GCD), Rozeboom, Coxhead, allCorrelations (matrix correlation comparison), PCAcv (cross-validated PCA), PCAimpute (PCA based imputation).

Examples

# NOT RUN {
library(MatrixCorrelation)

# Simulation: X2 is X1 with its third singular component removed
X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

(smi <- SMI(X1,X2,5,5))
plot(smi, B = 1000 ) # default B = 10000

# Sensory analysis
data(candy)
plot( SMI(candy$Panel1, candy$Panel2, 3,3, projection = "Procrustes"),
    frame = c(2,2), B = 1000, x1lab = "Panel1", x2lab = "Panel2" ) # default B = 10000

# Missing data (100 values missing completely at random in each matrix)
# sample() draws distinct indices, so exactly 100 entries are set to NA
X1[sample(length(X1), 100)] <- NA
X2[sample(length(X2), 100)] <- NA
(smi <- SMI(X1,X2,5,5, impute = TRUE))

# }
