distatis: Implements the DISTATIS method, a 3-way generalization of metric multidimensional scaling (a.k.a. classical MDS or principal coordinate analysis).
distatis(
LeCube2Distance,
Norm = "MFA",
Distance = TRUE,
double_centering = TRUE,
RV = TRUE,
nfact2keep = 3,
compact = FALSE
)
distatis sends back the results via two lists: res.Cmat and res.Splus.
Items marked with a * are the only ones sent back when using the compact = TRUE option.
Results for the between distance matrices analysis.
res.Cmat$C
The \(K\times K\) C matrix of scalar products (or \(R_V\) coefficients) between distance matrices.
res.Cmat$vectors
The eigenvectors of the C matrix
res.Cmat$alpha
* The \(\alpha\) weights
res.Cmat$value
The eigenvalues of the C matrix
res.Cmat$G
The factor scores for the C matrix
res.Cmat$ctr
The contributions for res.Cmat$G.
res.Cmat$cos2
The squared cosines for res.Cmat$G.
res.Cmat$d2
The squared Euclidean distance for res.Cmat$G.
Results for the between observation analysis.
res.Splus$SCP
An \(I\times I\times K\) array containing the (normalized if needed) cross-product matrices corresponding to the distance matrices.
res.Splus$Splus
* The compromise (optimal linear combination of the SCPs).
res.Splus$eigValues
* The eigenvalues of the compromise.
res.Splus$eigVectors
* The eigenvectors of the compromise.
res.Splus$tau
* The percentage of explained inertia of the eigenvalues.
res.Splus$ProjectionMatrix
The projection matrix used to compute factor scores and partial factor scores.
res.Splus$F
The factor scores for the observations.
res.Splus$ctr
The contributions for res.Splus$F.
res.Splus$cos2
The squared cosines for res.Splus$F.
res.Splus$d2
The squared Euclidean distance for res.Splus$F.
res.Splus$PartialF
An \(I \times\) nfact2keep \(\times K\) array containing the partial factor scores for the distance matrices.
LeCube2Distance
An "observations \(\times\) observations \(\times\) distance matrices" array of dimensions \(I\times I \times K\). Each of the \(K\) "slices" is an \(I\times I\) square distance (or covariance) matrix describing the \(I\) observations.
Norm
Type of normalization used for each cross-product matrix derived from the distance (or covariance) matrices. Current options are NONE (do nothing), SUMPCA (normalize by the total inertia), MFA (default), which normalizes each matrix so that its first eigenvalue is equal to one, or NUCLEAR (normalize by the nuclear norm, i.e., the sum of the square roots of the eigenvalues).
Distance
If TRUE (default), the matrices are distance matrices; if FALSE, the matrices are treated as positive semi-definite matrices (e.g., scalar products, covariance, or correlation matrices).
double_centering
If TRUE (default), the matrices are double-centered (this should always be used for distances); if FALSE, the matrices are not double-centered (note that these matrices should then be positive semi-definite matrices such as, for example, covariance matrices).
RV
If TRUE (default), the \(R_V\) coefficient is used to compute the \(\alpha\) weights; if FALSE, the matrix scalar product is used.
nfact2keep
(default: 3) Number of factors to keep for the computation of the factor scores of the observations.
compact
If FALSE (default), distatis provides detailed output; if TRUE, distatis sends back only the \(\alpha\) weights (this option is used to make the bootstrap routine BootFromCompromise more computationally efficient).
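As a rough illustration of what the distance, double-centering, and normalization options describe, the base-R sketch below turns one distance matrix into a cross-product matrix and applies two of the normalizations. It is written from the argument descriptions only and is not the package's internal code. The toy points, the use of squared Euclidean distances, the reading of "total inertia" as the sum of the eigenvalues, and the names D, S, S_sumpca, and S_mfa are all assumptions made for the example.

```r
set.seed(1)
n_obs <- 5
pts <- matrix(rnorm(n_obs * 2), nrow = n_obs)  # hypothetical observations
D <- as.matrix(dist(pts))^2                    # squared Euclidean distances (assumed)

# Double centering: S = -1/2 * Xi %*% D %*% Xi, with Xi = Id - (1/n) 1 1'
Xi <- diag(n_obs) - matrix(1 / n_obs, n_obs, n_obs)
S <- -0.5 * Xi %*% D %*% Xi
S <- (S + t(S)) / 2                            # remove rounding asymmetry

lambda <- eigen(S, symmetric = TRUE, only.values = TRUE)$values

# SUMPCA-style: divide by the total inertia (read here as the sum of the eigenvalues)
S_sumpca <- S / sum(lambda)
# MFA-style: divide by the first eigenvalue, so the largest eigenvalue becomes 1
S_mfa <- S / lambda[1]
```

After this step the rows of S sum to zero, the trace of S_sumpca is 1, and the first eigenvalue of S_mfa is 1, so matrices measured on very different scales become comparable before they are combined.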
Hervé Abdi
See also: GraphDistatisAll, GraphDistatisBoot, GraphDistatisCompromise, GraphDistatisPartial, GraphDistatisRv, DistanceFromSort, BootFactorScores, BootFromCompromise
distatis takes as input a set of \(K\) distance matrices (or positive semi-definite matrices such as scalar products, covariance, or correlation matrices) describing a set of \(I\) observations. From this set of matrices, distatis computes: (1) a set of factor scores that describes the similarity structure of the \(K\) distance matrices (e.g., which distance matrices describe the observations in the same way, and which differ from each other); (2) a set of factor scores (called the compromise factor scores) that best describes the similarity structure of the \(I\) observations; and (3) \(K\) sets of partial factor scores that show how each individual distance matrix "sees" the compromise space.
distatis computes the compromise as an optimum linear combination of the cross-product matrices associated with each distance (or positive semi-definite) matrix. distatis can also be applied to a set of scalar products, covariance, or correlation matrices.
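The three outputs described above can be sketched in base R. This is a minimal sketch that follows the published description of DISTATIS (Abdi et al., 2012), not the package's own implementation. It assumes squared Euclidean distances, MFA normalization, and \(R_V\)-based weights; the simulated data and all variable names (cube, scp, alpha, Splus, Fscores, partialF) are hypothetical.

```r
set.seed(42)
n_obs <- 6; n_tab <- 4; nf <- 2

# Hypothetical data: n_tab noisy views of the same n_obs points in 2D,
# stored as squared Euclidean distance matrices.
base_pts <- matrix(rnorm(n_obs * 2), n_obs, 2)
cube <- array(0, dim = c(n_obs, n_obs, n_tab))
for (k in seq_len(n_tab)) {
  pts <- base_pts + matrix(rnorm(n_obs * 2, sd = 0.3), n_obs, 2)
  cube[, , k] <- as.matrix(dist(pts))^2
}

# 1. Distances -> double-centered cross-products, normalized MFA-style.
Xi <- diag(n_obs) - matrix(1 / n_obs, n_obs, n_obs)
scp <- array(0, dim = dim(cube))
for (k in seq_len(n_tab)) {
  S <- -0.5 * Xi %*% cube[, , k] %*% Xi
  S <- (S + t(S)) / 2
  scp[, , k] <- S / eigen(S, symmetric = TRUE, only.values = TRUE)$values[1]
}

# 2. RV coefficients between the cross-product matrices.
X <- matrix(scp, nrow = n_obs^2, ncol = n_tab)  # column k = vec(S_k)
sp <- crossprod(X)                              # entries are trace(S_i S_j)
C <- sp / sqrt(outer(diag(sp), diag(sp)))

# 3. alpha weights: first eigenvector of C, rescaled to sum to one.
u1 <- eigen(C, symmetric = TRUE)$vectors[, 1]
alpha <- u1 / sum(u1)

# 4. Compromise: alpha-weighted sum of the cross-product matrices.
Splus <- matrix(X %*% alpha, n_obs, n_obs)

# 5. Compromise factor scores from the eigendecomposition of Splus.
e <- eigen(Splus, symmetric = TRUE)
V <- e$vectors[, 1:nf]
lam <- e$values[1:nf]
Fscores <- V %*% diag(sqrt(lam))
tau <- 100 * e$values / sum(e$values)           # percentage of inertia

# 6. Partial factor scores: project each cross-product matrix.
P <- V %*% diag(1 / sqrt(lam))
partialF <- array(0, dim = c(n_obs, nf, n_tab))
for (k in seq_len(n_tab)) partialF[, , k] <- scp[, , k] %*% P
```

In this sketch the alpha-weighted average of the partial factor scores reproduces Fscores, which is why the compromise can be read as a barycenter of the individual matrices' views.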
DISTATIS is part of the STATIS family. It is often used to analyze the results of sorting tasks.
Abdi, H., Valentin, D., O'Toole, A.J., & Edelman, B. (2005). DISTATIS: The analysis of multiple distance matrices. Proceedings of the IEEE Computer Society: International Conference on Computer Vision and Pattern Recognition. (San Diego, CA, USA). pp. 42--47.
Abdi, H., Valentin, D., Chollet, S., & Chrea, C. (2007). Analyzing assessors and products in sorting tasks: DISTATIS, theory and applications. Food Quality and Preference, 18, 627--640.
Abdi, H., Dunlop, J.P., & Williams, L.J. (2009). How to compute reliability estimates and display confidence and tolerance intervals for pattern classifiers using the Bootstrap and 3-way multidimensional scaling (DISTATIS). NeuroImage, 45, 89--95.
Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012). STATIS and DISTATIS: Optimum multi-table principal component analysis and three way metric multidimensional scaling. Wiley Interdisciplinary Reviews: Computational Statistics, 4, 124--167.
The \(R_V\) coefficient is described in
Abdi, H. (2007). RV coefficient and congruence coefficient. In N.J. Salkind (Ed.): Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage. pp. 849--853.
Abdi, H. (2010). Congruence: Congruence coefficient, RV coefficient, and Mantel Coefficient. In N.J. Salkind, D.M. Dougherty, & B. Frey (Eds.): Encyclopedia of Research Design. Thousand Oaks (CA): Sage. pp. 222--229.
(These papers are available from https://personal.utdallas.edu/~herve/)
# 1. Load the DistAlgo data set
#    (available from the DistatisR package).
library(DistatisR)
data(DistAlgo)
# DistAlgo is a 6*6*4 array (face*face*algorithm)
#------------------------------------------------------------------
# 2. Call the DISTATIS routine with the array
#    of distances (DistAlgo) as parameter
DistatisAlgo <- distatis(DistAlgo)
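To see what the call returns, one option is to list the components of the result instead of assuming their names, since the exact names of the returned lists may differ between package versions. The sketch below assumes DistatisR is installed and skips itself otherwise; the object name res is hypothetical, and nfact2keep = 2 is just an illustrative non-default value.

```r
# Runs only if DistatisR is installed; otherwise the block is skipped.
if (requireNamespace("DistatisR", quietly = TRUE)) {
  data("DistAlgo", package = "DistatisR")
  res <- DistatisR::distatis(DistAlgo, nfact2keep = 2)
  # Top-level components first, then one level down.
  print(names(res))
  str(res, max.level = 2, give.attr = FALSE)
}
```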