The function tests, assuming an elliptical model, that the last p-k eigenvalues of a scatter matrix are equal and that the k interesting components are those with a larger variance.
The scatter matrices that can be used here are the regular covariance matrix and Tyler's shape matrix.
PCAasymp(X, k, scatter = "cov", ...)
a numeric data matrix with p>1 columns.
the number of eigenvalues larger than the equal ones. Can be between 0 and p-2.
the scatter matrix to be used. Can be "cov" or "tyler". For "cov" the regular covariance matrix is computed and for "tyler" the function HR.Mest is used to compute Tyler's shape matrix.
arguments passed on to HR.Mest if scatter = "tyler".
A list of class ictest inheriting from class htest containing:
the value of the test statistic.
the p-value of the test.
the degrees of freedom of the test.
character string which test was performed.
character string giving the name of the data.
character string specifying the alternative hypothesis.
the number of larger eigenvalues used in the testing problem.
the transformation matrix to the principal components.
data matrix with the centered principal components.
the underlying eigenvalues.
the location of the data which was subtracted before calculating the principal components.
the computed scatter matrix.
the asymptotic constant needed for the asymptotic test.
The function assumes an elliptical model and tests whether the last \(p-k\) eigenvalues of PCA are equal. PCA can here be based either on the regular covariance matrix or on Tyler's shape matrix.
For a sample of size \(n\), the test statistic is $$T = n / (2 \bar{d}^2 \sigma_1) \sum_{i=k+1}^p (d_i - \bar{d})^2,$$ where \(d_i\) denotes the \(i\)th PCA eigenvalue and \(\bar{d}\) is the mean of the last \(p-k\) PCA eigenvalues.
For the regular covariance matrix, the constant \(\sigma_1\) is estimated from the data, whereas for Tyler's shape matrix it is simply a function of the dimension of the data.
The test statistic has a limiting chi-square distribution with \((p-k-1)(p-k+2)/2\) degrees of freedom.
Note that the regular covariance matrix is here divided by \(n\) and not by \(n-1\).
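As an illustration of the formula, the following sketch computes the statistic by hand in base R for the covariance case. It is not the package's implementation. The seed, the variable names and, in particular, the moment estimator used for \(\sigma_1\) (mean fourth power of the Mahalanobis distances divided by \(p(p+2)\)) are assumptions made for this example, so the result may differ from what PCAasymp reports.

```r
# Illustrative hand computation of T for scatter = "cov" (a sketch, not the package code).
set.seed(1)                              # assumption: fixed seed for reproducibility
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
p <- ncol(X)
k <- 2

# Covariance matrix divided by n (not n - 1), as stated in the note above.
mu <- colMeans(X)
S  <- crossprod(sweep(X, 2, mu)) / n

# Eigenvalues in decreasing order; keep the last p - k of them.
d      <- eigen(S, symmetric = TRUE)$values
d_tail <- d[(k + 1):p]
d_bar  <- mean(d_tail)

# ASSUMPTION: moment estimate of sigma_1 from squared Mahalanobis distances.
# The estimator used internally by the package may differ.
r2     <- mahalanobis(X, center = mu, cov = S)
sigma1 <- mean(r2^2) / (p * (p + 2))

# Test statistic, degrees of freedom and asymptotic p-value.
T_stat <- n / (2 * d_bar^2 * sigma1) * sum((d_tail - d_bar)^2)
df     <- (p - k - 1) * (p - k + 2) / 2
p_val  <- pchisq(T_stat, df = df, lower.tail = FALSE)
```

With \(p = 5\) and \(k = 2\), the degrees of freedom are \((5-2-1)(5-2+2)/2 = 5\).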
Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.
# NOT RUN {
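library(ICtest)
# Five independent normal components: two with larger variance, three with equal variance
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
# Test based on the regular covariance matrix, with k = 2 larger eigenvalues
TestCov <- PCAasymp(X, k = 2)
TestCov
# Test based on Tyler's shape matrix, with k = 1 larger eigenvalue
TestTyler <- PCAasymp(X, k = 1, scatter = "tyler")
TestTyler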
# }