PCAasymp: Testing for Subsphericity using the Covariance Matrix or Tyler's Shape Matrix

Description

Assuming an elliptical model, the function tests whether the last p-k eigenvalues of a scatter matrix are equal, so that the k interesting components are those with larger variance. The scatter matrix can be either the regular covariance matrix or Tyler's shape matrix.

Usage

PCAasymp(X, k, scatter = "cov", ...)

Arguments

X

a numeric data matrix with p>1 columns.

k

the number of eigenvalues larger than the equal ones. Can be between 0 and p-2.

scatter

the scatter matrix to be used. Can be "cov" or "tyler". For "cov" the regular covariance matrix is computed and for "tyler" the function HR.Mest is used to compute Tyler's shape matrix.

…

arguments passed on to HR.Mest if scatter = "tyler".

Value

A list of class ictest inheriting from class htest containing:

statistic

the value of the test statistic.

p.value

the p-value of the test.

parameter

the degrees of freedom of the test.

method

character string indicating which test was performed.

data.name

character string giving the name of the data.

alternative

character string specifying the alternative hypothesis.

the number of larger eigenvalues used in the testing problem.

the transformation matrix to the principal components.

data matrix with the centered principal components.

the underlying eigenvalues.

the location of the data, which was subtracted before calculating the principal components.

SCATTER

the computed scatter matrix.

sigma1

the asymptotic constant needed for the asymptotic test.

Details

The function assumes an elliptical model and tests whether the last $p-k$ eigenvalues of PCA are equal. PCA can here be based either on the regular covariance matrix or on Tyler's shape matrix.

For a sample of size $n$, the test statistic is $$T = \frac{n}{2 \bar{d}^2 \sigma_1} \sum_{i=k+1}^p (d_i - \bar{d})^2,$$ where $d_i$ is the $i$th largest PCA eigenvalue and $\bar{d}$ is the mean of the last $p-k$ PCA eigenvalues.

For the regular covariance matrix, the constant $\sigma_1$ is estimated from the data, whereas for Tyler's shape matrix it is simply a function of the dimension of the data.

The test statistic has a limiting chi-square distribution with $(p-k-1)(p-k+2)/2$ degrees of freedom.
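
As a quick worked instance of the degrees-of-freedom formula, take the dimensions used in the Examples section ($p = 5$ columns, with $k = 2$ and $k = 1$): $$\frac{(5-2-1)(5-2+2)}{2} = \frac{2 \cdot 5}{2} = 5, \qquad \frac{(5-1-1)(5-1+2)}{2} = \frac{3 \cdot 6}{2} = 9.$$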

Note that the regular covariance matrix is here divided by $n$ and not by $n-1$.
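
The computation described above can be sketched in base R for the covariance case. This is only an illustration, not the implementation used by PCAasymp: the helper name `subsphericity_stat` is hypothetical, and `sigma1` is supplied by the caller rather than estimated from the data. The default `sigma1 = 1` is an assumption that corresponds to Gaussian data, so results may differ from those of PCAasymp.

```r
# Hypothetical helper (not part of ICtest): evaluates the statistic T for the
# covariance-based case with a user-supplied sigma1.
subsphericity_stat <- function(X, k, sigma1 = 1) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  stopifnot(p > 1, k >= 0, k <= p - 2)

  # Covariance matrix divided by n instead of n - 1
  S <- cov(X) * (n - 1) / n

  # Eigenvalues of a symmetric matrix are returned in decreasing order
  d <- eigen(S, symmetric = TRUE, only.values = TRUE)$values

  # Last p - k eigenvalues and their mean
  d_last <- d[(k + 1):p]
  d_bar <- mean(d_last)

  # Test statistic and degrees of freedom as given in the Details
  stat <- n / (2 * d_bar^2 * sigma1) * sum((d_last - d_bar)^2)
  df <- (p - k - 1) * (p - k + 2) / 2

  # Upper-tail probability of the limiting chi-square distribution
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)

  list(statistic = stat, parameter = df, p.value = p_value)
}

set.seed(1)
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
res <- subsphericity_stat(X, k = 2)
res
```

Passing `sigma1` as an argument keeps the sketch free of any guess about how the package estimates the constant; a value obtained elsewhere can be plugged in directly.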

References

Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.

Examples

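library(ICtest)

# Simulate n observations on p = 5 variables; the first two have larger variance
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))

# Test based on the regular covariance matrix with k = 2
TestCov <- PCAasymp(X, k = 2)
TestCov

# Test based on Tyler's shape matrix with k = 1
TestTyler <- PCAasymp(X, k = 1, scatter = "tyler")
TestTyler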
