ICtest (version 0.3-5)

ICSboot: Bootstrap-based Testing for the Number of Gaussian Components in NGCA Using Two Scatter Matrices

Description

In independent component analysis (ICA), gaussian components are considered uninteresting. The function uses bootstrap tests, based on ICS with any combination of two scatter matrices, to decide whether there are p-k gaussian components, where p is the dimension of the data. The function offers two different bootstrapping strategies.

Usage

ICSboot(X, k, S1=cov, S2=cov4, S1args=NULL, S2args=NULL, n.boot = 200, s.boot = "B1")

Arguments

X

a numeric data matrix with p>1 columns.

k

the number of non-gaussian components under the null.

S1

name of the first scatter matrix function. The function must return only a matrix. Default is cov.

S2

name of the second scatter matrix function. The function must return only a matrix. Default is cov4.

S1args

list with optional additional arguments for S1.

S2args

list with optional additional arguments for S2.

n.boot

number of bootstrapping samples.

s.boot

bootstrapping strategy to be used. Possible values are "B1", "B2". See details for further information.
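
Because S1 and S2 must return only a matrix, scatter estimators that return a list need a thin wrapper. The sketch below wraps base R's cov.wt, which returns a list, so that only its cov element is passed on. The name my_cov_wt is hypothetical, and forwarding the extra arguments with do.call is an assumption made for illustration, not a description of the package internals.

```r
# Hypothetical wrapper: cov.wt() returns a list, so keep only the matrix.
my_cov_wt <- function(X, ...) cov.wt(X, ...)$cov

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)

# Extra arguments are collected in a named list, as for S1args / S2args.
S1args <- list(method = "ML")
S1_hat <- do.call(my_cov_wt, c(list(X), S1args))
is.matrix(S1_hat)  # the scatter function has to return a plain matrix

# With ICtest installed, the wrapper and the list could be passed on like this.
if (require("ICtest")) {
  res <- ICSboot(X, k = 1, S1 = my_cov_wt, S1args = S1args, n.boot = 20)
}
```

The existing examples for this function use the same wrapper pattern for HR.Mest and tM from ICSNP.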

Value

A list of class ictest inheriting from class htest containing:

statistic

the value of the test statistic.

p.value

the p-value of the test.

parameter

the number of bootstrapping samples used to obtain the p-value.

method

character string indicating which test was performed and which scatters were used.

data.name

character string giving the name of the data.

alternative

character string specifying the alternative hypothesis.

k

the number of non-gaussian components used in the testing problem.

W

the transformation matrix to the independent components. Also known as unmixing matrix.

S

data matrix with the centered independent components.

D

the underlying eigenvalues.

MU

the location of the data which was subtracted before calculating the independent components.

s.boot

character string indicating which bootstrapping strategy was used.

Details

While FOBIasymp and FOBIboot always use the two scatters cov and cov4, this function can be used with any two scatter functions. In that case, however, the value of the Gaussian eigenvalues is in general not known and depends on the scatter functions used. The test therefore uses as its test statistic the variance of the p-k successive eigenvalues with the smallest variance. As a consequence, the default here might differ from FOBIasymp and FOBIboot.

Given eigenvalues \(d_1,...,d_p\) the function thus orders the components in descending order according to the "variance" criterion.

Under the null it is then assumed that the first k interesting components are mutually independent and non-normal and the last p-k components are gaussian.
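
As a rough illustration of the "variance" criterion, the sketch below reads it as follows: among all windows of p-k consecutive eigenvalues, pick the window with the smallest sample variance, treat those components as the gaussian ones, and place them last. This reading, the helper name min_var_window and the example eigenvalues are assumptions for illustration; the package may implement the ordering differently.

```r
# Sketch: find the window of p-k consecutive eigenvalues with the smallest
# sample variance (an assumed reading of the "variance" criterion).
min_var_window <- function(d, k) {
  p <- length(d)
  m <- p - k                       # number of presumed gaussian components
  stopifnot(k >= 0, m >= 2)        # var() needs at least two values
  starts <- seq_len(p - m + 1)     # possible window start positions
  vars <- vapply(starts, function(s) var(d[s:(s + m - 1)]), numeric(1))
  best <- starts[which.min(vars)]
  gauss <- best:(best + m - 1)     # indices of the selected window
  list(statistic = min(vars),
       order = c(setdiff(seq_len(p), gauss), gauss))  # gaussian block last
}

d <- c(3.1, 1.9, 1.02, 1.00, 0.98, 0.4)  # made-up eigenvalues
res_window <- min_var_window(d, k = 3)
res_window$order
```

In this toy input the three eigenvalues close to 1 form the selected window, so the remaining components are moved to the front.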

Let \(d_1,...,d_p\) be the ordered eigenvalues, \(W\) the correspondingly ordered unmixing matrix, and \(s_i = W (x_i-MU)\) the corresponding source vectors, which form the source matrix \(S\). The matrix \(S\) can be decomposed into \(S_1\) and \(S_2\), where \(S_1\) is the matrix with the \(k\) non-gaussian components and \(S_2\) the matrix with the gaussian components (under the null).
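
The notation can be made concrete with a small base R sketch of a two-scatter decomposition. It assumes that \(W\) satisfies \(W S_1(X) W^T = I\) and \(W S_2(X) W^T = D\), that MU is the vector of column means, and that the fourth-moment scatter weights each observation by its squared Mahalanobis distance. The helpers cov4_sketch and ics_sketch are hypothetical stand-ins, not the functions used by the package, and the components are left in the decreasing eigenvalue order returned by eigen rather than reordered by the "variance" criterion.

```r
# Fourth-moment scatter written out in base R (assumed form).
cov4_sketch <- function(X) {
  X <- as.matrix(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))
  r2 <- mahalanobis(X, colMeans(X), cov(X))   # squared Mahalanobis distances
  crossprod(Xc * sqrt(r2)) / (nrow(X) * (p + 2))
}

# Simultaneous diagonalisation of two scatter matrices.
ics_sketch <- function(X, S1 = cov, S2 = cov4_sketch) {
  X <- as.matrix(X)
  MU <- colMeans(X)
  e1 <- eigen(S1(X), symmetric = TRUE)
  S1_inv_sqrt <- e1$vectors %*% diag(1 / sqrt(e1$values)) %*% t(e1$vectors)
  e2 <- eigen(S1_inv_sqrt %*% S2(X) %*% S1_inv_sqrt, symmetric = TRUE)
  W <- t(e2$vectors) %*% S1_inv_sqrt          # rows are unmixing directions
  S <- sweep(X, 2, MU) %*% t(W)               # row i is s_i = W (x_i - MU)
  list(W = W, D = e2$values, S = S, MU = MU)
}

set.seed(1)
n <- 500
X <- cbind(runif(n), rexp(n), rnorm(n), rnorm(n)) %*% matrix(rnorm(16), 4)
fit <- ics_sketch(X)

k <- 2
S_1 <- fit$S[, seq_len(k), drop = FALSE]      # first k components
S_2 <- fit$S[, -seq_len(k), drop = FALSE]     # remaining p-k components
round(cov(fit$S), 6)                          # identity, since S1 = cov here
```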

Two possible bootstrap tests are provided for testing that the last p-k components are gaussian and independent from the first k components:

  1. s.boot="B1": The first strategy has the following steps:

    1. Take a bootstrap sample \(S_1^*\) of size \(n\) from \(S_1\).

    2. Take a bootstrap sample \(S_2^*\) consisting of a matrix with gaussian random variables having \(cov(S_2)\).

    3. Combine \(S^*=(S_1^*, S_2^*)\) and create \(X^*= S^* W\).

    4. Compute the test statistic based on \(X^*\).

    5. Repeat the previous steps n.boot times.

    Note that this bootstrapping test does not use the assumption of "independent components"; it only uses that the last \(p-k\) components are gaussian and independent from the first \(k\) components. Therefore this strategy can be applied in an independent component analysis (ICA) framework and in a non-gaussian components analysis (NGCA) framework.

  2. s.boot="B2": The second strategy has the following steps:

    1. Take a bootstrap sample \(S_1^*\) of size \(n\) from \(S_1\) where the subsampling is done separately for each independent component.

    2. Take a bootstrap sample \(S_2^*\) consisting of a matrix with gaussian random variables having \(cov(S_2)\).

    3. Combine \(S^*=(S_1^*, S_2^*)\) and create \(X^*= S^* W\).

    4. Compute the test statistic based on \(X^*\).

    5. Repeat the previous steps n.boot times.

    This bootstrapping strategy assumes a full ICA model and cannot be used in an NGCA framework. Note that when the goal is to recover the non-gaussian independent components, both scatters used must have the independence property.
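
The sampling steps of both strategies can be sketched in base R. The helper boot_replicate is hypothetical and produces a single bootstrap data set \(X^*\) from a source matrix; it assumes the first k columns of S hold the non-gaussian components, requires 1 <= k < p, and applies W exactly as written in step 3. Computing the test statistic and repeating n.boot times are left out.

```r
# One bootstrap replicate X* from a source matrix S (n x p).
boot_replicate <- function(S, W, k, s.boot = c("B1", "B2")) {
  s.boot <- match.arg(s.boot)
  n <- nrow(S)
  p <- ncol(S)
  stopifnot(k >= 1, k < p)
  S1 <- S[, seq_len(k), drop = FALSE]
  S2 <- S[, -seq_len(k), drop = FALSE]
  S1_star <- if (s.boot == "B1") {
    # B1: resample whole rows, keeping dependence among the k components
    S1[sample.int(n, n, replace = TRUE), , drop = FALSE]
  } else {
    # B2: resample each component separately, as under a full ICA model
    apply(S1, 2, function(col) sample(col, n, replace = TRUE))
  }
  # Gaussian block with covariance cov(S2): Z %*% R has covariance R'R.
  R <- chol(cov(S2))
  S2_star <- matrix(rnorm(n * (p - k)), nrow = n) %*% R
  cbind(S1_star, S2_star) %*% W
}

set.seed(1)
S <- cbind(rexp(300), runif(300), rnorm(300), rnorm(300))
W <- diag(4)  # placeholder unmixing matrix, for illustration only
X_star_B1 <- boot_replicate(S, W, k = 2, s.boot = "B1")
X_star_B2 <- boot_replicate(S, W, k = 2, s.boot = "B2")
dim(X_star_B1)
```

The only difference between the two branches is whether rows of \(S_1\) are drawn jointly or column by column, which is why B2 relies on the independence of the non-gaussian components.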

References

Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.

Nordhausen, K., Oja, H., Tyler, D.E. and Virta, J. (2017), Asymptotic and Bootstrap Tests for the Dimension of the Non-Gaussian Subspace, Signal Processing Letters, 24, 887--891. <doi:10.1109/LSP.2017.2696880>.

Radojicic, U. and Nordhausen, K. (2020), Non-Gaussian Component Analysis: Testing the Dimension of the Signal Subspace. In Maciak, M., Pestas, M. and Schindler, M. (editors) "Analytical Methods in Statistics. AMISTAT 2019", 101--123, Springer, Cham. <doi:10.1007/978-3-030-48814-7_6>.

See Also

ics, FOBIboot, FOBIasymp

Examples

# NOT RUN {
n <- 750
S <- cbind(runif(n), rchisq(n, 2), rexp(n), rnorm(n), rnorm(n), rnorm(n))
A <- matrix(rnorm(36), ncol = 6)
X <- S %*% t(A)

# n.boot is small for demonstration purposes; it should be larger in practice
ICSboot(X, k=1, n.boot=20)

if(require("ICSNP")){

myTyl <- function(X,...) HR.Mest(X,...)$scatter
myT <- function(X,...) tM(X,...)$V

# n.boot is small for demonstration purposes; it should be larger in practice
ICSboot(X, k=3, S1=myT, S2=myTyl, s.boot = "B2", n.boot=20)
}
# }