In independent component analysis (ICA), gaussian components are considered uninteresting.
The function uses bootstrap tests, based on ICS with any combination of two scatter matrices, to decide whether there are p-k gaussian components, where p is the dimension of the data.
The function offers two different bootstrapping strategies.
ICSboot(X, k, S1=cov, S2=cov4, S1args=NULL, S2args=NULL, n.boot = 200, s.boot = "B1")
X: a numeric data matrix with p>1 columns.
k: the number of non-gaussian components under the null.
S1: name of the first scatter matrix function. The function must return only a matrix. Default is cov.
S2: name of the second scatter matrix function. The function must return only a matrix. Default is cov4.
S1args: list with optional additional arguments for S1.
S2args: list with optional additional arguments for S2.
n.boot: number of bootstrapping samples.
s.boot: bootstrapping strategy to be used. Possible values are "B1" and "B2". See details for further information.
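A scatter function passed as S1 or S2 has to return just the matrix, and any extra arguments travel in a list. The sketch below illustrates that contract in base R. The names `trimmed_cov`, `keep`, `X_demo` and `S2args_demo` are hypothetical and invented for this illustration, and the `do.call` line is only an assumption about how such a list can be applied to a function, not a description of the internals of ICSboot.

```r
# Hypothetical scatter function: covariance of the observations whose
# Mahalanobis distance does not exceed the `keep` quantile.
# It returns only a matrix, as required for S1 and S2.
trimmed_cov <- function(X, keep = 0.9) {
  X <- as.matrix(X)
  d <- mahalanobis(X, colMeans(X), cov(X))
  cov(X[d <= quantile(d, keep), , drop = FALSE])
}

set.seed(1)
X_demo <- matrix(rnorm(200 * 3), ncol = 3)

# Extra arguments are collected in a list, in the style of S1args / S2args
S2args_demo <- list(keep = 0.8)

# One plausible way to apply a scatter function together with its argument list
scatter_demo <- do.call(trimmed_cov, c(list(X_demo), S2args_demo))
```

With ICtest loaded, the matching call would follow the usage line above, for example `ICSboot(X, k = 1, S2 = trimmed_cov, S2args = list(keep = 0.8))`.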
A list of class ictest inheriting from class htest containing:
the value of the test statistic.
the p-value of the test.
the number of bootstrapping samples used to obtain the p-value.
character string denoting which test was performed and which scatters were used.
character string giving the name of the data.
character string specifying the alternative hypothesis.
the number of non-gaussian components used in the testing problem.
the transformation matrix to the independent components. Also known as unmixing matrix.
data matrix with the centered independent components.
the underlying eigenvalues.
the location of the data which was subtracted before calculating the independent components.
character string denoting which bootstrapping strategy was used.
While FOBIasymp and FOBIboot always use the two scatters cov and cov4, this function can be used with any two scatter functions. In that case, however, the value of the gaussian eigenvalues is in general not known and depends on the scatter functions used. Therefore the test uses as test statistic the k successive eigenvalues with the smallest variance. This means the default here might differ from FOBIasymp and FOBIboot.
Given eigenvalues \(d_1,...,d_p\), the function thus orders the components in descending order according to the "variance" criterion.
Under the null it is then assumed that the first k interesting components are mutually independent and non-normal and the last p-k components are gaussian.
Let \(d_1,...,d_p\) be the ordered eigenvalues, \(W\) the correspondingly ordered unmixing matrix and \(s_i = W (x_i-MU)\) the corresponding source vectors, which form the rows of the source matrix \(S\). \(S\) can be decomposed into \(S_1\) and \(S_2\), where \(S_1\) is the matrix with the \(k\) non-gaussian components and \(S_2\) the matrix with the gaussian components (under the null).
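The notation above can be made concrete with a small base R sketch for the default pair of scatters. This is only an illustration under stated assumptions: `cov4_sketch` is a hand-written stand-in for a fourth-moment scatter, the toy sources are chosen with heavy tails so that plain descending eigenvalue order places them first, and the "variance" ordering criterion used by the function is not reproduced here.

```r
# Hand-written fourth-moment scatter (stand-in, not the packaged cov4)
cov4_sketch <- function(X) {
  X <- as.matrix(X)
  Xc <- sweep(X, 2, colMeans(X))
  r2 <- mahalanobis(X, colMeans(X), cov(X))   # squared Mahalanobis distances
  crossprod(Xc * r2, Xc) / (nrow(X) * (ncol(X) + 2))
}

set.seed(123)
n <- 500
k <- 2
# two heavy-tailed sources followed by two gaussian ones (toy data)
sources <- cbind(rexp(n), rchisq(n, 1), rnorm(n), rnorm(n))
X <- sources %*% t(matrix(rnorm(16), ncol = 4))
p <- ncol(X)

MU <- colMeans(X)
Xc <- sweep(X, 2, MU)

# Whiten with the first scatter, then diagonalise the second scatter
eig1 <- eigen(cov(X), symmetric = TRUE)
cov_inv_sqrt <- eig1$vectors %*% diag(1 / sqrt(eig1$values)) %*% t(eig1$vectors)
eig2 <- eigen(cov4_sketch(Xc %*% cov_inv_sqrt), symmetric = TRUE)

d <- eig2$values                       # eigenvalues in descending order
W <- t(eig2$vectors) %*% cov_inv_sqrt  # unmixing matrix, rows ordered like d
S <- Xc %*% t(W)                       # row i is s_i = W (x_i - MU)

# Split into the first k components and the remaining p - k components
S_1 <- S[, seq_len(k), drop = FALSE]
S_2 <- S[, (k + 1):p, drop = FALSE]
```

By construction the columns of `S` are centered and have identity covariance with respect to the first scatter.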
Two possible bootstrap tests are provided for testing that the last p-k components are gaussian and independent from the first k components:
s.boot="B1":
The first strategy has the following steps:
Take a bootstrap sample \(S_1^*\) of size \(n\) from \(S_1\).
Take a bootstrap sample \(S_2^*\) consisting of a matrix with gaussian random variables having \(cov(S_2)\).
Combine \(S^*=(S_1^*, S_2^*)\) and create \(X^*= S^* W\).
Compute the test statistic based on \(X^*\).
Repeat the previous steps n.boot times.
Note that in this bootstrapping test the assumption of "independent components" is not used; it is only used that the last \(p-k\) components are gaussian and independent from the first \(k\) components. Therefore this strategy can be applied in an independent component analysis (ICA) framework and in a non-gaussian component analysis (NGCA) framework.
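The resampling part of strategy B1 can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: `S_1` and `S_2` are toy stand-ins for the estimated components, `q` plays the role of p-k, and the back-transformation to \(X^*\) and the test statistic are left out.

```r
set.seed(42)
n <- 300
k <- 2
q <- 3   # number of gaussian components, i.e. p - k

# Toy stand-ins for the estimated source matrices
S_1 <- cbind(rexp(n) - 1, runif(n) - 0.5)
S_2 <- matrix(rnorm(n * q), ncol = q)

# Step 1: resample whole rows of S_1, so any dependence between
# its columns is carried over to the bootstrap sample
idx <- sample.int(n, n, replace = TRUE)
S_1_star <- S_1[idx, , drop = FALSE]

# Step 2: draw fresh gaussian data with covariance cov(S_2)
R <- chol(cov(S_2))                       # t(R) %*% R equals cov(S_2)
S_2_star <- matrix(rnorm(n * q), ncol = q) %*% R

# Step 3 (first half): combine the two parts
S_star <- cbind(S_1_star, S_2_star)
```

Because rows are drawn jointly, nothing in this step relies on the first k components being mutually independent, which matches the note above.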
s.boot="B2":
The second strategy has the following steps:
Take a bootstrap sample \(S_1^*\) of size \(n\) from \(S_1\) where the resampling is done separately for each independent component.
Take a bootstrap sample \(S_2^*\) consisting of a matrix with gaussian random variables having \(cov(S_2)\).
Combine \(S^*=(S_1^*, S_2^*)\) and create \(X^*= S^* W\).
Compute the test statistic based on \(X^*\).
Repeat the previous steps n.boot times.
This bootstrapping strategy assumes a full ICA model and cannot be used in an NGCA framework. Note that when the goal is to recover the non-gaussian independent components both scatters used must have the independence property.
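The only change in strategy B2 is how \(S_1^*\) is drawn. A minimal base R sketch, again with toy stand-ins (`S_1`, `S_2` and `q` for p-k are illustrative names, not objects returned by the function):

```r
set.seed(7)
n <- 300
k <- 2
q <- 3   # number of gaussian components, i.e. p - k

# Toy stand-ins for the estimated source matrices
S_1 <- cbind(rexp(n) - 1, runif(n) - 0.5)
S_2 <- matrix(rnorm(n * q), ncol = q)

# Step 1: resample each column of S_1 on its own, which imposes
# independence between the non-gaussian components
S_1_star <- apply(S_1, 2, function(comp) sample(comp, length(comp), replace = TRUE))

# Step 2: gaussian part with covariance cov(S_2)
S_2_star <- matrix(rnorm(n * q), ncol = q) %*% chol(cov(S_2))

# Step 3 (first half): combine the two parts
S_star <- cbind(S_1_star, S_2_star)
```

Resampling column by column breaks any dependence between the columns of `S_1`, which is why this strategy presupposes the full ICA model.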
Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.
Nordhausen, K., Oja, H., Tyler, D.E. and Virta, J. (2017), Asymptotic and Bootstrap Tests for the Dimension of the Non-Gaussian Subspace, Signal Processing Letters, 24, 887-891. <doi:10.1109/LSP.2017.2696880>.
Radojicic, U. and Nordhausen, K. (2020), Non-Gaussian Component Analysis: Testing the Dimension of the Signal Subspace. In Maciak, M., Pestas, M. and Schindler, M. (editors) "Analytical Methods in Statistics. AMISTAT 2019", 101-123, Springer, Cham. <doi:10.1007/978-3-030-48814-7_6>.
library(ICtest)

n <- 750
# columns 1-3 are non-gaussian sources, columns 4-6 are gaussian
S <- cbind(runif(n), rchisq(n, 2), rexp(n), rnorm(n), rnorm(n), rnorm(n))
# mix the sources with a random 6 x 6 matrix
A <- matrix(rnorm(36), ncol = 6)
X <- S %*% t(A)

# n.boot is small for demonstration purposes and should be larger in practice
ICSboot(X, k = 1, n.boot = 20)

if (require("ICSNP")) {
  # wrappers so that each scatter function returns only the matrix
  myTyl <- function(X, ...) HR.Mest(X, ...)$scatter
  myT <- function(X, ...) tM(X, ...)$V
  # n.boot is small for demonstration purposes and should be larger in practice
  ICSboot(X, k = 3, S1 = myT, S2 = myTyl, s.boot = "B2", n.boot = 20)
}