indepTest: Test Independence of Continuous Random Variables via Empirical Copula

Description

Multivariate independence test based on the empirical copula process, as proposed by Christian Genest and Bruno Rémillard. The test consists of three steps: (i) a simulation step, which simulates the distribution of the test statistics under independence for the sample size under consideration; (ii) the test itself, which computes approximate p-values of the test statistics with respect to the empirical distributions obtained in step (i); and (iii) the display of a graphic, called a dependogram, which helps identify the type of departure from independence, if any. More details can be found in the articles cited in the References section.
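As a rough illustration of steps (i) and (ii), the base-R sketch below computes a global Cramér-von Mises statistic from the empirical copula in closed form, simulates its distribution under independence, and derives an approximate p-value. It is not the package's implementation: the helper cvm_indep(), the rank scaling by n + 1, the p-value formula and the toy data are all assumptions made for this example, and only the global statistic is covered, not the per-subset statistics.

```r
## Hypothetical helper: n * integral over [0,1]^p of (C_n(u) - prod_j u_j)^2,
## where C_n is the empirical copula of the pseudo-observations u.
cvm_indep <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  u <- apply(x, 2, rank) / (n + 1)   # pseudo-observations (scaling is an assumption)
  ## (1/n) * sum_i sum_k prod_j (1 - max(u_ij, u_kj))
  prod_max <- matrix(1, n, n)
  for (j in seq_len(p))
    prod_max <- prod_max * (1 - outer(u[, j], u[, j], pmax))
  term1 <- sum(prod_max) / n
  ## 2 * sum_i prod_j (1 - u_ij^2) / 2
  term2 <- 2 * sum(apply((1 - u^2) / 2, 1, prod))
  term1 - term2 + n / 3^p
}

set.seed(1)
n <- 50; p <- 3; N <- 200

## Step (i): distribution of the statistic under independence, by simulation
null_stats <- replicate(N, cvm_indep(matrix(runif(n * p), n, p)))

## Toy data in which the first two columns are strongly dependent
z <- rnorm(n)
x_dep <- cbind(z, z + 0.1 * rnorm(n), rnorm(n))

## Step (ii): approximate p-value with respect to the simulated distribution
s_dep <- cvm_indep(x_dep)
p_value <- (sum(null_stats >= s_dep) + 0.5) / (N + 1)
p_value
```

Because the statistic depends on the data only through ranks, its distribution under independence depends only on n and p, which is why the simulation step can be run once and reused for any data set of the same size.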

Usage

indepTestSim(n, p, m = p, N = 1000, verbose = TRUE, print.every = NULL)
indepTest(x, d, alpha=0.05)
dependogram(test, pvalues = FALSE, print = FALSE)

Arguments

n

sample size when simulating the distribution of the test statistics under independence.

p

dimension of the data when simulating the distribution of the test statistics under independence.

m

maximum cardinality of the subsets of variables for which a test statistic is to be computed. It makes sense to consider $m \ll p$, especially when p is large.

N

number of repetitions when simulating under independence.

print.every

is deprecated in favor of verbose.

verbose

a logical specifying if progress should be displayed via txtProgressBar.

x

data frame or data matrix containing realizations (one per line) of the random vector whose independence is to be tested.

d

object of class "indepTestDist" as returned by the function indepTestSim(). It can be regarded as the empirical distribution of the test statistics under independence.

alpha

significance level used in the computation of the critical values for the test statistics.

test

object of class "indepTest" as returned by indepTest().

pvalues

logical indicating whether the dependogram should be drawn from the test statistics or from the corresponding p-values.

print

logical indicating whether details should be printed.
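To see why a small m matters when p is large, the following base-R sketch counts the subsets of {1,...,p} whose cardinality lies between 2 and m, that is, the number of subset statistics to be computed. The helper n_subsets() is hypothetical and not part of the package.

```r
## Hypothetical helper: number of subsets of {1,...,p} with cardinality in 2..m
n_subsets <- function(p, m = p) {
  stopifnot(m >= 2, m <= p)
  sum(choose(p, 2:m))
}

n_subsets(5)       # 26, i.e. 2^5 - 5 - 1
n_subsets(5, 3)    # 20
n_subsets(20)      # 1048555
n_subsets(20, 3)   # 1330
```

With p = 20, restricting to m = 3 reduces the number of statistics from over a million to 1330.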

Value

The function indepTestSim() returns an object of class "indepTestDist" whose attributes are: sample.size, data.dimension, max.card.subsets, number.repetitons, subsets (list of the subsets for which test statistics have been computed), subsets.binary (subsets in binary 'integer' notation), dist.statistics.independence (a matrix with N rows containing the values of the test statistics for each subset and each repetition) and dist.global.statistic.independence (a vector of length N containing the values of the global Cramér-von Mises test statistic for each repetition; see the last reference, p. 175).
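
One plausible reading of the binary 'integer' notation is a bit mask in which variable j contributes 2^(j - 1). The base-R sketch below enumerates the subsets for p = 4 and m = 3 and encodes them this way. Both the encoding convention and the ordering of the subsets are assumptions made for illustration; the package may use different ones.

```r
p <- 4; m <- 3

## All subsets of {1,...,p} with cardinality between 2 and m
subsets <- unlist(
  lapply(2:m, function(k) combn(p, k, simplify = FALSE)),
  recursive = FALSE
)

## Assumed encoding: variable j sets bit j - 1 of the integer code
to_binary <- function(s) sum(2^(s - 1))
codes <- vapply(subsets, to_binary, numeric(1))

## Subsets shown next to their integer codes
data.frame(subset = vapply(subsets, paste, character(1), collapse = ","),
           code = codes)
```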
The function indepTest() returns an object of class "indepTest" whose attributes are: subsets, statistics, critical.values, pvalues, fisher.pvalue (a p-value resulting from a combination à la Fisher of the subset statistic p-values), tippett.pvalue (a p-value resulting from a combination à la Tippett of the subset statistic p-values), alpha (global significance level of the test), beta (1 - beta is the significance level per statistic), global.statistic (value of the global Cramér-von Mises statistic derived directly from the independence empirical copula process; see the last reference, p. 175) and global.statistic.pvalue (corresponding p-value).
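
For orientation, the textbook forms of the Fisher and Tippett combinations, together with one way of relating alpha and beta, can be sketched in base R as follows. The p-values are made up, the formulas assume independent subset p-values, and the relation beta = (1 - alpha)^(1/k) is an assumption; the package may calibrate these quantities differently, for instance against the simulated distribution stored in d.

```r
## Made-up subset p-values, for illustration only
pvals <- c(0.20, 0.03, 0.45, 0.008, 0.61)
k <- length(pvals)

## Fisher: -2 * sum(log p_i) is chi-square with 2k degrees of freedom
## when the p-values are independent and uniform
fisher_stat <- -2 * sum(log(pvals))
fisher_p <- pchisq(fisher_stat, df = 2 * k, lower.tail = FALSE)

## Tippett: based on the smallest of the k p-values
tippett_p <- 1 - (1 - min(pvals))^k

## Assumed link between the global level alpha and the per-statistic level 1 - beta:
## k independent tests at level 1 - beta give a global level of 1 - beta^k
alpha <- 0.05
beta <- (1 - alpha)^(1 / k)

c(fisher = fisher_p, tippett = tippett_p, per.statistic.level = 1 - beta)
```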

Details

See the references below for more details, especially the third one.

References

Deheuvels, P. (1979). La fonction de dépendance empirique et ses propriétés: un test non paramétrique d'indépendance, Acad. Roy. Belg. Bull. Cl. Sci., 5th Ser. 65, 274--292.

Deheuvels, P. (1981) A non parametric test for independence, Publ. Inst. Statist. Univ. Paris. 26, 29--50.

Genest, C. and Rémillard, B. (2004) Tests of independence and randomness based on the empirical copula process. Test 13, 335--369.

Genest, C., Quessy, J.-F., and Rémillard, B. (2006). Local efficiency of a Cramér-von Mises test of independence, Journal of Multivariate Analysis 97, 274--294.

Genest, C., Quessy, J.-F., and Rémillard, B. (2007) Asymptotic local efficiency of Cramér-von Mises tests for multivariate independence. The Annals of Statistics 35, 166--191.

Examples

library(copula)

## Consider the following example taken from
## Genest and Remillard (2004), p 352:

x <- matrix(rnorm(500),100,5)
x[,1] <- abs(x[,1]) * sign(x[,2] * x[,3])
x[,5] <- x[,4]/2 + sqrt(3) * x[,5]/2

## In order to test for independence "within" x, the first step consists
## in simulating the distribution of the test statistics under
## independence for the same sample size and dimension,
## i.e. n=100 and p=5. As we are going to consider all the subsets of
## {1,...,5} whose cardinality is between 2 and 5, we set p=m=5.
## This may take a while...

if(copula:::doExtras()) { ## not run, typically:
print(system.time(
d <- indepTestSim(100,5)
))

## The next step consists of performing the test itself:

test <- indepTest(x,d)
## Let us see the results:
print(test)

## Display the dependogram with the details:
dependogram(test, print=TRUE)
} # (not in CRAN checks)

## We could have tested for a weaker form of independence, for instance,
## by only computing statistics for subsets whose cardinality is between 2
## and 3. Consider for instance the following data:
y <- matrix(runif(500),100,5)
## and perform the test:
d <- indepTestSim(100,5,3)
test <- indepTest(y,d)
test
dependogram(test,print=TRUE)

## NB: In order to save d for future use, the save function can be used.
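
The note above mentions save(). A minimal sketch of the round trip follows, using a stand-in list (d_demo, a hypothetical name) in place of an actual "indepTestDist" object so that it runs without the simulation step. The same calls apply to the object returned by indepTestSim().

```r
## Stand-in for the result of indepTestSim(); any R object is handled the same way
d_demo <- list(sample.size = 100, data.dimension = 5)

f <- tempfile(fileext = ".rda")   # use a permanent path to keep the file
save(d_demo, file = f)

rm(d_demo)                        # simulate a fresh session
load(f)                           # restores d_demo under its original name
str(d_demo)
```

Since the simulated distribution depends only on n, p, m and N, a saved object can be reused for any data set with the same sample size and dimension.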
