gofCopula: Goodness-of-fit Tests for Copulas

Description

Goodness-of-fit tests for copulas based on the empirical process comparing the empirical copula with a parametric estimate of the copula derived under the null hypothesis. Approximate p-values for the test statistic can be obtained either using the parametric bootstrap (see the first two references) or by means of a fast multiplier approach (see references three and four).

The default test statistic, "Sn", is the Cramér-von Mises functional $\mathrm{S_n}$ defined in Equation (2) of Genest, Rémillard and Beaudoin (2009).

The principal function is gofCopula() which, depending on the simulation argument, calls either gofPB() or gofMB().

Usage

gofCopula(copula, x, N=1000,
          method=eval(formals(gofTstat)$method),
          estim.method=eval(formals(fitCopula)$method),
          simulation=c("pb", "mult"),
          verbose=TRUE, print.every=NULL,
          optim.method="BFGS", optim.control=list(maxit=20), ...)
gofPB(copula, x, N, method = eval(formals(gofTstat)$method),
      estim.method = eval(formals(fitCopula)$method),
      optim.method = "BFGS", optim.control,
      trafo.method = c("none", "rtrafo", "htrafo"), verbose = TRUE, ...)
gofMB(copula, x, N, method = c("Sn", "Rn"),
      estim.method = eval(formals(fitCopula)$method),
      optim.method = "BFGS", optim.control, verbose = TRUE, useR = FALSE,
      m = 1/2, zeta.m = 0, b = 0.05)

Arguments

copula

object of class "copula" representing the hypothesized copula family.

x

a data matrix that will be transformed to pseudo-observations.

N

number of bootstrap or multiplier replications to be used to simulate realizations of the test statistic under the null hypothesis.

method

a character string specifying the goodness-of-fit test statistic to be used; currently, one of "Sn", "SnB", "SnC", "AnChisq", or "AnGamma"; see gofTstat().

estim.method

a character string specifying the estimation method to be used to estimate the dependence parameter(s); see fitCopula().

simulation

a string specifying the simulation method for generating realizations of the test statistic under the null hypothesis; can be either "pb" (parametric bootstrap) or "mult" (multiplier).

print.every

is deprecated in favor of verbose.

verbose

a logical specifying if progress of the bootstrap should be displayed via txtProgressBar.

optim.method, optim.control

the method and control arguments for optim(), see there.

...

for gofCopula(): additional arguments passed to gofPB() or gofMB();

for gofPB(): additional arguments passed to htrafo() or rtrafo().

trafo.method

string specifying the transformation to $U[0,1]^d$; one of "none", "rtrafo" (see rtrafo()), or "htrafo" (see htrafo()).

useR

logical indicating whether the R or the C implementation is used.

m, zeta.m, b

used only for method "Rn" in gofMB(), the multiplier bootstrap. m is the power, zeta.m the adjustment parameter $\zeta_m$ for the denominator of the test statistic, and b is the bandwidth.

Value

An object of class htest, which is a list; some of its components are

statistic

value of the test statistic.

p.value

corresponding approximate p-value.

parameter

estimates of the parameters for the hypothesized copula family.
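
As a minimal sketch of how these components might be accessed, the following runs a small parametric bootstrap test and pulls the listed elements out of the returned list. The sample size, the seed, the starting parameter value 2, and N = 50 are illustrative assumptions chosen only to keep the run short; they are not recommended settings.

```r
library(copula)

set.seed(1)                                   # illustrative seed, for reproducibility only
x <- rCopula(100, claytonCopula(2))           # simulated bivariate data (assumed setup)

## Small N to keep the sketch fast; the documented default is N = 1000
res <- gofCopula(claytonCopula(2), x, N = 50, verbose = FALSE)

class(res)        # the returned object is of class "htest"
res$statistic     # value of the test statistic
res$p.value       # approximate p-value
res$parameter     # parameter estimate(s) of the hypothesized family
```

Because the result is an ordinary list, the components can be stored or passed on to further computations in the usual way.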

Details

If the parametric bootstrap is used, the dependence parameters of the hypothesized copula family can be estimated either by maximizing the pseudo-likelihood, by inverting Kendall's tau, or by inverting Spearman's rho. If the multiplier approach is used, any estimation method can be used in the bivariate case, but only maximum pseudo-likelihood estimation can be used in the multivariate (multiparameter) case.

For the normal and t copulas, several dependence structures can be hypothesized: "ex" for exchangeable, "ar1" for AR(1), "toep" for Toeplitz, and "un" for unstructured (see ellipCopula()). For the t copula, "df.fixed" has to be set to TRUE, which implies that the degrees of freedom are not considered as a parameter to be estimated.
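
The dispersion structures named above can be illustrated with a short sketch. The dimension, the parameter values, and the degrees of freedom below are arbitrary assumptions; note how the number of correlation parameters depends on the chosen structure.

```r
library(copula)

d <- 3  # illustrative dimension

## One parameter: all pairwise correlations are equal
nc_ex   <- normalCopula(0.5, dim = d, dispstr = "ex")
## One parameter: correlation decays as rho^|i - j|
nc_ar1  <- normalCopula(0.5, dim = d, dispstr = "ar1")
## d - 1 parameters: one per off-diagonal band
nc_toep <- normalCopula(c(0.5, 0.3), dim = d, dispstr = "toep")
## d * (d - 1) / 2 parameters: one per pair
nc_un   <- normalCopula(c(0.5, 0.6, 0.7), dim = d, dispstr = "un")

## t copula with the degrees of freedom held fixed (df = 4 is an arbitrary choice)
tc <- tCopula(0.5, dim = d, dispstr = "ex", df = 4, df.fixed = TRUE)

getSigma(nc_ar1)  # correlation matrix implied by the AR(1) structure
```

Any of these objects could then be passed as the copula argument of gofCopula().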

Thus far, the multiplier approach is implemented for six copula families: the Clayton, Gumbel, Frank, Plackett, normal and t.

Although the processes involved in the multiplier and the parametric bootstrap-based test are asymptotically equivalent under the null, note that the finite-sample behavior of the two tests might differ significantly.

Also note that in the case of the parametric and multiplier bootstraps, the approximate p-value is computed as $$\Bigl(0.5 +\sum_{b=1}^N\mathbf{1}_{\{T_b\ge T\}}\Bigr)/(N+1),$$ where $T$ and $T_b$ denote the test statistic and the bootstrapped test statistic, respectively. This ensures that the approximate p-value is a number strictly between 0 and 1, which is sometimes necessary for further treatments. See Pesarin (2001) for more details.
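
The formula above can be sketched in a few lines of base R. The helper approx_pvalue() and the numeric inputs are hypothetical and only illustrate the arithmetic; this is not the package's internal code.

```r
## Hypothetical helper (not part of the copula package):
## t_obs is the observed statistic T, t_boot the vector of replicates T_b.
approx_pvalue <- function(t_obs, t_boot) {
  N <- length(t_boot)
  (0.5 + sum(t_boot >= t_obs)) / (N + 1)
}

t_boot <- c(0.3, 1.1, 0.7, 0.9)   # made-up replicates, N = 4

approx_pvalue(2.5, t_boot)  # no replicate reaches T: 0.5 / 5 = 0.1
approx_pvalue(0.1, t_boot)  # every replicate exceeds T: 4.5 / 5 = 0.9
```

Even in these two extreme cases the result stays strictly inside (0, 1), which is the property the 0.5 offset and the N + 1 denominator are meant to guarantee.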

References

Genest, C. and Rémillard, B. (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'Institut Henri Poincaré: Probabilités et Statistiques 44, 1096--1127.

Genest, C., Rémillard, B., and Beaudoin, D. (2009). Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics 44, 199--214.

Kojadinovic, I., Yan, J., and Holmes, M. (2011). Fast large-sample goodness-of-fit tests for copulas. Statistica Sinica 21, 841--871.

Kojadinovic, I. and Yan, J. (2011). A goodness-of-fit test for multivariate multiparameter copulas based on multiplier central limit theorems. Statistics and Computing 21, 17--30.

Kojadinovic, I. and Yan, J. (2010). Modeling Multivariate Distributions with Continuous Margins Using the copula R Package. Journal of Statistical Software 34(9), 1--20. http://www.jstatsoft.org/v34/i09/.

Pesarin, F. (2001). Multivariate Permutation Tests: With Applications in Biostatistics. Wiley.

Examples

## the following example is available in batch through
## demo(gofCopula)
library(copula)
## A two-dimensional data example ----------------------------------
x <- rCopula(200, claytonCopula(3))

(tau. <- cor(x, method="kendall")[1,2]) # around 0.5 -- 0.6
## Does the Gumbel family seem to be a good choice?
thG <- iTau(gumbelCopula(), tau.)
gofCopula(gumbelCopula(thG), x)
# SnC: really s..l..o..w.. --- SnB is *EVEN* slower
gofCopula(gumbelCopula(thG), x, method = "SnC")
## What about the Clayton family?
thC <- iTau(claytonCopula(), tau.)
gofCopula(claytonCopula(thC), x)
gofCopula(claytonCopula(thC), x, method = "AnChisq")

## The same with a different estimation method
gofCopula(gumbelCopula (thG), x, estim.method="itau")
gofCopula(claytonCopula(thC), x, estim.method="itau")



## A three-dimensional example  ------------------------------------
x <- rCopula(200, tCopula(c(0.5, 0.6, 0.7), dim = 3, dispstr = "un"))

## Does the Gumbel family seem to be a good choice?
gofCopula(gumbelCopula(1, dim = 3), x)
## What about the t copula?
t.copula <- tCopula(rep(0, 3), dim = 3, dispstr = "un", df.fixed=TRUE)
## this is *VERY* slow currently, so the call is left commented out:
## gofCopula(t.copula, x)

## The same with a different estimation method
gofCopula(gumbelCopula(1, dim = 3), x, estim.method="itau")
gofCopula(t.copula,                 x, estim.method="itau")

## The same using the multiplier approach
gofCopula(gumbelCopula(1, dim = 3), x, simulation="mult")
gofCopula(t.copula,                 x, simulation="mult")