
The goodness-of-fit tests are based, by default, on the empirical
process comparing the empirical copula with a parametric estimate of
the copula derived under the null hypothesis. The default test
statistic, "Sn", is the Cramér-von Mises functional S_n studied in
Genest, Rémillard, and Beaudoin (2009).
Alternative test statistics can be used, in particular if a parametric bootstrap is employed.
The principal function is gofCopula() which, depending on the
simulation argument, calls either gofPB() or gofMB().
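To make the Cramér-von Mises idea concrete, here is a minimal base-R sketch that compares the empirical copula with a fixed null copula at the pseudo-observations. The helper names emp_copula_at_obs() and cvm_stat() are hypothetical and not part of the copula package, and the null copula is fixed here (independence) rather than estimated from the data as gofCopula() does with test.method = "family".

```r
## Empirical copula C_n evaluated at each row of u
## (u: n x d matrix of pseudo-observations).
emp_copula_at_obs <- function(u) {
  d <- ncol(u)
  vapply(seq_len(nrow(u)), function(i) {
    ## proportion of observations that are <= u[i, ] in every coordinate
    mean(colSums(t(u) <= u[i, ]) == d)
  }, numeric(1))
}

## Cramer-von Mises type statistic: sum of squared differences between
## the empirical copula and a hypothesized copula cdf 'null_cdf'.
cvm_stat <- function(u, null_cdf) {
  sum((emp_copula_at_obs(u) - null_cdf(u))^2)
}

set.seed(271)
x <- matrix(runif(200), ncol = 2)          # independent coordinates
u <- apply(x, 2, rank) / (nrow(x) + 1)     # pseudo-observations
Sn_indep <- cvm_stat(u, function(u) apply(u, 1, prod))
Sn_indep
```

For actual testing, gofTstat() and gofCopula() should be preferred; the sketch only illustrates what the statistic measures.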
## Generic [and "rotCopula" method] ------ Main function ------
gofCopula(copula, x, ...)

# S4 method for copula
gofCopula(copula, x, N = 1000,
          method = c("Sn", "SnB", "SnC", "Rn"),
          estim.method = c("mpl", "ml", "itau", "irho", "itau.mpl"),
          simulation = c("pb", "mult"), test.method = c("family", "single"),
          verbose = interactive(), ties = NA,
          ties.method = c("max", "average", "first", "last", "random", "min"),
          fit.ties.meth = eval(formals(rank)$ties.method), ...)

## (Deprecated) internal 'helper' functions : ---
gofPB(copula, x, N, method = c("Sn", "SnB", "SnC"),
      estim.method = c("mpl", "ml", "itau", "irho", "itau.mpl"),
      trafo.method = if(method == "Sn") "none" else c("cCopula", "htrafo"),
      trafoArgs = list(), test.method = c("family", "single"),
      verbose = interactive(), useR = FALSE, ties = NA,
      ties.method = c("max", "average", "first", "last", "random", "min"),
      fit.ties.meth = eval(formals(rank)$ties.method), ...)

gofMB(copula, x, N, method = c("Sn", "Rn"),
      estim.method = c("mpl", "ml", "itau", "irho"),
      test.method = c("family", "single"), verbose = interactive(),
      useR = FALSE, m = 1/2, zeta.m = 0, b = 1/sqrt(nrow(x)),
      ties.method = c("max", "average", "first", "last", "random", "min"),
      fit.ties.meth = eval(formals(rank)$ties.method), ...)
x: a data matrix that will be transformed to pseudo-observations using pobs().

N: number of bootstrap or multiplier replications to be used to obtain approximate realizations of the test statistic under the null hypothesis.

method: a character string specifying the goodness-of-fit test statistic to be used. For simulation = "pb", one of "Sn", "SnB" or "SnC", with trafo.method != "none" required if method != "Sn". For simulation = "mult", one of "Sn" or "Rn", where the latter is the statistic R_n of Genest et al. (2013).

simulation: a string specifying the resampling method for generating approximate realizations of the test statistic under the null hypothesis; can be either "pb" (parametric bootstrap) or "mult" (multiplier).

test.method: a character string specifying the test method to be used. Only in exceptional cases should this be different from the default test.method = "family". If test.method = "single", a test precisely for the provided copula (not its parametric family) is conducted. This makes sense only for specific applications such as testing random number generators.

verbose: a logical specifying if progress of the parametric bootstrap should be displayed via txtProgressBar.

...: for gofCopula, additional arguments passed to gofPB() or gofMB(); for gofPB() and gofMB(), additional arguments passed to fitCopula(). These may notably contain hideWarnings, and optim.method, optim.control, lower, or upper, depending on the optim.method.

trafoArgs: only for the parametric bootstrap. A list of optional arguments passed to the transformation method (see trafo.method above).

useR: logical indicating whether an R or C implementation is used.

ties.method: string specifying how ranks should be computed, except for fitting, if there are ties in any of the coordinate samples of x; passed to pobs.

fit.ties.meth: string specifying how ranks should be computed when fitting by maximum pseudo-likelihood if there are ties in any of the coordinate samples of x; passed to pobs.

ties: only for the parametric bootstrap. Logical indicating whether a version of the parametric bootstrap adapted to the presence of ties in any of the coordinate samples of x should be used; the default value of NA indicates that the presence/absence of ties will be checked for automatically.

m, zeta.m: only for the multiplier with method = "Rn". m is the power and zeta.m is the adjustment parameter.

b: only for the multiplier. b is the bandwidth required for the estimation of the first-order partial derivatives based on the empirical copula.
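The pseudo-observations that x is transformed to can be sketched in base R as column-wise ranks scaled by n + 1. The helper pseudo_obs() below is hypothetical and only mimics the basic behavior of pobs(); it also shows where a ties.method choice enters.

```r
## Hypothetical stand-in for pobs(): column-wise ranks divided by (n + 1),
## so that all values lie strictly inside (0, 1).
pseudo_obs <- function(x, ties.method = "average") {
  x <- as.matrix(x)
  n <- nrow(x)
  apply(x, 2, rank, ties.method = ties.method) / (n + 1)
}

set.seed(1)
x <- cbind(rnorm(5), rexp(5))
u <- pseudo_obs(x)
u

## With ties, the choice of ties.method changes the result:
pseudo_obs(cbind(c(1, 1, 2)), ties.method = "max")
pseudo_obs(cbind(c(1, 1, 2)), ties.method = "average")
```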
An object of class htest, which is a list; some of its components are:

statistic: value of the test statistic.

p.value: corresponding approximate p-value.

parameter: estimates of the parameters for the hypothesized copula family.
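As a small usage sketch, the components of the returned htest object can be read with the usual $ accessor. The example assumes the copula package is installed (it is skipped otherwise) and uses small n and N only to keep the run short.

```r
## Runs only if the 'copula' package is available.
if (requireNamespace("copula", quietly = TRUE)) {
  library(copula)
  set.seed(271)
  x <- rCopula(60, claytonCopula(3))
  res <- gofCopula(claytonCopula(), x, N = 100,
                   simulation = "mult", verbose = FALSE)
  res$statistic   # value of the test statistic
  res$p.value     # approximate p-value
  res$parameter   # estimated parameter(s) of the hypothesized family
}
```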
If the parametric bootstrap is used, the dependence parameters of the hypothesized copula family can be estimated by any estimation method available for the family, up to a few exceptions. If the multiplier is used, any of the rank-based methods can be used in the bivariate case, but only maximum pseudo-likelihood estimation can be used in the multivariate (multiparameter) case.
The price to pay for the higher computational efficiency of the
multiplier is more programming work as certain
partial derivatives need to be computed for each hypothesized
parametric copula family. When estimation is based on maximization of
the pseudo-likelihood, these have been implemented for six copula
families thus far: the Clayton, Gumbel-Hougaard, Frank, Plackett,
normal and t copula families. For other families, numerical
differentiation based on grad() from package
numDeriv is used (and a warning message is displayed).
Although the empirical processes involved in the multiplier and the parametric bootstrap-based test are asymptotically equivalent under the null, the finite-sample behavior of the two tests might differ significantly.
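Both for the parametric bootstrap and the multiplier,
the approximate p-value is computed as
$$(0.5 + \sum_{b=1}^{N} \mathbf{1}_{\{T_b \ge T\}}) / (N + 1),$$
where T and T_b denote the test statistic and the bootstrapped test
statistic, respectively. This ensures that the approximate p-value is
strictly between 0 and 1; see Pesarin (2001) for more details.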
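A minimal base-R sketch of this p-value computation, assuming the continuity-corrected form (0.5 + #{T_b >= T}) / (N + 1), with T the observed statistic and T_b the N simulated ones. The function name approx_pvalue() is hypothetical and not part of the package.

```r
## Approximate p-value from an observed statistic and N simulated statistics.
## The 0.5 and the N + 1 keep the result strictly between 0 and 1.
approx_pvalue <- function(T_obs, T_sim) {
  N <- length(T_sim)
  (0.5 + sum(T_sim >= T_obs)) / (N + 1)
}

p_small <- approx_pvalue(10, c(1, 2, 3))  # no simulated value reaches T_obs
p_large <- approx_pvalue(0,  c(1, 2, 3))  # every simulated value exceeds T_obs
c(p_small, p_large)
```

Even when no simulated statistic reaches the observed one, the result is 0.5 / (N + 1) rather than exactly 0.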
For the normal and t copulas, several dependence structures can be
hypothesized: "ex" for exchangeable, "ar1" for AR(1),
"toep" for Toeplitz, and "un" for unstructured (see
ellipCopula()). For the t copula, "df.fixed" has to be set to TRUE,
which implies that the degrees of freedom are not considered as a
parameter to be estimated.
The former argument print.every is deprecated and not
supported anymore; use verbose instead.
Genest, C., Huang, W., and Dufour, J.-M. (2013). A regularized goodness-of-fit test for copulas. Journal de la Société française de statistique 154, 64--77.
Genest, C. and Rémillard, B. (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'Institut Henri Poincare: Probabilites et Statistiques 44, 1096--1127.
Genest, C., Rémillard, B., and Beaudoin, D. (2009). Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics 44, 199--214.
Kojadinovic, I., Yan, J., and Holmes, M. (2011). Fast large-sample goodness-of-fit tests for copulas. Statistica Sinica 21, 841--871.
Kojadinovic, I. and Yan, J. (2011). A goodness-of-fit test for multivariate multiparameter copulas based on multiplier central limit theorems. Statistics and Computing 21, 17--30.
Kojadinovic, I. and Yan, J. (2010). Modeling Multivariate Distributions with Continuous Margins Using the copula R Package. Journal of Statistical Software 34(9), 1--20, http://www.jstatsoft.org/v34/i09/.
Kojadinovic, I. (2017). Some copula inference procedures adapted to the presence of ties. Computational Statistics and Data Analysis 112, 24--41, http://arxiv.org/abs/1609.05519.
Pesarin, F. (2001). Multivariate Permutation Tests: With Applications in Biostatistics. Wiley.
fitCopula() for the underlying estimation procedure and
gofTstat() for details on *some* of the available test
statistics.
# NOT RUN {
## The following example is available in batch through
## demo(gofCopula)
# }
# NOT RUN {
n <- 200; N <- 1000 # realistic (but too large for interactive use)
n <- 60; N <- 200 # (time (and tree !) saving ...)
## A two-dimensional data example ----------------------------------
set.seed(271)
x <- rCopula(n, claytonCopula(3))
## Does the Gumbel family seem to be a good choice (statistic "Sn")?
gofCopula(gumbelCopula(), x, N=N)
## With "SnC", really s..l..o..w.. --- with "SnB", *EVEN* slower
gofCopula(gumbelCopula(), x, N=N, method = "SnC", trafo.method = "cCopula")
## What about the Clayton family?
gofCopula(claytonCopula(), x, N=N)
## Similar with a different estimation method
gofCopula(gumbelCopula (), x, N=N, estim.method="itau")
gofCopula(claytonCopula(), x, N=N, estim.method="itau")
## A three-dimensional example ------------------------------------
x <- rCopula(n, tCopula(c(0.5, 0.6, 0.7), dim = 3, dispstr = "un"))
## Does the Gumbel family seem to be a good choice?
g.copula <- gumbelCopula(dim = 3)
gofCopula(g.copula, x, N=N)
## What about the t copula?
t.copula <- tCopula(dim = 3, dispstr = "un", df.fixed = TRUE)
if(FALSE) ## this is *VERY* slow currently
  gofCopula(t.copula, x, N=N)
## The same with a different estimation method
gofCopula(g.copula, x, N=N, estim.method="itau")
if(FALSE) # still really slow
  gofCopula(t.copula, x, N=N, estim.method="itau")
## The same using the multiplier approach
gofCopula(g.copula, x, N=N, simulation="mult")
gofCopula(t.copula, x, N=N, simulation="mult")
if(FALSE) # not yet possible
  gofCopula(t.copula, x, N=N, simulation="mult", estim.method="itau")
# }