C.n: The Empirical Copula

Description

Given a random sample from a distribution with continuous margins and copula C, the empirical copula is a natural nonparametric estimator of C. The function C.n() computes the empirical copula.

The function dCn() approximates first-order partial derivatives of the unknown copula.

Usage

C.n(u, U, offset=0, method=c("C", "R"))
F.n(x, X, offset=0, method=c("C", "R"))
dCn(u, U, j.ind=1:d, b=1/sqrt(nrow(U)), ...)
Cn(x, w) ## <-- deprecated!  use  C.n(w, U=pobs(x)) instead!

Arguments

u,x,w

an $(m, d)$-matrix with elements in $[0,1]$ whose rows contain the evaluation points of the empirical copula.

U,X

(and x for Cn():) an $(n, d)$-matrix, for C.n() and Cn() with elements in $[0,1]$ and with the same number $d$ of columns as u (

j.ind

integer vector of indices $j$ between 1 and $d$ indicating the dimensions with respect to which first-order partial derivatives are approximated.

b

numeric giving the bandwidth for approximating first-order partial derivatives.

offset

used in scaling the result which is of the form sum(....)/(n+offset); defaults to zero.

method

character string indicating which method is applied to compute the empirical CDF or copula. method="C" uses an implementation in C, method="R" uses an R implementation.

...

additional arguments passed to C.n().

Value

C.n() and F.n() return a numeric vector of length m: for C.n(), the values of the empirical copula of U at u; for F.n(), the values of the empirical CDF (cumulative distribution function) of X at x. dCn() returns an $(m, l)$-matrix, or an $m$-vector for $l=1$ (here, $l$ is the length of j.ind), containing the approximated first-order partial derivatives of the unknown copula at u.

Details

There are several asymptotically equivalent definitions of the empirical copula. Here, the empirical copula is simply defined as the empirical distribution function computed from the pseudo-observations, that is, $$C_n(\bm{u})=\frac{1}{n}\sum_{i=1}^n\mathbf{1}_{\{\hat{\bm{U}}_i\le\bm{u}\}},$$ where $\hat{\bm{U}}_i$, $i\in\{1,\dots,n\}$, denote the pseudo-observations (rows in U) and $n$ the sample size.
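The definition above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: emp_copula, U_toy and u_toy are hypothetical names introduced here, and the offset argument simply mirrors the sum(....)/(n+offset) scaling described under Arguments.

```r
## Hypothetical helper (not part of the package): evaluates C_n at the rows
## of u directly from the definition, using base R only.
emp_copula <- function(u, U, offset = 0) {
  stopifnot(is.matrix(u), is.matrix(U), ncol(u) == ncol(U))
  n <- nrow(U)
  d <- ncol(U)
  apply(u, 1, function(ui) {
    ## t(U) is d x n; ui is recycled down each column, so entry [j, i]
    ## is TRUE exactly when U[i, j] <= ui[j]
    below <- colSums(t(U) <= ui) == d
    sum(below) / (n + offset)
  })
}

## Toy pseudo-observations (n = 3, d = 2) and two evaluation points
U_toy <- matrix(c(0.25, 0.50, 0.75,
                  0.75, 0.25, 0.50), ncol = 2)
u_toy <- rbind(c(0.5, 0.5), c(1, 1))
emp_copula(u_toy, U_toy)
```

In this toy sample only the row (0.5, 0.25) lies componentwise below (0.5, 0.5), so the first value is 1/3, while every row lies below (1, 1), so the second value is 1.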

The approximation for the $j$th partial derivative of the unknown copula $C$ is implemented as, for example, in Rémillard and Scaillet (2009), and given by $$\hat{\dot{C}}_{jn}(\bm{u})=\frac{C_n(u_1,\dots,u_{j-1},\min(u_j+b,1),u_{j+1},\dots,u_d)-C_n(u_1,\dots,u_{j-1},\max(u_j-b,0),u_{j+1},\dots,u_d)}{2b},$$ where $b$ denotes the bandwidth and $C_n$ the empirical copula.
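The finite-difference formula can be illustrated with a minimal base R sketch. The names ec, d_emp_copula and U_toy are hypothetical and introduced only for this illustration; the code follows the displayed formula term by term (clipping at 0 and 1, then dividing by 2b) and is not the code behind dCn().

```r
## Hypothetical helpers (not part of the package), base R only.
## ec(): empirical copula of the rows of U, evaluated at the rows of u
ec <- function(u, U)
  apply(u, 1, function(ui) mean(colSums(t(U) <= ui) == ncol(U)))

## d_emp_copula(): symmetric difference quotient in the j-th coordinate,
## with the shifted arguments clipped to [0, 1] as in the formula
d_emp_copula <- function(u, U, j, b = 1 / sqrt(nrow(U))) {
  stopifnot(is.matrix(u), is.matrix(U), ncol(u) == ncol(U))
  up <- u
  lo <- u
  up[, j] <- pmin(u[, j] + b, 1)
  lo[, j] <- pmax(u[, j] - b, 0)
  (ec(up, U) - ec(lo, U)) / (2 * b)
}

## Toy pseudo-observations (n = 3, d = 2); derivative w.r.t. the 1st component
U_toy <- matrix(c(0.25, 0.50, 0.75,
                  0.75, 0.25, 0.50), ncol = 2)
d_emp_copula(rbind(c(0.5, 0.5)), U_toy, j = 1, b = 0.25)
```

With such a tiny sample the estimate is very rough (it can even exceed 1); the default bandwidth 1/sqrt(n) shrinks as the sample grows.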

References

Rüschendorf, L. (1976). Asymptotic distributions of multivariate rank order statistics, Annals of Statistics 4, 912--923.

Deheuvels, P. (1979). La fonction de dépendance empirique et ses propriétés: un test non paramétrique d'indépendance, Acad. Roy. Belg. Bull. Cl. Sci., 5th Ser. 65, 274--292.

Deheuvels, P. (1981). A non parametric test for independence, Publ. Inst. Statist. Univ. Paris 26, 29--50.

Rémillard, B. and Scaillet, O. (2009). Testing for equality between two copulas, Journal of Multivariate Analysis 100(3), 377--386.

Examples

n <- 100
d <- 3
family <- "Gumbel"
theta <- 2
cop <- onacopulaL(family, list(theta=theta, 1:d))
set.seed(1)
U <- rCopula(n, cop)

## random points where to evaluate the empirical copula
u <- matrix(runif(n*d), n, d)
ec <- C.n(u, U=U)

## compare with true distribution function
mean(abs(pCopula(u, copula=cop)-ec)) # increase n to decrease this error

## compare the empirical copula and the true copula
## on the diagonal of the unit cube (d = 3 here)
Cn. <- function(x) C.n(do.call(cbind, rep(list(x), d)), U=U)
curve(Cn., 0, 1, main=paste("Diagonal of a", family, "copula"),
      xlab="u", ylab=expression(italic(C)[n](italic(u),..,italic(u))))
pC <- function(x) pCopula(do.call(cbind, rep(list(x), d)), cop)
curve(pC, lty=2, add=TRUE)
legend("topleft", lty=1:2, bty="n", inset=0.02,
       legend=c(expression(italic(C)[n]), expression(italic(C))))

## check the empirical copula with its Kendall distribution function
plot( pK(C.n(U, U=U), cop=cop@copula, d=d) ) # must be uniform

## approximate partial derivatives w.r.t. the 2nd and 3rd component
j.ind <- 2:3
der23 <- dCn(u, U=pobs(U), j.ind=j.ind)
der23. <- copula:::dCdu(archmCopula(family, param=theta, dim=d), u=u)[,j.ind]
summary(as.vector(abs(der23-der23.))) # approximation error summary
U <- U[1:64, ] # such that m != n
stopifnot(suppressWarnings( ## deprecation warning ..
    identical(C.n(u, pobs(U)),
              Cn (U, u))))
## For an example of using F.n(), see help(mvdc)
