copBasic (version 2.2.7)

EMPIRcop: The Bivariate Empirical Copula

Description

The bivariate empirical copula (Nelsen, 2006, p. 219) for a bivariate sample of length \(n\) is defined for random variables \(X\) and \(Y\) as

$$\mathbf{C}_n\biggl(\frac{i}{n}, \frac{j}{n}\biggr) = \frac{\mathrm{number\ of\ pairs\ (}x,y\mathrm{)\ with\ }x \le x_{(i)}\mathrm{\ and\ }y \le y_{(j)}}{n}\mbox{,}$$

where \(x_{(i)}\) and \(y_{(j)}\), \(1 \le i,j \le n\), denote the order statistics of the sample, or expressed as $$\mathbf{C}_n\biggl(\frac{i}{n}, \frac{j}{n}\biggr) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\biggl(\frac{R_i}{n} \le u_i, \frac{S_i}{n} \le v_i \biggr)\mbox{,}$$ where \(R_i\) and \(S_i\) are the ranks of the data for \(U\) and \(V\), and \(\mathbf{1}(.)\) is an indicator function that scores 1 if the condition is true and 0 otherwise. Using more generic notation, the empirical copula can be defined by $$\mathbf{C}_{n}(u,v) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\bigl(u^\mathrm{obs}_{i} \le u_i, v^\mathrm{obs}_{i} \le v_i \bigr)\mbox{,}$$ where \(u^\mathrm{obs}\) and \(v^\mathrm{obs}\) are thus some form of nonparametric nonexceedance probability, that is, counts of the underlying data expressed as probabilities.
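
The rank-based definition can be sketched in a few lines of base R. This is only an illustration of the counting involved, not the copBasic implementation: the helper name emp_cop_1n() and the small sample are assumptions made up for the example.

```r
# Minimal sketch of the "1/n" empirical copula using base R only.
# emp_cop_1n() is a hypothetical helper, not a function of copBasic.
emp_cop_1n <- function(u, v, x, y) {
  n <- length(x)
  # ties.method = "max" makes the rank equal to the count of values <= x
  R <- rank(x, ties.method = "max")
  S <- rank(y, ties.method = "max")
  # For each requested (u, v): fraction of pairs whose scaled ranks
  # do not exceed u and v simultaneously
  mapply(function(uu, vv) mean(R / n <= uu & S / n <= vv), u, v)
}

# Small made-up sample without ties
x <- c(2.1, 3.4, 1.0, 5.6, 4.2)
y <- c(1.5, 4.0, 0.7, 3.9, 5.1)
res <- emp_cop_1n(c(0.4, 1), c(0.6, 1), x, y)
print(res)
```

Two of the five pairs have scaled ranks no larger than (0.4, 0.6), and every pair is counted at (1, 1).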

Hazen Empirical Copula---The “Hazen form” of the empirical copula is $$\mathbf{C}^\mathcal{H}_{n}(u,v) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\biggl(\frac{R_i - 0.5}{n} \le u_i, \frac{S_i - 0.5}{n} \le v_i \biggr)\mbox{,}$$ which can be triggered by ctype="hazen". This form is so named in this package because of the direct similarity of the Hazen plotting position to the above definition. Joe (2014, pp. 247--248) uses the Hazen form. Joe continues by saying “[the] adjustment of the uniform score [\((R - 0.5)/n\)] could be done in an alternative form, but there is [asymptotic] equivalence[, and that] \(\mathbf{C}^\mathcal{H}_{n}\) puts mass of \(n^{-1}\) at the tuples \(([r_{i1} - 0.5]/n, \ldots, [r_{id} - 0.5]/n)\) for \(i = 1, \ldots, n\).” A footnote by Joe (2014) says that “the conversion [\(R/(n+1)\)] is commonly used for the empirical copula.” This latter form is the “Weibull form” described next. Joe prefers the Hazen form because the sum of squared normal scores is then closer to unity for large \(n\) than the sum attained using the Weibull form.

Weibull Empirical Copula---The “Weibull form” of the empirical copula is $$\mathbf{C}^\mathcal{W}_{n}(u,v) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\biggl(\frac{R_i}{n+1} \le u_i, \frac{S_i}{n+1} \le v_i \biggr)\mbox{,}$$ which can be triggered by ctype="weibull". This form is so named in this package because of the direct similarity of the Weibull plotting position to the definition, and this form is the default (see argument description).
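
The three rank-based forms differ only in how the ranks are scaled into uniform scores. The following base R sketch compares them side by side. The helper scaled_ranks() is hypothetical and is not part of copBasic, and the data are made up for illustration.

```r
# Hypothetical helper: uniform scores for the three rank-based forms.
scaled_ranks <- function(x, ctype = c("weibull", "hazen", "1/n")) {
  ctype <- match.arg(ctype)
  n <- length(x)
  R <- rank(x)
  switch(ctype,
         "weibull" = R / (n + 1),    # Weibull plotting position
         "hazen"   = (R - 0.5) / n,  # Hazen plotting position
         "1/n"     = R / n)          # plain empirical copula scaling
}

x  <- c(2.1, 3.4, 1.0, 5.6, 4.2)
# One column of scores per form, in the order weibull, hazen, 1/n
pp <- sapply(c("weibull", "hazen", "1/n"), function(k) scaled_ranks(x, k))
print(pp)
```

Only the 1/n scaling assigns the largest observation a score of exactly 1; the Hazen and Weibull scores stay strictly inside (0, 1).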

Bernstein Empirical Copula---The empirical copula can be extended nonparametrically as the Bernstein empirical copula (Hernández-Maldonado, Díaz-Viera, and Erdely, 2012) and is formulated as

$$\mathbf{C}^\mathcal{B}_n(u,v; \eta) = \sum_{i=1}^n\sum_{j=1}^n \mathbf{C}_{n}\biggl(\frac{i}{n},\frac{j}{n}\biggr) \times \eta(i,j; u,v)\mbox{,}$$ where the individual Bernstein weights \(\eta(i,j)\) for the \(k\)th paired value of the \(u\) and \(v\) vectors are

$$\eta(i,j; u,v) = {n \choose i} u^i (1-u)^{n-i} \times {n \choose j} v^j (1-v)^{n-j}\mbox{.}$$

The Bernstein extension, albeit conceptually pure in its shuffling by binomial coefficients and left- and right-tail weightings, is quite CPU intensive, as inspection of the equations above indicates a nest of four for() loops in R. (The native R code of copBasic uses the sapply() function in R liberally for a substantial, though not blazing-fast, speed increase.) The Bernstein extension results in a smoother surface of the empirical copula and can be triggered by ctype="bernstein".
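
The double sum can be written compactly as a quadratic form in base R, because each Bernstein weight is a product of two binomial probabilities, one in \(u\) and one in \(v\). The sketch below is an illustration under that reading and not the copBasic code. The helper bern_emp_cop() is hypothetical, handles a single \((u, v)\) pair, and assumes data without ties.

```r
# Hypothetical sketch of the Bernstein empirical copula for scalar u and v.
bern_emp_cop <- function(u, v, x, y) {
  n <- length(x)
  R <- rank(x)
  S <- rank(y)
  # Cn[i, j] = C_n(i/n, j/n): proportion of pairs with ranks <= (i, j)
  Cn <- outer(1:n, 1:n, Vectorize(function(i, j) mean(R <= i & S <= j)))
  # dbinom(i, n, p) equals choose(n, i) * p^i * (1 - p)^(n - i)
  wu <- dbinom(1:n, size = n, prob = u)
  wv <- dbinom(1:n, size = n, prob = v)
  # Double sum over i and j expressed as wu' Cn wv
  as.numeric(t(wu) %*% Cn %*% wv)
}

x <- c(2.1, 3.4, 1.0, 5.6, 4.2)
y <- c(1.5, 4.0, 0.7, 3.9, 5.1)
b_mid    <- bern_emp_cop(0.5, 0.5, x, y)
b_margin <- bern_emp_cop(0.3, 1.0, x, y)  # uniform margin: returns u
print(c(b_mid, b_margin))
```

Building the full \(n \times n\) grid costs on the order of \(n^3\) operations in this sketch, which is consistent with the remark above that the extension is CPU intensive.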

Checkerboard Empirical Copula---A simple smoothing to the empirical copula is the checkerboard empirical copula (Segers et al., 2017) that has been adapted from the copula package. It is numerically intensive like the Bernstein and possibly of limited usefulness for large sample sizes. The checkerboard extension can be triggered by ctype="checkerboard" and is formulated as

$$\mathbf{C}^\sharp_{n}(U) = \frac{1}{n+o} \sum_{i=1}^n\prod_{j=1}^d \mathrm{min}[\mathrm{max}[n U_j - R^{(n)}_{i,j} + 1,0],1]\mathrm{,}$$

where \(U\) is a \(d=2\) column matrix of \(u\) and \(v\), \(R\) is a rank function, and \(o\) is an offset term on \([0,1]\).
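
A direct transcription of the checkerboard formula for a single \((u, v)\) pair might look as follows in base R. The helper checker_emp_cop() and its offset argument are hypothetical names chosen for the example. The sketch assumes data without ties and is not the code used by copBasic or the copula package.

```r
# Hypothetical sketch of the checkerboard empirical copula (d = 2).
checker_emp_cop <- function(u, v, x, y, offset = 0) {
  n <- length(x)
  R <- cbind(rank(x), rank(y))  # n x 2 matrix of ranks, R[i, j]
  U <- c(u, v)
  # For observation i: product over the two margins of the terms
  # n * U_j - R[i, j] + 1, each clipped to the interval [0, 1]
  terms <- sapply(1:n, function(i) prod(pmin(pmax(n * U - R[i, ] + 1, 0), 1)))
  sum(terms) / (n + offset)
}

x <- c(2.1, 3.4, 1.0, 5.6, 4.2)
y <- c(1.5, 4.0, 0.7, 3.9, 5.1)
k_mid    <- checker_emp_cop(0.5, 0.5, x, y)
k_margin <- checker_emp_cop(0.3, 1.0, x, y)  # uniform margin when offset = 0
print(c(k_mid, k_margin))
```

With a zero offset the clipped terms interpolate linearly between grid points, so the margins are exactly uniform; a positive offset shrinks all values by the factor \(n/(n+o)\).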

The empirical copula frequency can be defined (Nelsen, 2006, p. 219) as $$\mathbf{c}_n(u, v) = \mathbf{C}_n\biggl(\frac{i}{n}, \frac{j}{n}\biggr) - \mathbf{C}_n\biggl(\frac{i-1}{n}, \frac{j}{n}\biggr) - \mathbf{C}_n\biggl(\frac{i}{n}, \frac{j-1}{n}\biggr) + \mathbf{C}_n\biggl(\frac{i-1}{n}, \frac{j-1}{n}\biggr)\mbox{.}$$
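
The frequency is a second-order difference of the empirical copula over the \(n \times n\) grid. The base R sketch below illustrates this with a hypothetical helper emp_cop_freq() and made-up data without ties; it is not a copBasic function.

```r
# Hypothetical sketch: empirical copula frequency on the (i/n, j/n) grid.
emp_cop_freq <- function(x, y) {
  n <- length(x)
  R <- rank(x)
  S <- rank(y)
  # Pad with a zero row and column because C_n(0, v) = C_n(u, 0) = 0
  Cn <- matrix(0, n + 1, n + 1)
  for (i in 1:n) for (j in 1:n) Cn[i + 1, j + 1] <- mean(R <= i & S <= j)
  # C(i, j) - C(i-1, j) - C(i, j-1) + C(i-1, j-1) for i, j = 1, ..., n
  Cn[-1, -1] - Cn[-(n + 1), -1] - Cn[-1, -(n + 1)] + Cn[-(n + 1), -(n + 1)]
}

x <- c(2.1, 3.4, 1.0, 5.6, 4.2)
y <- c(1.5, 4.0, 0.7, 3.9, 5.1)
f <- emp_cop_freq(x, y)
print(round(f, 3))
```

In the absence of ties, each cell holding an observed rank pair receives mass \(1/n\) and all other cells receive zero, so the entries sum to one.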

Usage

EMPIRcop(u, v, para=NULL,
         ctype=c("weibull", "hazen", "1/n", "bernstein", "checkerboard"),
         bernprogress=FALSE, checkerboard.offset=0, ...)
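
A call might look like the following sketch. It assumes that copBasic is installed and that para can be supplied as a two-column data frame of U-statistics; both are assumptions of this example, and the simulated data are made up. The call is skipped when the package is not available.

```r
# Hedged usage sketch; the data and the form of para are assumptions.
set.seed(1)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50)
n <- length(x)

# U-statistics of the data via Weibull plotting positions
uv <- data.frame(U = rank(x) / (n + 1), V = rank(y) / (n + 1))

res <- NULL
if (requireNamespace("copBasic", quietly = TRUE)) {
  # Empirical copula at (u, v) = (0.5, 0.5) using the Weibull form
  res <- copBasic::EMPIRcop(0.5, 0.5, para = uv, ctype = "weibull")
  print(res)
}
```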

Value

Value(s) for the copula are returned.

Arguments

u

Nonexceedance probability \(u\) in the \(X\) direction;

v

Nonexceedance probability \(v\) in the \(Y\) direction;

para

A vector (single element) of parameters---the U-statistics of the data (see Examples). Alternatively, para can be a list holding a para as would be done if it were a vector, but arguments bernstein and bernprogress can be optionally included---this feature is provided so that the Bernstein refinement can be controlled within the context of other functions calling EMPIRcop such as by level.curvesCOP;

ctype

An alternative means for triggering the definition of \(\mathbf{C}_n\), \(\mathbf{C}^\mathcal{H}_n\), \(\mathbf{C}^\mathcal{W}_n\) (default), \(\mathbf{C}^\mathcal{B}_n\), or \(\mathbf{C}^\sharp_n\). This argument of the same name is also used by blomCOP;

bernprogress

The Bernstein copula extension is CPU intensive(!), so a splash counter is pushed to the console via the message() function in R so as to not discourage the user;

checkerboard.offset

A scaling of the ratio sum(....)/(n+offset) for the checkerboard empirical copula; and

...

Additional arguments to pass.

Author

W.H. Asquith

References

Hernández-Maldonado, V., Díaz-Viera, M., and Erdely, A., 2012, A joint stochastic simulation method using the Bernstein copula as a flexible tool for modeling nonlinear dependence structures between petrophysical properties: Journal of Petroleum Science and Engineering, v. 90--91, pp. 112--123.

Joe, H., 2014, Dependence modeling with copulas: Boca Raton, CRC Press, 462 p.

Nelsen, R.B., 2006, An introduction to copulas: New York, Springer, 269 p.

Salvadori, G., De Michele, C., Kottegoda, N.T., and Rosso, R., 2007, Extremes in Nature---An approach using copulas: Springer, 289 p.

Segers, J., Sibuya, M., and Tsukahara, H., 2017, The empirical beta copula: Journal of Multivariate Analysis, v. 155, pp. 35--51.

See Also

diagCOP, level.curvesCOP, simCOP