Divergence based regression for compositional data

Description

Regression for compositional data based on the Kullback-Leibler divergence, the Jensen-Shannon divergence, the Total Variation divergence and the symmetric Kullback-Leibler divergence.

Usage

kl.compreg(y, x, B = 1, ncores = 1, xnew = NULL, tol = 1e-07, maxiters = 50)
js.compreg(y, x, B = 1, ncores = 1, xnew = NULL)
tv.compreg(y, x, B = 1, ncores = 1, xnew = NULL)
symkl.compreg(y, x, B = 1, ncores = 1, xnew = NULL)

Arguments

y

A matrix with the compositional data (dependent variable). Zero values are allowed.

x

The predictor variable(s); they can be continuous, categorical or both.

B

If B is greater than 1, bootstrap estimates of the standard error are returned. If B = 1, no standard errors are returned.

ncores

If ncores is 2 or more, parallel computing is performed. This applies only to the bootstrap; if B = 1, ncores is ignored.

xnew

New predictor data for which fitted values are wanted, if available; otherwise leave it NULL.

tol

The tolerance value to terminate the Newton-Raphson procedure.

maxiters

The maximum number of Newton-Raphson iterations.
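
To make the roles of y, x, tol and maxiters more concrete, the following is a minimal stand-alone sketch in base R. It is not the package's code: it minimises the Kullback-Leibler objective with optim() and BFGS rather than Newton-Raphson, it assumes a multinomial-logit link with the first component as reference, and all names (fitted_comp, kl_obj, B_true) and the simulated data are hypothetical.

```r
set.seed(1)                                # reproducible simulated data
n <- 100; D <- 3                           # observations, components
X <- cbind(1, rnorm(n))                    # intercept plus one continuous predictor
B_true <- cbind(c(0.5, 1), c(-0.5, 0.3))   # hypothetical 2 x (D - 1) coefficients

# Assumed multinomial-logit link, first component as reference
fitted_comp <- function(b, X, D) {
  eta <- cbind(0, X %*% matrix(b, ncol = D - 1))
  exp(eta) / rowSums(exp(eta))
}

# Noisy compositions scattered around the true means
mu <- fitted_comp(as.vector(B_true), X, D)
g <- matrix(rgamma(n * D, shape = as.vector(20 * mu)), n, D)
y <- g / rowSums(g)

# Kullback-Leibler objective; zero parts of y contribute nothing
kl_obj <- function(b) {
  m <- fitted_comp(b, X, D)
  sum(ifelse(y > 0, y * log(y / m), 0))
}

# reltol and maxit play roles analogous to tol and maxiters
fit <- optim(rep(0, ncol(X) * (D - 1)), kl_obj, method = "BFGS",
             control = list(reltol = 1e-07, maxit = 50))
be <- matrix(fit$par, ncol = D - 1)        # estimated coefficients
```

Comparing be with B_true gives a rough check that minimising the divergence recovers the coefficients used in the simulation.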

Value

A list including:

runtime

The time required by the regression.

iters

The number of Newton-Raphson iterations required by the kl.compreg function.

loglik

The log-likelihood. The model is actually a quasi-multinomial regression, so this is basically minus half the deviance, i.e. \(-\sum_{i=1}^{n} y_i \log\left(y_i / \hat{y}_i\right)\).

be

The beta coefficients.

covbe

The covariance matrix of the beta coefficients, if bootstrap is chosen, i.e. if B > 1.

est

The fitted values for xnew, if xnew is not NULL.
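
As an illustration of the loglik formula, the quantity can be evaluated by hand in base R. The matrices y and yhat below are hypothetical, the sum is assumed to run over all components of every observation, and parts with y = 0 are assumed to contribute zero (the convention 0 * log(0) = 0).

```r
# Hypothetical observed compositions (each row sums to 1); zeros are allowed
y <- rbind(c(0.2, 0.3, 0.5),
           c(0.0, 0.4, 0.6),
           c(0.1, 0.1, 0.8))

# Hypothetical fitted compositions (each row sums to 1)
yhat <- rbind(c(0.25, 0.25, 0.5),
              c(0.05, 0.35, 0.6),
              c(0.10, 0.20, 0.7))

# Element-wise terms y * log(y / yhat); zero parts contribute 0
terms <- ifelse(y > 0, y * log(y / yhat), 0)

# Minus the summed Kullback-Leibler terms
loglik <- -sum(terms)
loglik
```

Because each row of y and yhat sums to 1, the value is never positive and equals zero only when the fitted values reproduce the data exactly.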

Details

In kl.compreg the Kullback-Leibler divergence is adopted as the objective function. In case of problematic convergence, the multinom function from the nnet package is employed instead, which is slower. The js.compreg function uses the Jensen-Shannon divergence, symkl.compreg uses the symmetric Kullback-Leibler divergence and tv.compreg uses the Total Variation divergence. There is no actual log-likelihood for any of these regressions.
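
The four divergences can be sketched for a single pair of compositions in base R. These are the textbook definitions; the scaling constants used inside the package functions may differ, and the helper names kl, symkl, js and tv are hypothetical.

```r
# Kullback-Leibler divergence of q from p; parts with p = 0 contribute 0
kl <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Symmetric Kullback-Leibler: sum of the two directed divergences
# (infinite when one composition has a zero where the other does not)
symkl <- function(p, q) kl(p, q) + kl(q, p)

# Jensen-Shannon: average divergence from the midpoint composition
js <- function(p, q) {
  m <- (p + q) / 2
  0.5 * kl(p, m) + 0.5 * kl(q, m)
}

# Total Variation: half the sum of absolute differences
tv <- function(p, q) 0.5 * sum(abs(p - q))

p <- c(0.20, 0.30, 0.50)   # hypothetical observed composition
q <- c(0.25, 0.25, 0.50)   # hypothetical fitted composition
c(kl = kl(p, q), symkl = symkl(p, q), js = js(p, q), tv = tv(p, q))
```

In a regression setting, the chosen divergence would be summed over all observations, with q replaced by the fitted composition for each row.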

References

Murteira, J. M. R. and Ramalho, J. J. S. (2016). Regression analysis of multivariate fractional data. Econometric Reviews 35(4), 515-552.

Tsagris, Michail (2015). A novel, divergence based, regression for compositional data. Proceedings of the 28th Panhellenic Statistics Conference, 15-18/4/2015, Athens, Greece. https://arxiv.org/pdf/1511.07600.pdf

Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions. IEEE Transactions on Information Theory 49, 1858-1860.

Osterreicher, F. and Vajda, I. (2003). A new class of metric divergences on probability spaces and its applicability in statistics. Annals of the Institute of Statistical Mathematics 55, 639-653.

Examples

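library(MASS)            # provides the fgl (forensic glass) data
library(Compositional)   # provides kl.compreg and js.compreg
x <- as.vector(fgl[, 1])      # first column (refractive index) as the predictor
y <- as.matrix(fgl[, 2:9])    # the eight oxide measurements
y <- y / rowSums(y)           # rescale each row to sum to 1
mod1 <- kl.compreg(y, x, B = 1, ncores = 1)   # Kullback-Leibler regression
mod2 <- js.compreg(y, x, B = 1, ncores = 1)   # Jensen-Shannon regression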
