Tuning of the divergence based regression for compositional data with compositional data in the covariates side using the \(\alpha\)-transformation

Description

Tuning of the divergence-based regression for compositional data with compositional predictor variables, using the \(\alpha\)-transformation. Cross validation is used to choose \(\alpha\) and the number of principal components, k.

Usage

klalfapcr.tune(y, x, covar = NULL, nfolds = 10, maxk = 50, a = seq(-1, 1, by = 0.1),
folds = NULL, graph = FALSE, tol = 1e-07, maxiters = 50, seed = NULL)

Arguments

y

A numerical matrix with compositional data with or without zeros.

x

A matrix with the predictor variables, the compositional data. Zero values are allowed.

covar

If you have other continuous covariates put them here.

nfolds

The number of folds for the cross validation (M in the Details), set to 10 by default.

maxk

The maximum number of principal components to check.

a

A vector with the values of the power transformation to check; each has to be between -1 and 1. If zero values are present it has to be greater than 0. If \(\alpha=0\) the isometric log-ratio transformation is applied.

folds

If you have the list with the folds supply it here. You can also leave it NULL and it will create folds.

graph

If graph is TRUE a plot will appear. The default value is FALSE.

tol

The tolerance value to terminate the Newton-Raphson procedure.

maxiters

The maximum number of Newton-Raphson iterations.

seed

You can specify your own seed number here or leave it NULL.
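
For orientation, the \(\alpha\)-transformation controlled by the a argument can be written as below. This is a sketch following Tsagris, Preston and Wood (2011); the symbols \(D\) (number of components), \(u_\alpha\), \(1_D\) (vector of ones) and \(H\) (the \((D-1) \times D\) Helmert sub-matrix) are notation introduced here and do not appear elsewhere on this page.

\[
u_\alpha(x) = \left( \frac{x_1^\alpha}{\sum_{j=1}^{D} x_j^\alpha}, \ldots, \frac{x_D^\alpha}{\sum_{j=1}^{D} x_j^\alpha} \right),
\qquad
z_\alpha(x) = H \, \frac{D \, u_\alpha(x) - 1_D}{\alpha}
\]

Because \(x_j^\alpha\) is not finite when \(x_j = 0\) and \(\alpha < 0\), only positive values of \(\alpha\) can be used when zeros are present.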

Value

A list including:

mspe

A list with the KL divergence for each value of \(\alpha\) and k in every fold.

performance

A matrix with the KL divergence for each value of \(\alpha\) and k, averaged over all folds. If graph is set to TRUE this matrix is plotted.

best.perf

The minimum KL divergence.

params

The values of \(\alpha\) and k corresponding to the minimum KL divergence.
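
As a reminder of the criterion being reported, the Kullback-Leibler divergence between observed compositions \(y_i\) and fitted compositions \(\hat{y}_i\) in a test fold is presumably of the following form. The indices \(n\) (test-fold size) and \(D\) (number of components of y) are notation assumed here, and any extra scaling the package may apply is not stated on this page.

\[
\mathrm{KL}(y, \hat{y}) = \sum_{i=1}^{n} \sum_{j=1}^{D} y_{ij} \log \frac{y_{ij}}{\hat{y}_{ij}},
\qquad \text{with the convention } 0 \log 0 = 0 .
\]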

Details

The M-fold cross validation (M = nfolds) is performed in order to select the optimal values for \(\alpha\) and k, the number of principal components. The \(\alpha\)-transformation is applied to the compositional data first, then the first k principal component scores are calculated and used as predictor variables for the Kullback-Leibler divergence based regression model. This procedure is performed M times during the M-fold cross validation.
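
The procedure described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helpers alfa_sketch, kl_predict and kl_fit are hypothetical names, the data are simulated, the Helmert projection is skipped (it only rotates the transformed data within its subspace, so the principal component scores should be unchanged up to sign), the principal components are assumed to be computed on the training fold only, and optim replaces the Newton-Raphson iterations.

```r
# Illustrative sketch only: helper names are hypothetical, not package code.
set.seed(1)
n <- 60
# Simulated compositions: positive numbers normalised to sum to 1 per row
x <- matrix(rexp(n * 4), n, 4); x <- x / rowSums(x)
y <- matrix(rexp(n * 3), n, 3); y <- y / rowSums(y)

# alpha-transformation for a != 0, without the Helmert projection
alfa_sketch <- function(x, a) {
  u <- x^a / rowSums(x^a)
  (ncol(x) * u - 1) / a
}

# Fitted compositions from a multinomial-logit mean function
kl_predict <- function(B, z) {
  e <- cbind(1, exp(cbind(1, z) %*% B))
  e / rowSums(e)
}

# Minimising -sum(y * log(est)) is equivalent to minimising the KL divergence,
# because sum(y * log(y)) does not depend on the coefficients
kl_fit <- function(y, z) {
  p <- ncol(z) + 1; d <- ncol(y) - 1
  loss <- function(b) -sum(y * log(kl_predict(matrix(b, p, d), z)))
  matrix(optim(rep(0, p * d), loss, method = "BFGS")$par, p, d)
}

a_grid <- c(0.5, 1)   # candidate values of alpha
maxk <- 2             # maximum number of principal components
folds <- split(sample(n), rep(1:5, length.out = n))
perf <- matrix(0, length(a_grid), maxk,
               dimnames = list(paste0("alpha=", a_grid), paste0("k=", 1:maxk)))

for (f in folds) {
  for (i in seq_along(a_grid)) {
    # PCA of the alpha-transformed training data
    pca <- prcomp(alfa_sketch(x[-f, ], a_grid[i]))
    # Project the test data onto the training principal components
    z_test <- sweep(alfa_sketch(x[f, , drop = FALSE], a_grid[i]), 2, pca$center) %*%
      pca$rotation
    for (k in 1:maxk) {
      B <- kl_fit(y[-f, ], pca$x[, 1:k, drop = FALSE])
      est <- kl_predict(B, z_test[, 1:k, drop = FALSE])
      # Accumulate the test-fold KL divergence, averaged over folds
      perf[i, k] <- perf[i, k] + sum(y[f, ] * log(y[f, ] / est)) / length(folds)
    }
  }
}

perf                                              # average KL per (alpha, k)
best <- which(perf == min(perf), arr.ind = TRUE)  # row = alpha index, col = k
```

Here perf plays the role of the performance matrix and best the role of params; the simulated y contains no zeros, so the \(0 \log 0\) case does not arise in this sketch.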

References

Alenazi A. (2019). Regression for compositional data with compositional data as predictor variables with or without zero values. Journal of Data Science, 17(1): 219-238. http://www.jds-online.com/file_download/688/01+No.10+315+REGRESSION+FOR+COMPOSITIONAL+DATA+WITH+COMPOSITIONAL+DATA+AS+PREDICTOR+VARIABLES+WITH+OR+WITHOUT+ZERO+VALUES.pdf

Tsagris M. (2015). Regression analysis with compositional data containing zero values. Chilean Journal of Statistics, 6(2): 47-57. http://arxiv.org/pdf/1508.01913v1.pdf

Tsagris M.T., Preston S. and Wood A.T.A. (2011). A data-based power transformation for compositional data. In Proceedings of the 4th Compositional Data Analysis Workshop, Girona, Spain. http://arxiv.org/pdf/1106.1451.pdf

Examples

# NOT RUN {
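library(Compositional)  # provides rdiri() and klalfapcr.tune()
library(MASS)           # provides the fgl data
# simulated Dirichlet responses, one row per row of fgl
y <- rdiri( 214, runif(4, 1, 3) )
# normalise the predictors so that each row is a composition
x <- as.matrix( fgl[, 2:9] )
x <- x / rowSums(x)
# cross validate over two candidate values of alpha
mod <- klalfapcr.tune(y = y, x = x, a = c(0.7, 0.8) )
mod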
# }
