DirichletReg (version 0.6-2)

DirichletRegData: Prepare Compositional Data

Description

This function prepares a matrix with compositional variables for further processing in the DirichletReg package.

Usage

DR_data(Y, trafo = sqrt(.Machine$double.eps), base = 1,
    norm_tol = sqrt(.Machine$double.eps))

## S3 method for class 'DirichletRegData':
print(x, type = c("processed", "original"), ...)

## S3 method for class 'DirichletRegData':
summary(object, ...)

Arguments

Y
A matrix or data.frame with nonnegative values of all compositional variables (in some cases, a vector is also permissible, see Details).
trafo
Either a logical or numeric value. Transformation of variables causes the values to shrink away from extreme values of 0 and 1, see Details. If logical, it will force (TRUE) or suppress (FALSE) transfo
base
The base component to use in the reparametrized model
norm_tol
Due to numerical precision, row sums of $\mathbf{Y}$ may not be exactly equal to 1. Therefore, norm_tol is a small non-negative value (default: sqrt(.Machine$double.eps) …
x
A DirichletRegData object
type
Displays either the (possibly normalized or transformed) "processed" or "original" data
object
A DirichletRegData object
...
Further arguments

Value

The function returns a matrix object of class DirichletRegData with the following attributes:

  • attr(*, "dimnames"): a list with two entries, row names (by default NULL) and column names
  • attr(*, "Y.original"): the original data
  • attr(*, "dims"): number of dimensions of Y (i.e., number of columns)
  • attr(*, "dim.names"): the number of components in Y
  • attr(*, "obs"): number of observations of Y (i.e., number of rows)
  • attr(*, "valid_obs"): number of valid observations
  • attr(*, "normalized"): a logical value indicating whether the data were normalized
  • attr(*, "transformed"): a logical value indicating whether the data were transformed
  • attr(*, "base"): number of the variable used as the base in the reparametrized model

Details

Y
Y is a matrix or data.frame containing compositional variables. If the rows do not all sum to 1, normalization is forced: each entry is divided by its row's sum, and a warning is issued that normalization was applied. If one or more entries in a row are NA, the whole row is returned as NA. A beta-distributed variable can be supplied as a single vector, whose values must lie in the interval $[0,\,1]$. The second variable is then generated as 1 - Y, and a matrix consisting of the columns 1 - Y and Y is returned. A message is issued that a beta-distributed variable was assumed and that this assumption needs to be checked.
trafo
The transformation (done if trafo = TRUE) is a generalization of that proposed by Smithson and Verkuilen (2006), which transforms each component $y$ of $Y$ by computing $y^{*}=\frac{y(n-1)+\frac{1}{2}}{n}$, where $n$ is the number of observations in $Y$ (this approach is also used in the package betareg, see Cribari-Neto & Zeileis, 2010). For an arbitrary number of dimensions (or variables) $d$, the transformation is $y^{*}=\frac{y(n-1)+\frac{1}{d}}{n}$.
base
The argument base sets the base (i.e., omitted) component of Y for the alternative (mean/precision) model. By default this is the first variable in Y (if a vector is supplied, the column 1 - Y becomes the base component). Note that this definition can be overruled in DirichReg.
x and object
Objects created by DR_data.
type
Specifies for the print method whether the original or the processed data are displayed.
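The normalization and transformation described in the Details can be illustrated in base R. This is only a minimal sketch of the formulas as stated there, not the package's own code: the helper names normalize_rows and shrink_composition are hypothetical, and $n$ is taken to be the number of rows of the input.

```r
# Hypothetical helpers for illustration; they are not part of DirichletReg.

# Divide every entry by its row sum. A row containing NA has an NA row sum,
# so the whole row becomes NA.
normalize_rows <- function(Y) {
  Y <- as.matrix(Y)
  Y / rowSums(Y)
}

# Apply y* = (y * (n - 1) + 1 / d) / n to each entry, with
# n = number of observations (rows) and d = number of components (columns).
shrink_composition <- function(Y) {
  Y <- as.matrix(Y)
  n <- nrow(Y)
  d <- ncol(Y)
  (Y * (n - 1) + 1 / d) / n
}

# Made-up raw amounts with some exact zeros.
Y <- rbind(c(2, 1, 1), c(0, 3, 1), c(1, 1, 2), c(5, 0, 5))
P <- normalize_rows(Y)       # rows now sum to 1
S <- shrink_composition(P)   # zeros are pulled away from the boundary

rowSums(S)  # the transformed rows still sum to 1
range(S)    # all values lie strictly between 0 and 1

# Beta case: a single vector y becomes the two columns 1 - y and y, and with
# d = 2 the formula reduces to the Smithson and Verkuilen version.
y <- c(0, 0.25, 1)
B <- shrink_composition(cbind(1 - y, y))
```

The transformation keeps each row on the simplex because the added terms sum to $d \cdot \frac{1}{d} = 1$, so each row total becomes $\frac{(n-1)+1}{n} = 1$.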

References

Smithson, M. & Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression With Beta-Distributed Dependent Variables. Psychological Methods, 11(1), 54–71.

Cribari-Neto, F. & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.

Examples

library(DirichletReg)

# inspect the three compositional columns of the Arctic Lake data
head(ArcticLake[, 1:3])

# create a DirichletRegData object from these columns
AL <- DR_data(ArcticLake[, 1:3])
summary(AL)
head(AL)
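
The remaining arguments can be tried on the same data. The sketch below uses only the arguments and attributes documented on this page (trafo, base, type, and the attributes listed under Value). It assumes the DirichletReg package is installed, and the object name AL2 is arbitrary.

```r
library(DirichletReg)

# force the transformation and use the third column as the base component
AL2 <- DR_data(ArcticLake[, 1:3], trafo = TRUE, base = 3)

# attributes described under Value
attr(AL2, "transformed")  # whether the data were transformed
attr(AL2, "normalized")   # whether the rows had to be normalized
attr(AL2, "base")         # index of the base component

# show the data as supplied rather than the processed values
print(AL2, type = "original")
```

Because base can be overruled later in DirichReg, setting it here is mainly a convenience for the alternative (mean/precision) model.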