mmlt: Multivariate Conditional Transformation Models

Description

A proof-of-concept implementation of multivariate conditional transformation models.

Usage

mmlt(..., formula = ~ 1, data, conditional = FALSE, theta = NULL, 
     control.outer = list(trace = FALSE), scale = FALSE, dofit = TRUE)
# S3 method for mmlt
coef(object, newdata, 
                    type = c("all", "marginal", "Lambda", "Lambdainv", 
                             "Sigma", "Corr", "Spearman"), 
                     ...)
# S3 method for mmlt
predict(object, newdata, margins = 1:J, 
                       type = c("trafo", "distribution", "density"), 
                       log = FALSE, ...)
# S3 method for mmlt
simulate(object, nsim = 1L, seed = NULL, newdata, K = 50, ...)

Value

An object of class mmlt with coef and predict methods.

Arguments

...: marginal transformation models, one for each response, for mmlt. Additional arguments for the methods.
formula: a model formula describing a model for the dependency structure via the lambda parameters. The default is set to ~ 1 for constant lambdas.
data: a data.frame.
conditional: logical; if TRUE, parameters are defined conditionally (only possible when all models are probit models). This is the default as described by Klein et al. (2020). If FALSE, parameters can be directly interpreted marginally; this is explained in Section 2.6 of Klein et al. (2020). Using conditional = FALSE with probit-only models gives the same likelihood but different parameter estimates.
theta: an optional vector of starting values.
control.outer: a list controlling auglag.
scale: logical; parameters are not scaled prior to optimisation by default.
dofit: logical; parameters are fitted by default, otherwise a list with log-likelihood and score function is returned.
object: an object of class mmlt.
newdata: an optional data.frame for which coefficients and predictions shall be computed.
type: type of coefficient or prediction to be returned.
margins: indices defining marginal models to be evaluated. Can be single integers giving the marginal distribution of the corresponding variable, or multiple integers (currently only 1:j implemented).
log: logical; return log-probabilities or log-densities if TRUE.
nsim: number of samples to generate.
seed: optional seed for the random number generator.
K: number of grid points to generate.
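
The matrix-valued choices of type are linked by simple algebra. The following base R sketch assumes, as the comments in the example on this page suggest, that Lambda is the unit lower-triangular Cholesky factor of the precision matrix. The value of lambda is hypothetical and no tram function is called, so this illustrates the relationships only and is not the package's implementation.

```r
## hypothetical value of the single lambda parameter of a bivariate model
lambda <- -0.5

## unit lower-triangular Lambda (assumed parameterisation)
Lambda <- matrix(c(1, lambda, 0, 1), nrow = 2)

## inverse of Lambda
Lambdainv <- solve(Lambda)

## covariance of the transformed responses: Sigma = Lambda^{-1} Lambda^{-T}
Sigma <- tcrossprod(Lambdainv)

## correlation matrix obtained by standardising Sigma
Corr <- cov2cor(Sigma)

## off-diagonal element: correlation after transformation to normality
r <- Corr[2, 1]
r
```

In this bivariate setting the correlation has the closed form -lambda / sqrt(1 + lambda^2), so a lambda of zero corresponds to independence.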

Details

The function implements multivariate conditional transformation models as described by Klein et al. (2020). At the moment, the response is assumed to be absolutely continuous; discrete versions will be added later.
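
To make the model structure concrete, the sketch below builds a bivariate density by hand in base R. It assumes the conditional (probit-type) parameterisation, in which h1(Y1) and lambda * h1(Y1) + h2(Y2) are independent standard normal. The log transformation used for both margins and the value of lambda are hypothetical stand-ins for the flexible, estimated quantities, and no tram function is called.

```r
## hypothetical dependence parameter
lambda <- -0.5

## hypothetical marginal transformation and its derivative
h  <- function(y) log(y)
hp <- function(y) 1 / y

## joint density: product of standard normal densities of the two
## transformed components, times the Jacobian of the transformation
joint_density <- function(y1, y2) {
  z1 <- h(y1)
  z2 <- lambda * h(y1) + h(y2)
  dnorm(z1) * dnorm(z2) * hp(y1) * hp(y2)
}

## integrating y2 out at a fixed y1 recovers the marginal density of y1
y1 <- 2
marg <- integrate(function(y2) joint_density(y1, y2),
                  lower = 0, upper = Inf)$value

## the marginal of the first response is phi(h(y1)) * h'(y1)
c(numerical = marg, closed_form = dnorm(h(y1)) * hp(y1))
```

With the log transformation the closed form is the standard log-normal density, which gives a simple check of the construction.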

Below is a simple example for an unconditional bivariate distribution. See demo("undernutrition", package = "tram") for a conditional three-variate example.

References

Nadja Klein, Torsten Hothorn, Luisa Barbanti, Thomas Kneib (2020), Multivariate Conditional Transformation Models. Scandinavian Journal of Statistics, doi:10.1111/sjos.12501.

Examples

  data("cars")

  ### fit unconditional bivariate distribution of speed and distance to stop
  ## fit unconditional marginal transformation models
  m_speed <- BoxCox(speed ~ 1, data = cars, support = ss <- c(4, 25), 
                    add = c(-5, 5))
  m_dist <- BoxCox(dist ~ 1, data = cars, support = sd <- c(0, 120), 
                   add = c(-5, 5))

  ## fit multivariate unconditional transformation model
  m_speed_dist <- mmlt(m_speed, m_dist, formula = ~ 1, data = cars)

  ## log-likelihood
  logLik(m_speed_dist)
  sum(predict(m_speed_dist, newdata = cars, type = "density", log = TRUE))

  ## lambda defining the Cholesky of the precision matrix,
  ## with standard error
  coef(m_speed_dist, type = "Lambda")
  sqrt(vcov(m_speed_dist)["dist.sped.(Intercept)", 
                          "dist.sped.(Intercept)"])

  ## simpler: Wald test of independence of speed and dist (the "dist.sped.(Intercept)"
  ## coefficient)
  summary(m_speed_dist)

  ## linear correlation, ie Pearson correlation of speed and dist after
  ## transformation to bivariate normality
  (r <- coef(m_speed_dist, type = "Corr"))
  
  ## Spearman's rho (rank correlation) of speed and dist on original scale
  (rs <- coef(m_speed_dist, type = "Spearman"))

  ## evaluate joint and marginal densities (needs to be more user-friendly)
  nd <- expand.grid(c(nd_s <- mkgrid(m_speed, 100), nd_d <- mkgrid(m_dist, 100)))
  nd$d <- predict(m_speed_dist, newdata = nd, type = "density")

  ## compute marginal densities
  nd_s <- as.data.frame(nd_s)
  nd_s$d <- predict(m_speed_dist, newdata = nd_s, margins = 1L,
                    type = "density")
  nd_d <- as.data.frame(nd_d)
  nd_d$d <- predict(m_speed_dist, newdata = nd_d, margins = 2L, 
                    type = "density")

  ## plot bivariate and marginal distribution
  col1 <- rgb(.1, .1, .1, .9)
  col2 <- rgb(.1, .1, .1, .5)
  w <- c(.8, .2)
  layout(matrix(c(2, 1, 4, 3), nrow = 2), widths = w, heights = rev(w))
  par(mai = c(1, 1, 0, 0) * par("mai"))
  sp <- unique(nd$speed)
  di <- unique(nd$dist)
  d <- matrix(nd$d, nrow = length(sp))
  contour(sp, di, d, xlab = "Speed (in mph)", ylab = "Distance (in ft)", xlim = ss, ylim = sd,
          col = col1)
  points(cars$speed, cars$dist, pch = 19, col = col2)
  mai <- par("mai")
  par(mai = c(0, 1, 0, 1) * mai)
  plot(d ~ speed, data = nd_s, xlim = ss, type = "n", axes = FALSE, 
       xlab = "", ylab = "")
  polygon(nd_s$speed, nd_s$d, col = col2, border = FALSE)
  par(mai = c(1, 0, 1, 0) * mai)
  plot(dist ~ d, data = nd_d, ylim = sd, type = "n", axes = FALSE, 
       xlab = "", ylab = "")
  polygon(nd_d$d, nd_d$dist, col = col2, border = FALSE)

  ### NOTE: marginal densities are NOT normal, nor is the joint
  ### distribution. The non-normal shape comes from the data-driven 
  ### transformation of both variables to joint normality in this model.
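
The difference between the two correlation summaries in the example (type = "Corr" and type = "Spearman") can be checked by simulation. The base R sketch below assumes a Gaussian copula, for which Spearman's rho equals 6 * asin(r / 2) / pi, with r the Pearson correlation on the latent normal scale. The value of r, the sample size, and the two monotone transformations are hypothetical choices; no fitted model is involved.

```r
set.seed(29)

## hypothetical latent correlation and sample size
r <- 0.6
n <- 1e5

## bivariate standard normal sample with correlation r
z1 <- rnorm(n)
z2 <- r * z1 + sqrt(1 - r^2) * rnorm(n)

## monotone increasing transformations to a non-normal "original" scale
y1 <- exp(z1)
y2 <- z2^3

## Spearman's rho implied by the Gaussian copula
rs_theory <- 6 * asin(r / 2) / pi

## rank correlation on the original scale is invariant to the transformations
rs_sample <- cor(y1, y2, method = "spearman")

## Pearson correlation on the latent scale estimates r itself
r_latent <- cor(z1, z2)

c(rs_theory = rs_theory, rs_sample = rs_sample, r_latent = r_latent)
```

Because ranks do not change under monotone increasing transformations, the rank correlation can be reported on the original scale, whereas the Pearson correlation refers to the transformed, jointly normal scale.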
