mlogit: Multinomial logit model

Description

Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.

Usage

mlogit(
  formula,
  data,
  subset,
  weights,
  na.action,
  start = NULL,
  alt.subset = NULL,
  reflevel = NULL,
  nests = NULL,
  un.nest.el = FALSE,
  unscaled = FALSE,
  heterosc = FALSE,
  rpar = NULL,
  probit = FALSE,
  R = 40,
  correlation = FALSE,
  halton = NULL,
  random.nb = NULL,
  panel = FALSE,
  estimate = TRUE,
  seed = 10,
  ...
)

Value

An object of class `"mlogit"`, a list with elements:

- coefficients: the named vector of coefficients,
- logLik: the value of the log-likelihood,
- hessian: the hessian of the log-likelihood at convergence,
- gradient: the gradient of the log-likelihood at convergence,
- call: the matched call,
- est.stat: some information about the estimation (time used, optimisation method),
- freq: the frequency of choice,
- residuals: the residuals,
- fitted.values: the fitted values,
- formula: the formula (a `Formula` object),
- expanded.formula: the formula (a `formula` object),
- model: the model frame used,
- index: the index of the choice and of the alternatives.
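As a minimal sketch of how these elements might be inspected, the snippet below fits a small model on the `Fishing` data shipped with the package and reads a few components. It assumes the `mlogit` package is installed and that loading it also makes `dfidx()` available; the object names `Fish` and `m` are only illustrative.

```r
library(mlogit)  # assumed to be installed; expected to attach dfidx as well

data("Fishing", package = "mlogit")
Fish <- dfidx(Fishing, varying = 2:9, shape = "wide", choice = "mode")
m <- mlogit(mode ~ price + catch | income, data = Fish)

coef(m)          # the named vector of coefficients (m$coefficients)
logLik(m)        # the value of the log-likelihood
m$freq           # the frequency of choice
m$formula        # the formula stored with the fit
head(fitted(m))  # first few fitted values
```

The standard extractor functions (`coef()`, `logLik()`, `fitted()`) are generally preferable to reaching into the list directly, since they do not depend on the internal layout of the object.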

Arguments

formula: a symbolic description of the model to be estimated,
data: the data: an `mlogit.data` object or an ordinary `data.frame`,
subset: an optional vector specifying a subset of observations for `mlogit`,
weights: an optional vector of weights,
na.action: a function which indicates what should happen when the data contains `NA`s,
start: a vector of starting values,
alt.subset: a vector of character strings containing the subset of alternatives on which the model should be estimated,
reflevel: the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0),
nests: a named list of character vectors; each name identifies a nest and the corresponding vector lists the alternatives that belong to this nest,
un.nest.el: a boolean, if `TRUE`, the hypothesis of unique elasticity is imposed for nested logit models,
unscaled: a boolean, if `TRUE`, the unscaled version of the nested logit model is estimated,
heterosc: a boolean, if `TRUE`, the heteroscedastic logit model is estimated,
rpar: a named vector whose names are the random parameters and whose values are the distributions: `'n'` for normal, `'l'` for log-normal, `'t'` for truncated normal, `'u'` for uniform,
probit: if `TRUE`, a multinomial probit model is estimated,
R: the number of function evaluations for the Gaussian quadrature method used if `heterosc = TRUE`, or the number of draws of pseudo-random numbers if `rpar` is not `NULL`,
correlation: only relevant if `rpar` is not `NULL`, if true, the correlation between random parameters is taken into account,
halton: only relevant if `rpar` is not `NULL`; if not `NULL`, Halton sequences are used instead of pseudo-random numbers. If `halton = NA`, default values are used for the primes of the sequence (the primes are used in order) and for the number of elements dropped. Otherwise, `halton` should be a list with elements `prime` (the primes used) and `drop` (the number of elements dropped).
random.nb: only relevant if `rpar` is not `NULL`, a user-supplied matrix of random numbers,
panel: only relevant if `rpar` is not `NULL` and if the data are repeated observations of the same unit; if `TRUE`, the mixed-logit model is estimated using panel techniques,
estimate: a boolean indicating whether the model should be estimated or not: if not, the `model.frame` is returned,
seed: the seed to use for random numbers (for mixed logit and probit models),
...: further arguments passed to `mlogit.data` or `mlogit.optim`.
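To make the `halton` argument more concrete, here is a base-R sketch of a Halton sequence with a given prime and a number of initial elements dropped, followed by the usual transformation to standard normal draws. The function `halton_seq()` is a hypothetical helper written for this illustration; it is not part of `mlogit`, and the package's internal generator may differ in its details.

```r
# Hypothetical helper: first n elements of the Halton sequence in base
# `prime`, after discarding the first `drop` elements.
halton_seq <- function(n, prime = 3, drop = 0) {
  idx <- seq_len(n) + drop
  vapply(idx, function(i) {
    f <- 1
    r <- 0
    # Radical inverse: reflect the base-`prime` digits of i about the point
    while (i > 0) {
      f <- f / prime
      r <- r + f * (i %% prime)
      i <- i %/% prime
    }
    r
  }, numeric(1))
}

h2 <- halton_seq(4, prime = 2)
print(h2)  # 0.5 0.25 0.75 0.125

# Quasi-random standard normal draws: one prime per random parameter
z_price <- qnorm(halton_seq(50, prime = 2, drop = 10))
z_catch <- qnorm(halton_seq(50, prime = 3, drop = 10))
```

Dropping the first elements is a common precaution because the early terms of sequences built from different primes tend to be correlated.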

Author

Yves Croissant

Details

For how to use the formula argument, see [Formula()].

The `data` argument may be an ordinary `data.frame`. In this case, some supplementary arguments should be provided and are passed to [mlogit.data()]. Note that it is not necessary to indicate the choice argument as it is deduced from the formula.

The model is estimated using the [mlogit.optim()] function.

The basic multinomial logit model and three important extensions of this model may be estimated.
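For the basic model, the choice probabilities have a closed form: the probability of alternative j is exp(V_j) divided by the sum of exp(V_k) over all alternatives. The base-R sketch below illustrates this with hypothetical utility values (they are not estimates from any data set), and `logit_prob()` is a helper defined only for this example.

```r
# Hypothetical systematic utilities for one individual and four alternatives
V <- c(beach = 0.2, pier = 0.5, boat = 1.1, charter = 0.9)

# Logit probabilities: exp(V_j) / sum_k exp(V_k).
# Subtracting max(v) guards against overflow without changing the result.
logit_prob <- function(v) {
  ev <- exp(v - max(v))
  ev / sum(ev)
}

p <- logit_prob(V)
print(round(p, 3))

# Only differences in utility matter: shifting every utility by the same
# constant leaves the probabilities unchanged.
all.equal(p, logit_prob(V + 10))
```

This invariance to a common shift is the reason the coefficients of individual-specific variables have to be normalized to 0 for one alternative (see `reflevel`).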

If `heterosc = TRUE`, the heteroscedastic logit model is estimated. `J - 1` extra coefficients are estimated that represent the scale parameters for `J - 1` alternatives, the scale parameter for the reference alternative being normalized to 1. The probabilities do not have a closed form; they are approximated using a Gaussian quadrature method.

If `nests` is not `NULL`, the nested logit model is estimated.

If `rpar` is not `NULL`, the random parameter model is estimated. The probabilities are approximated using simulations with `R` draws, and Halton sequences are used if `halton` is not `NULL`. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If `correlation = TRUE`, the correlation between the random parameters is taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation, and G * (G + 1) / 2 coefficients are estimated with correlation.
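The role of the Cholesky decomposition can be sketched in base R as follows. The means in `mu` and the lower-triangular factor `L` are hypothetical values chosen for illustration, not estimates, and the code only mimics the general idea rather than reproducing the package internals.

```r
set.seed(10)
G <- 2    # number of random parameters, e.g. price and catch
R <- 50   # number of draws

mu <- c(price = -1, catch = 0.8)   # hypothetical means
L <- rbind(c(0.5, 0.0),            # hypothetical lower-triangular
           c(0.2, 0.3))            # Cholesky factor
Sigma <- L %*% t(L)                # implied covariance matrix

# Free elements of L: G * (G + 1) / 2, against G standard deviations
# when the random parameters are uncorrelated
n_chol <- sum(lower.tri(L, diag = TRUE))

# Standard normal draws, transformed into correlated normal draws
z <- matrix(rnorm(R * G), nrow = R)
beta <- sweep(z %*% t(L), 2, mu, "+")

# A log-normal parameter is obtained by exponentiating a normal draw
lognormal_draws <- exp(mu["catch"] + L[2, 2] * z[, 2])
```

With `G = 2`, `n_chol` equals 3, which matches the G * (G + 1) / 2 count given above.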

Examples

## Cameron and Trivedi's Microeconometrics, p. 493. There are two
## alternative-specific variables (price and catch), one
## individual-specific variable (income) and four fishing modes:
## beach, pier, boat, charter

library(mlogit)  # also makes dfidx() available
data("Fishing", package = "mlogit")
Fish <- dfidx(Fishing, varying = 2:9, shape = "wide", choice = "mode")

## a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))

## a pure "multinomial model"
summary(mlogit(mode ~ 0 | income, data = Fish))

## which can also be estimated using multinom (package nnet)
summary(nnet::multinom(mode ~ income, data = Fishing))

## a "mixed" model
m <- mlogit(mode ~ price + catch | income, data = Fish)
summary(m)

## same model with charter as the reference level
m <- mlogit(mode ~ price + catch | income, data = Fish, reflevel = "charter")

## same model with a subset of alternatives : charter, pier, beach
m <- mlogit(mode ~ price + catch | income, data = Fish,
            alt.subset = c("charter", "pier", "beach"))

## model on unbalanced data i.e. for some observations, some
## alternatives are missing
# a data.frame in wide format with two missing prices
Fishing2 <- Fishing
Fishing2[1, "price.pier"] <- Fishing2[3, "price.beach"] <- NA
mlogit(mode ~ price + catch | income, Fishing2, shape = "wide", varying = 2:9)

# a data.frame in long format with three missing lines
data("TravelMode", package = "AER")
Tr2 <- TravelMode[-c(2, 7, 9),]
mlogit(choice ~ wait + gcost | income + size, Tr2)

## A heteroscedastic logit model
data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode, heterosc = TRUE)

## A nested logit model
TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait)/60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)
# Hensher and Greene (2002), table 1 p.8-9 model 5
TravelMode$incomeother <- with(TravelMode, ifelse(mode %in% c('air', 'car'), income, 0))
nl <- mlogit(choice ~ gcost + wait + incomeother, TravelMode,
             nests = list(public = c('train', 'bus'), other = c('car','air')))
             
# same with a common nest elasticity (model 1)
nl2 <- update(nl, un.nest.el = TRUE)

## a probit model
if (FALSE) {
pr <- mlogit(choice ~ wait + travel + vcost, TravelMode, probit = TRUE)
}

## a mixed logit model
if (FALSE) {
rpl <- mlogit(mode ~ price + catch | income, Fishing, varying = 2:9,
              rpar = c(price= 'n', catch = 'n'), correlation = TRUE,
              halton = NA, R = 50)
summary(rpl)
rpar(rpl)
cor.mlogit(rpl)
cov.mlogit(rpl)
rpar(rpl, "catch")
summary(rpar(rpl, "catch"))
}

## a rank-ordered model
data("Game", package = "mlogit")
g <- mlogit(ch ~ own | hours, Game, varying = 1:12, ranked = TRUE,
            reflevel = "PC", idnames = c("chid", "alt"))