
Maximum Likelihood Estimation of Stochastic Frontier Production and Cost Functions. Two specifications are available: the error components specification with time-varying efficiencies (Battese and Coelli 1992) and a model specification in which the firm effects are directly influenced by a number of variables (Battese and Coelli 1995). This R package uses the Fortran source code of Frontier 4.1 (Coelli 1996).
sfa( formula, data = sys.frame( sys.parent() ),
   ineffDecrease = TRUE, truncNorm = FALSE,
   timeEffect = FALSE, startVal = NULL,
   tol = 0.00001, maxit = 1000, muBound = 2, bignum = 1.0E+16,
   searchStep = 0.00001, searchTol = 0.001, searchScale = NA,
   gridSize = 0.1, gridDouble = TRUE,
   restartMax = 10, restartFactor = 0.999, printIter = 0 )
frontier( yName, xNames = NULL, zNames = NULL, data,
   zIntercept = FALSE, ... )
# S3 method for frontier
print( x, digits = NULL, ... )
formula
a symbolic description of the model to be estimated; it can be either a (usual) one-part or a two-part formula (see section ‘Details’).
data
a (panel) data frame that contains the data; if data is a usual data.frame, it is assumed that these are cross-section data; if data is a panel data frame (created with pdata.frame), it is assumed that these are panel data.
ineffDecrease
logical. If TRUE, inefficiency decreases the endogenous variable (e.g. for estimating a production function); if FALSE, inefficiency increases the endogenous variable (e.g. for estimating a cost function).
truncNorm
logical. If TRUE, the inefficiencies are assumed to have a truncated normal distribution (i.e. parameter μ is added); if FALSE, they are assumed to have a half-normal distribution (only relevant for the ‘Error Components Frontier’).
timeEffect
logical. If FALSE (default), the efficiency estimates of an ‘Error Components Frontier’ are time invariant; if TRUE, time is allowed to have an effect on efficiency (this argument is ignored in case of an ‘Efficiency Effects Frontier’).
startVal
numeric vector. Optional starting values for the ML estimation.
tol
numeric. Convergence tolerance (proportional).
maxit
numeric. Maximum number of iterations permitted.
muBound
numeric. Bounds on the parameter μ.
bignum
numeric. Used to set bounds on densities and distributions.
searchStep
numeric. Size of the first step in the Coggin uni-dimensional search procedure done each iteration to determine the optimal step length for the next iteration (see Himmelblau 1972).
searchTol
numeric. Tolerance used in the Coggin uni-dimensional search procedure done each iteration to determine the optimal step length for the next iteration (see Himmelblau 1972).
searchScale
logical or NA. Scaling in the Coggin uni-dimensional search procedure done each iteration to determine the optimal step length for the next iteration (see Himmelblau 1972): if TRUE, the step length is scaled to the length of the last step; if FALSE, the step length is not scaled; if NA, the step length is scaled (to the length of the last step) only if the last step was smaller.
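gridSize
numeric. The size of the increment in the first phase grid search on γ.
gridDouble
logical. If TRUE, a second phase grid search on γ is conducted around the “best” value obtained in the first phase with an increment of size gridSize/10.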
restartMax
integer: maximum number of restarts of the search procedure when it cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value of the initial parameters.
restartFactor
numeric scalar: if the search procedure cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value of the initial parameters, the initial values (provided by argument startVal or obtained by the grid search) are multiplied by this number before the search procedure is restarted.
printIter
numeric. Print info every printIter iterations; if this argument is 0, do not print.
yName
string: name of the endogenous variable.
xNames
a vector of strings containing the names of the X variables (exogenous variables of the production or cost function).
zNames
a vector of strings containing the names of the Z variables (variables explaining the efficiency level).
zIntercept
logical. If TRUE, an intercept (with parameter δ_0) is added to the Z variables (only relevant for the ‘Efficiency Effects Frontier’).
x
an object of class frontier (returned by the function frontier).
digits
a non-null value for ‘digits’ specifies the minimum number of significant digits to be printed in values. The default, NULL, uses max(3,getOption("digits")-3). Non-integer values will be rounded down, and only values greater than or equal to 1 and no greater than 22 are accepted.
...
additional arguments of frontier are passed to sfa; additional arguments of the print method are currently ignored.
sfa and frontier return a list of class frontier containing the following elements:
integer. A ‘1’ denotes an ‘Error Components Frontier’ (ECF); a ‘2’ denotes an ‘Efficiency Effects Frontier’ (EEF).
logical. Argument ineffDecrease (see above).
number of cross-sections.
number of time periods.
number of observations in total.
number of regressor variables (Xs).
logical. Argument truncNorm.
logical. Argument zIntercept.
logical. Argument timeEffect.
numeric. Argument printIter (see above).
numeric. Argument searchScale (see above).
numeric. Argument tol (see above).
numeric. Argument searchTol (see above).
numeric. Argument bignum (see above).
numeric. Argument searchStep (see above).
logical. Argument gridDouble (see above).
numeric. Argument gridSize (see above).
numeric. Argument maxit (see above).
numeric. Argument muBound (see above).
numeric. Argument restartMax (see above).
numeric. Argument restartFactor (see above).
numeric. Number of restarts of the search procedure when it cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value of the initial parameters.
numeric vector. Argument startVal (only if specified by user).
the matched call.
matrix. Data matrix sent to Frontier 4.1.
numeric vector. OLS estimates.
numeric vector. Standard errors of OLS estimates.
numeric. Log likelihood value of OLS estimation.
numeric vector. Residuals of the OLS estimation.
numeric. Skewness of the residuals of the OLS estimation.
logical. Indicating if the residuals of the OLS estimation have the expected skewness.
numeric vector. Parameters obtained from the grid search (if no starting values were specified).
numeric. Log likelihood value of the parameters obtained from the grid search (only if no starting values were specified).
numeric. Log likelihood value of the starting values for the parameters (only if starting values were specified).
numeric vector. Parameters obtained from ML estimation.
matrix. Covariance matrix of the parameters obtained from the ML estimation.
numeric. Log likelihood value of the ML estimation.
numeric. Number of iterations of the ML estimation.
integer indicating the reason for termination:
1 = log likelihood values and parameters of two successive iterations are within the tolerance limits;
5 = cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value obtained in the previous step;
6 = search failed on gradient step;
10 = maximum number of iterations reached.
Number of evaluations of the log likelihood function during the grid search and the iterative ML estimation.
matrix. Fitted “frontier” values of the dependent variable: each row corresponds to a cross-section; each column corresponds to a time period.
matrix. Residuals: each row corresponds to a cross-section; each column corresponds to a time period.
vector of logical values indicating which observations of the provided data were used for the estimation, i.e. do not have values that are not available (NA, NaN) or infinite (Inf).
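The termination codes listed above can be translated into readable messages with a small lookup. The following is a minimal sketch in base R; the helper name describeTermination is hypothetical and not part of the frontier package, and the code value is supplied by hand rather than read from a fitted object.

```r
# hypothetical helper (not provided by the frontier package):
# maps the documented termination codes to short descriptions
describeTermination <- function( code ) {
   messages <- c(
      "1" = "converged: successive iterations within the tolerance limits",
      "5" = "no parameter vector with a larger log-likelihood value found",
      "6" = "search failed on gradient step",
      "10" = "maximum number of iterations reached" )
   result <- unname( messages[ as.character( code ) ] )
   # codes that are not documented above are reported as unknown
   ifelse( is.na( result ), "unknown code", result )
}

describeTermination( 10 )
describeTermination( 3 )
```

Keeping the messages in a named vector makes it easy to extend the lookup if further codes need to be handled.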
Function frontier is a wrapper function that calls sfa for the estimation. The two functions differ only in the user interface; function frontier has the “old” user interface and is kept to maintain compatibility with older versions of the frontier package.
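As an illustration of the two interfaces, the following sketch estimates the same Cobb-Douglas frontier once with sfa and once with frontier. It assumes that the frontier package and its front41Data data set are available; the column names logOutput, logCapital, and logLabour are introduced here for illustration only, because frontier takes variable names as strings.

```r
library( "frontier" )
data( front41Data )

# pre-compute the logged variables (illustrative column names)
front41Data$logOutput <- log( front41Data$output )
front41Data$logCapital <- log( front41Data$capital )
front41Data$logLabour <- log( front41Data$labour )

# formula interface
fitSfa <- sfa( logOutput ~ logCapital + logLabour, data = front41Data )

# "old" interface with variable names given as strings
fitFrontier <- frontier( yName = "logOutput",
   xNames = c( "logCapital", "logLabour" ), data = front41Data )

# both calls are expected to yield the same estimates
all.equal( unname( coef( fitSfa ) ), unname( coef( fitFrontier ) ) )
```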
One can use functions sfa and frontier to calculate the log likelihood value for a given model, a given data set, and given parameters by using the argument startVal to specify the parameters and using the other arguments to specify the model and the data. The log likelihood value can then be retrieved by the logLik method with argument which set to "start". Setting argument maxit to 0 avoids the (potentially time-consuming) ML estimation and allows retrieving the log likelihood value with the logLik method without further arguments.
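A minimal sketch of this use, assuming the frontier package and its front41Data data set are available. The parameter values in myParam are arbitrary illustrative numbers, not estimates; their assumed order is intercept, the two slope coefficients, sigmaSq, and gamma.

```r
library( "frontier" )
data( front41Data )

# arbitrary illustrative parameter values (assumed order:
# intercept, log( capital ), log( labour ), sigmaSq, gamma)
myParam <- c( 0.5, 0.3, 0.5, 0.2, 0.8 )

# maxit = 0 skips the iterative ML estimation
llOnly <- sfa( log( output ) ~ log( capital ) + log( labour ),
   data = front41Data, startVal = myParam, maxit = 0 )

# log likelihood value at the supplied parameters
logLik( llOnly, which = "start" )

# with maxit = 0, the default call refers to the same parameters
logLik( llOnly )
```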
The frontier function uses the Fortran source code of Tim Coelli's software FRONTIER 4.1 (http://www.uq.edu.au/economics/cepa/frontier.htm) and hence provides the same features as FRONTIER 4.1. Comprehensive documentation of FRONTIER 4.1 is available in the file Front41.pdf that is included in the archive FRONT41-xp1.zip, which is available at http://www.uq.edu.au/economics/cepa/frontier.htm. It is recommended to read this documentation, because the frontier function is based on the FRONTIER 4.1 software.
If argument formula of sfa is a (usual) one-part formula (or argument zNames of frontier is NULL), an ‘Error Components Frontier’ (ECF, see Battese and Coelli 1992) is estimated. If argument formula is a two-part formula (or zNames is not NULL), an ‘Efficiency Effects Frontier’ (EEF, see Battese and Coelli 1995) is estimated. In this case, the first part of the formula (i.e. the part before the “|” symbol) is used to explain the endogenous variable directly (X variables), while the second part of the formula (i.e. the part after the “|” symbol) is used to explain the efficiency levels (Z variables). Generally, there should be no reason for estimating an EEF without Z variables, but this can be done by setting the second part of argument formula to 1 (with Z intercept) or - 1 (without Z intercept) (or by setting argument zNames to NA).
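The effect of the formula on the model type can be sketched as follows, assuming the frontier package and its riceProdPhil data set are available. The object names riceECF and riceEEF0 are chosen here for illustration.

```r
library( "frontier" )
data( riceProdPhil )

# one-part formula: Error Components Frontier (ECF)
riceECF <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = riceProdPhil )

# two-part formula whose second part is only an intercept:
# Efficiency Effects Frontier (EEF) without Z variables
riceEEF0 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) | 1,
   data = riceProdPhil )

# compare the estimated parameters of the two specifications
coef( riceECF )
coef( riceEEF0 )
```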
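In case of an Error Components Frontier (ECF) with the inefficiency terms u following a truncated normal distribution with mean μ, argument muBound can be used to restrict μ to the interval ± muBound * σ_u, where σ_u is the standard deviation of u. If muBound is infinity, zero, or negative, no bounds on μ are imposed.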
Battese, G.E. and T. Coelli (1992), Frontier production functions, technical efficiency and panel data: with application to paddy farmers in India. Journal of Productivity Analysis, 3, 153-169.
Battese, G.E. and T. Coelli (1995), A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics, 20, 325-332.
Coelli, T. (1996), A Guide to FRONTIER Version 4.1: A Computer Program for Stochastic Frontier Production and Cost Function Estimation, CEPA Working Paper 96/08, http://www.uq.edu.au/economics/cepa/frontier.php, University of New England.
Himmelblau, D.M. (1972), Applied Non-Linear Programming, McGraw-Hill, New York.
frontierQuad for quadratic/translog frontiers, summary.frontier for creating and printing summary results, efficiencies.frontier for calculating efficiency estimates, lrtest.frontier for comparing models by LR tests, fitted.frontier for obtaining the fitted “frontier” values, and residuals.frontier for obtaining the residuals.
# NOT RUN {
# example included in FRONTIER 4.1 (cross-section data)
data( front41Data )

# Cobb-Douglas production frontier
cobbDouglas <- sfa( log( output ) ~ log( capital ) + log( labour ),
   data = front41Data )
summary( cobbDouglas )

# load data about rice producers in the Philippines (panel data)
data( riceProdPhil )

# Error Components Frontier (Battese & Coelli 1992)
# with observation-specific efficiencies (ignoring the panel structure)
rice <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = riceProdPhil )
summary( rice )

# Error Components Frontier (Battese & Coelli 1992)
# with "true" fixed individual effects and observation-specific efficiencies
riceTrue <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) +
   factor( FMERCODE ), data = riceProdPhil )
summary( riceTrue )

# add data set with information about its panel structure
library( "plm" )
ricePanel <- pdata.frame( riceProdPhil, c( "FMERCODE", "YEARDUM" ) )

# Error Components Frontier (Battese & Coelli 1992)
# with time-invariant efficiencies
riceTimeInv <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = ricePanel )
summary( riceTimeInv )

# Error Components Frontier (Battese & Coelli 1992)
# with time-variant efficiencies
riceTimeVar <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = ricePanel, timeEffect = TRUE )
summary( riceTimeVar )

# Technical Efficiency Effects Frontier (Battese & Coelli 1995)
# (efficiency effects model with intercept)
riceZ <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
   EDYRS + BANRAT, data = riceProdPhil )
summary( riceZ )

# Technical Efficiency Effects Frontier (Battese & Coelli 1995)
# (efficiency effects model without intercept)
riceZ2 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
   EDYRS + BANRAT - 1, data = riceProdPhil )
summary( riceZ2 )

# Cost Frontier (with land as quasi-fixed input)
riceProdPhil$cost <- riceProdPhil$LABOR * riceProdPhil$LABORP +
   riceProdPhil$NPK * riceProdPhil$NPKP
riceCost <- sfa( log( cost ) ~ log( PROD ) + log( AREA ) + log( LABORP )
   + log( NPKP ), data = riceProdPhil, ineffDecrease = FALSE )
summary( riceCost )
# }