npsf (version 0.8.0)

teradialbc: Statistical Inference Regarding the Radial Measure of Technical Efficiency

Description

Routine teradialbc performs bias correction of the radial Debreu-Farrell input- or output-based measure of technical efficiency, estimates the bias, and constructs confidence intervals via bootstrapping techniques.

Usage

teradialbc(formula, data, subset,
 ref = NULL, data.ref = NULL, subset.ref = NULL,
 rts = c("C", "NI", "V"), base = c("output", "input"),
 homogeneous = TRUE, smoothed = TRUE, kappa = NULL,
 reps = 999, level = 95,
 core.count = 1, cl.type = c("SOCK", "MPI"),
 print.level = 1, dots = TRUE)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model. The details of model specification are given under "Details".

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which teradialbc is called.

subset

an optional vector specifying a subset of observations for which technical efficiency is to be computed.

rts

character or numeric. string: first letter of the word: "c" for constant, "n" for non-increasing, or "v" for variable returns to scale assumption. numeric: 3 for constant, 2 for non-increasing, or 1 for variable returns to scale assumption.

base

character or numeric. string: first letter of the word: "o" for computing the output-based or "i" for computing the input-based technical efficiency measure. numeric: 2 for computing the output-based or 1 for computing the input-based technical efficiency measure.

ref

an object of class "formula" (or one that can be coerced to that class): a symbolic description of inputs and outputs that are used to define the technology reference set. The details of technology reference set specification are given under "Details". If ref is not provided, the technical efficiency measures for data points are computed relative to the technology based on the data points themselves.

data.ref

an optional data frame containing the variables in the technology reference set. If not found in data.ref, the variables are taken from environment(ref), typically the environment from which teradialbc is called.

subset.ref

an optional vector specifying a subset of observations to define the technology reference set.

smoothed

logical. If TRUE, the reference set is bootstrapped with smoothing; if FALSE, the reference set is bootstrapped with subsampling.

homogeneous

logical. Relevant if smoothed=TRUE. If TRUE, the reference set is bootstrapped with homogeneous smoothing; if FALSE, the reference set is bootstrapped with heterogeneous smoothing.

kappa

numeric. Relevant if smoothed=FALSE, that is, when the reference set is bootstrapped with subsampling. 'kappa' sets the size of the subsample as K^kappa, where K is the number of data points in the original reference set. The default value is 0.7. 'kappa' may be between 0.5 and 1.

reps

specifies the number of bootstrap replications to be performed. The default is 999. The minimum is 100. Adequate estimates of confidence intervals using bias-corrected methods typically require 1,000 or more replications.

level

sets confidence level for confidence intervals; default is level = 95.

core.count

positive integer. Number of cluster nodes. If core.count=1, the process runs sequentially. See performParallel in package snowFT for more details.

cl.type

Character string that specifies cluster type (see makeClusterFT in package snowFT). Possible values are 'MPI' and 'SOCK' ('PVM' is currently not available). See performParallel in package snowFT for more details.

dots

logical. Relevant if print.level>=1. If TRUE, one dot character is displayed for each successful replication; if FALSE, display of the replication dots is suppressed.

print.level

numeric. 0 - nothing is printed; 1 - print summary of the model and data. 2 - print summary of technical efficiency measures. 3 - print estimation results observation by observation. Default is 1.
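
To give a feel for what kappa implies for the size of the subsample, the base R snippet below tabulates K^kappa for a few hypothetical reference set sizes. It does not call npsf, and how teradialbc rounds K^kappa to an integer is not stated here, so the raw values are shown.

```r
# Hypothetical reference set sizes and admissible kappa values
K <- c(50, 100, 500, 1000)
kappa <- c(0.5, 0.7, 1)

# Subsample size K^kappa for every combination of K and kappa
sizes <- outer(K, kappa, function(k, a) k^a)
dimnames(sizes) <- list(K = K, kappa = kappa)

round(sizes, 1)
```

With kappa = 1 the subsample is as large as the original reference set; smaller values of kappa shrink it quickly as K grows.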

Value

teradialbc returns a list of class npsf containing the following elements:

K

numeric: number of data points.

M

numeric: number of outputs.

N

numeric: number of inputs.

rts

string: RTS assumption.

base

string: base for efficiency measurement.

reps

numeric: number of bootstrap replications.

level

numeric: confidence level for confidence intervals.

te

numeric: radial measure (Debreu-Farrell) of technical efficiency.

tebc

numeric: bias-corrected radial measures of technical efficiency.

biasboot

numeric: bootstrap bias estimate for the original radial measures of technical efficiency.

varboot

numeric: bootstrap variance estimate for the radial measures of technical efficiency.

biassqvar

numeric: one-third of the ratio of bias squared to variance for radial measures of technical efficiency.

realreps

numeric: actual number of replications used for statistical inference.

telow

numeric: lower bound estimate for radial measures of technical efficiency.

teupp

numeric: upper bound estimate for radial measures of technical efficiency.

teboot

numeric: reps x K matrix containing bootstrapped measures of technical efficiency from each of reps bootstrap replications.

esample

logical: returns TRUE if the observation in user supplied data is in the estimation subsample and FALSE otherwise.
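
How te, teboot, biasboot, varboot, tebc and biassqvar relate to one another can be illustrated with a toy calculation in base R. This is only a sketch: it does not call npsf, the scores and the bootstrap matrix are simulated, and the formulas are the textbook bootstrap bias correction (bias as the mean bootstrap value minus the original estimate). They are assumed here, not taken from the package source, which may differ in detail.

```r
set.seed(42)

K    <- 5     # hypothetical number of data points
reps <- 200   # hypothetical number of bootstrap replications

# Simulated "original" efficiency scores (not real estimates)
te <- runif(K, min = 0.6, max = 0.95)

# Simulated reps x K matrix of bootstrapped scores, one column per data point
teboot <- sapply(te, function(x) x + rnorm(reps, mean = 0.03, sd = 0.02))

# Textbook bootstrap quantities (assumed formulas, see the note above)
biasboot  <- colMeans(teboot) - te        # bootstrap bias estimate
varboot   <- apply(teboot, 2, var)        # bootstrap variance estimate
tebc      <- te - biasboot                # bias-corrected score
biassqvar <- biasboot^2 / (3 * varboot)   # one-third of bias^2 / variance

round(cbind(te, tebc, biasboot, varboot, biassqvar), 4)
```

In the bootstrap literature cited in the references, a value of biassqvar well above 1 is commonly read as a sign that the bias correction is worthwhile, because the correction removes bias at the cost of extra variance.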

Details

Routine teradialbc performs bias correction of the radial Debreu-Farrell input- or output-based measure of technical efficiency, estimates the bias, and constructs confidence intervals via bootstrapping techniques. See Simar and Wilson (1998) 10.1287/mnsc.44.1.49, Simar and Wilson (2000) 10.1080/02664760050081951, Kneip, Simar, and Wilson (2008) 10.1017/S0266466608080651, and the references listed below.

Models for teradialbc are specified symbolically. A typical model has the form outputs ~ inputs, where outputs (inputs) is a series of (numeric) terms which specifies outputs (inputs). The same goes for reference set. Refer to the examples.

If core.count>1, teradialbc will perform the bootstrap on multiple cores. Parallel computing requires package snowFT. By default, the cluster type is defined by option cl.type="SOCK". Specifying cl.type="MPI" requires package Rmpi.

On some systems, specifying option cl.type="SOCK" results in much quicker execution than specifying option cl.type="MPI". Option cl.type="SOCK" might be problematic on Mac systems.

Parallel computing makes a difference for large data sets. Specifying option dots=TRUE will indicate at what speed the bootstrap actually proceeds. Specify reps=100 and compare two runs with option core.count=1 and core.count>1 to see if parallel computing speeds up the bootstrap. For small samples, parallel computing may actually slow down teradialbc.
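
The comparison suggested above could be sketched as follows, assuming packages npsf and snowFT are installed and that at least two cores are available. The choice of the ccr81 data and of core.count = 2 is purely illustrative. The code is skipped if either package is missing, and no timings are shown because they depend on the machine.

```r
# Run only if the required packages are available
if (requireNamespace("npsf", quietly = TRUE) &&
    requireNamespace("snowFT", quietly = TRUE)) {

  data(ccr81, package = "npsf")

  # Sequential run with the minimum number of replications
  t_seq <- system.time(
    npsf::teradialbc(y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5,
                     data = ccr81, reps = 100,
                     dots = FALSE, print.level = 0)
  )

  # Same run on two cluster nodes of type SOCK
  t_par <- system.time(
    npsf::teradialbc(y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5,
                     data = ccr81, reps = 100,
                     dots = FALSE, print.level = 0,
                     core.count = 2, cl.type = "SOCK")
  )

  # Elapsed wall-clock time in seconds for both runs
  print(c(sequential = t_seq[["elapsed"]], parallel = t_par[["elapsed"]]))
}
```

On a data set as small as ccr81, the cluster start-up cost may well dominate, so the parallel run is not expected to be faster; the same pattern applied to a larger data set is where a gain would show.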

Results can be summarized using summary.npsf.
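
A minimal sketch of inspecting the returned object, assuming package npsf is installed: the element names are those listed under Value, while the data set and the reduced number of replications are arbitrary choices made to keep the run short.

```r
# Run only if npsf is available
if (requireNamespace("npsf", quietly = TRUE)) {

  data(ccr81, package = "npsf")
  set.seed(1)

  fit <- npsf::teradialbc(y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5,
                          data = ccr81, reps = 200,
                          dots = FALSE, print.level = 0)

  # Original scores, bias-corrected scores, bias, and confidence bounds
  str(fit[c("te", "tebc", "biasboot", "telow", "teupp")])

  # Summary method for objects of class npsf
  summary(fit)
}
```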

References

Badunenko, O. and Mozharovskyi, P. (2016), Nonparametric Frontier Analysis using Stata, Stata Journal, 16(3), 550--589, 10.1177/1536867X1601600302

Färe, R. and Lovell, C. A. K. (1978), Measuring the technical efficiency of production, Journal of Economic Theory, 19, 150--162, 10.1016/0022-0531(78)90060-1

Färe, R., Grosskopf, S. and Lovell, C. A. K. (1994), Production Frontiers, Cambridge U.K.: Cambridge University Press, 10.1017/CBO9780511551710

Kneip, A., Simar L., and P.W. Wilson (2008), Asymptotics and Consistent Bootstraps for DEA Estimators in Nonparametric Frontier Models, Econometric Theory, 24, 1663--1697, 10.1017/S0266466608080651

Simar, L. and P.W. Wilson (1998), Sensitivity Analysis of Efficiency Scores: How to Bootstrap in Nonparametric Frontier Models, Management Science, 44, 49--61, 10.1287/mnsc.44.1.49

Simar, L. and P.W. Wilson (2000), A General Methodology for Bootstrapping in Nonparametric Frontier Models, Journal of Applied Statistics, 27, 779--802, 10.1080/02664760050081951

See Also

teradial, tenonradial, tenonradialbc, nptestrts, nptestind, sf

Examples

# NOT RUN {
require( npsf )

# Prepare data and matrices

data( pwt56 )
head( pwt56 )

# Create some missing values

pwt56 [49, "K"] <- NA # just to create missing

Y1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("Y"), drop = FALSE] )
X1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("K", "L"), drop = FALSE] )

X1 [51, 2] <- NA # just to create missing
X1 [49, 1] <- NA # just to create missing

data( ccr81 )
head( ccr81 )

# Create some missing values

ccr81 [64, "x4"] <- NA # just to create missing
ccr81 [68, "y2"] <- NA # just to create missing

Y2 <- as.matrix( ccr81[ , c("y1", "y2", "y3"), drop = FALSE] )
X2 <- as.matrix( ccr81[ , c("x1", "x2", "x3", "x4", "x5"), drop = FALSE] )

# Compute output-based measures of technical efficiency under 
# the assumption of CRS (the default) and perform bias-correction
# using smoothed homogeneous bootstrap (the default) with 999
# replications (the default).

t1 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81)

# or just

t2 <- teradialbc ( Y2 ~ X2)

# Combined formula and matrix

t3 <- teradialbc ( Y ~ K + L, data = pwt56, subset = Nu < 10, 
	ref = Y1[-2,] ~ X1[-1,] )

# Compute input-based measures of technical efficiency under 
# the assumption of VRS and perform bias-correction using
# subsampling bootstrap with 1999 replications.
# Choose to report 99% confidence interval. The reference set
# is formed by data points where x5 is not equal to 10. 
# Suppress printing dots.

t4 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, reps = 1999, 
	smoothed = FALSE, kappa = 0.7, dots = FALSE, 
	base = "i", rts = "v", level = 99)

# Compute input-based measures of technical efficiency under
# the assumption of NIRS and perform bias-correction using 
# smoothed heterogeneous bootstrap with 999 replications for 
# all data points. The reference set is formed by data points 
# where x5 is not equal to 10.

t5 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n")


# ===========================
# ===  Parallel computing ===
# ===========================

# Perform previous bias-correction but use 8 cores and 
# cluster type SOCK

t51 <-  teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n", 
	core.count = 8, cl.type = "SOCK")


# Really large data-set

data(usmanuf)
head(usmanuf)

nrow(usmanuf)
table(usmanuf$year)

# This will take some time depending on computer power

t6 <- teradialbc ( Y ~ K + L + M, data = usmanuf, 
	subset = year >= 1999 & year <= 2000, homogeneous = FALSE, 
	base = "o", reps = 100, 
	core.count = 8, cl.type = "SOCK")

# }