ROptEst (version 1.3.4)

RMXEOMSEMBREOBRE: Optimally robust estimation: RMXE, OMSE, MBRE, and OBRE

Description

These are wrapper functions to 'roptest' to compute optimally robust estimates, more specifically RMXEs, OMSEs, MBREs, and OBREs, for L2-differentiable parametric families via k-step construction.

Usage

RMXEstimator(x, L2Fam, fsCor = 1, initial.est, neighbor = ContNeighborhood(),
             steps = 1L, distance = CvMDist, startPar = NULL, verbose = NULL,
             OptOrIter = "iterate", useLast = getRobAStBaseOption("kStepUseLast"),
             withUpdateInKer = getRobAStBaseOption("withUpdateInKer"),
             IC.UpdateInKer = getRobAStBaseOption("IC.UpdateInKer"),
             withICList = getRobAStBaseOption("withICList"),
             withPICList = getRobAStBaseOption("withPICList"), na.rm = TRUE,
             initial.est.ArgList, ..., withLogScale = TRUE, ..withCheck=FALSE,
             withTimings = FALSE, withMDE = NULL, withEvalAsVar = NULL,
             withMakeIC = FALSE, modifyICwarn = NULL, E.argList = NULL,
             diagnostic = FALSE)
OMSEstimator(x, L2Fam, eps=0.5, fsCor = 1, initial.est, neighbor = ContNeighborhood(),
             steps = 1L, distance = CvMDist, startPar = NULL, verbose = NULL,
             OptOrIter = "iterate", useLast = getRobAStBaseOption("kStepUseLast"),
             withUpdateInKer = getRobAStBaseOption("withUpdateInKer"),
             IC.UpdateInKer = getRobAStBaseOption("IC.UpdateInKer"),
             withICList = getRobAStBaseOption("withICList"),
             withPICList = getRobAStBaseOption("withPICList"), na.rm = TRUE,
             initial.est.ArgList, ..., withLogScale = TRUE, ..withCheck=FALSE,
             withTimings = FALSE, withMDE = NULL, withEvalAsVar = NULL,
             withMakeIC = FALSE, modifyICwarn = NULL, E.argList = NULL,
             diagnostic = FALSE)
OBREstimator(x, L2Fam, eff=0.95, fsCor = 1, initial.est, neighbor = ContNeighborhood(),
             steps = 1L, distance = CvMDist, startPar = NULL, verbose = NULL,
             OptOrIter = "iterate", useLast = getRobAStBaseOption("kStepUseLast"),
             withUpdateInKer = getRobAStBaseOption("withUpdateInKer"),
             IC.UpdateInKer = getRobAStBaseOption("IC.UpdateInKer"),
             withICList = getRobAStBaseOption("withICList"),
             withPICList = getRobAStBaseOption("withPICList"), na.rm = TRUE,
             initial.est.ArgList, ..., withLogScale = TRUE, ..withCheck=FALSE,
             withTimings = FALSE, withMDE = NULL, withEvalAsVar = NULL,
             withMakeIC = FALSE, modifyICwarn = NULL, E.argList = NULL,
             diagnostic = FALSE)
MBREstimator(x, L2Fam, fsCor = 1, initial.est, neighbor = ContNeighborhood(),
             steps = 1L, distance = CvMDist, startPar = NULL, verbose = NULL,
             OptOrIter = "iterate", useLast = getRobAStBaseOption("kStepUseLast"),
             withUpdateInKer = getRobAStBaseOption("withUpdateInKer"),
             IC.UpdateInKer = getRobAStBaseOption("IC.UpdateInKer"),
             withICList = getRobAStBaseOption("withICList"),
             withPICList = getRobAStBaseOption("withPICList"), na.rm = TRUE,
             initial.est.ArgList, ..., withLogScale = TRUE, ..withCheck=FALSE,
             withTimings = FALSE, withMDE = NULL, withEvalAsVar = NULL,
             withMakeIC = FALSE, modifyICwarn = NULL, E.argList = NULL,
             diagnostic = FALSE)

Value

Object of class "kStepEstimate". In addition, it has an attribute "timings" where computation time is stored.

Arguments

x

sample

L2Fam

object of class "L2ParamFamily"

eff

positive real (0 <= eff <= 1): amount of asymptotic efficiency loss in the ideal model. See details below.

eps

positive real (0 < eps <= 0.5): amount of gross errors. See details below.

fsCor

positive real: factor used to correct the neighborhood radius; see details.

initial.est

initial estimate for the unknown parameter. If missing, a minimum distance estimator is computed.

neighbor

object of class "UncondNeighborhood"

steps

positive integer: number of steps used for the k-step construction

distance

distance function used in MDEstimator, which in turn is used as (default) starting estimator.

startPar

initial information used by optimize resp. optim; i.e., if the (total) parameter is of length 1, startPar is a search interval, else it is an initial parameter value; if NULL, slot startPar of ParamFamily is used to produce it; in the multivariate case, startPar may also be of class Estimate, in which case slot untransformed.estimate is used.

verbose

logical: if TRUE, some messages are printed

useLast

which parameter estimate (initial estimate or k-step estimate) shall be used to fill the slots pIC, asvar and asbias of the return value.

OptOrIter

character; which method to be used for determining Lagrange multipliers A and a: if (partially) matched to "optimize", getLagrangeMultByOptim is used; otherwise: by default, or if matched to "iterate" or to "doubleiterate", getLagrangeMultByIter is used. More specifically, when using getLagrangeMultByIter, and if argument risk is of class "asGRisk", by default and if matched to "iterate" we use only one (inner) iteration, if matched to "doubleiterate" we use up to Maxiter (inner) iterations.

withUpdateInKer

if there is a non-trivial trafo in the model with matrix \(D\), shall the parameter be updated on \({\rm ker}(D)\)?

IC.UpdateInKer

if there is a non-trivial trafo in the model with matrix \(D\), the IC to be used for this; if NULL the result of getboundedIC(L2Fam,D) is taken; this IC will then be projected onto \({\rm ker}(D)\).

withPICList

logical: shall slot pICList of return value be filled?

withICList

logical: shall slot ICList of return value be filled?

na.rm

logical: if TRUE, the estimator is evaluated at complete.cases(x).

initial.est.ArgList

a list of arguments to be given to argument start if the latter is a function; this list by default already starts with two unnamed items, the sample x, and the model L2Fam.

...

further arguments

withLogScale

logical; shall a scale component (if existing and found with name scalename) be computed on log-scale and backtransformed afterwards? This avoids crossing 0.

..withCheck

logical: if TRUE, debugging info is issued.

withTimings

logical: if TRUE, separate (and aggregate) timings are issued for the three steps: evaluating the starting value, finding the starting influence curve, and evaluating the k-step estimator.

withMDE

logical or NULL: Shall a minimum distance estimator be used as starting estimator, in addition to the function given in slot startPar of the L2 family? If NULL (default), the content of slot .withMDE in the L2 family is used instead to take this decision.

withEvalAsVar

logical or NULL: if TRUE, tells R to evaluate the asymptotic variance; if FALSE, just to produce a call to do so. If withEvalAsVar is NULL (default), the content of slot .withEvalAsVar in the L2 family is used instead to take this decision.

withMakeIC

logical; if TRUE the [p]IC is passed through makeIC before return.

modifyICwarn

logical: should a (warning) information be added if modifyIC is applied and hence some optimality information could no longer be valid? Defaults to NULL in which case this value is taken from RobAStBaseOptions.

E.argList

NULL (default) or a list of arguments to be passed to calls to E from (a) MDEstimator (here this additional argument is only used if initial.est is missing), (b) getStartIC, and (c) kStepEstimator. Potential clashes with arguments of the same name in ... are resolved by inserting the items of argument list E.argList as named items, so in case of collisions the item of E.argList overwrites the existing one from ....

diagnostic

logical; if TRUE, diagnostic information on the performed integrations is gathered and shipped out as an attribute diagnostic of the return value of the estimators.

Author

Matthias Kohl Matthias.Kohl@stamats.de,
Peter Ruckdeschel peter.ruckdeschel@uni-oldenburg.de

Details

The functions compute optimally robust estimators for a given L2-differentiable parametric family; more specifically they are RMXEs, OMSEs, MBREs, and OBREs. The computation uses a k-step construction with an appropriate initial estimate; cf. also kStepEstimator. Valid candidates are e.g. Kolmogorov(-Smirnov) or Cramér-von Mises minimum distance estimators (default); cf. Rieder (1994) and Kohl (2005).

For OMSE, i.e., the asymptotically linear estimator with minimax mean squared error on this neighborhood of given size, the amount of gross errors (contamination) is assumed to be known, and is specified by eps. The radius of the corresponding infinitesimal contamination neighborhood is obtained by multiplying eps by the square root of the sample size.
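The radius rule for OMSE can be illustrated with a short base-R sketch. The helper name infinitesimal_radius is hypothetical and not part of ROptEst; it only restates the multiplication of eps by the square root of the sample size described above.

```r
# Hypothetical helper (not part of ROptEst): radius of the infinitesimal
# contamination neighborhood, computed as eps * sqrt(n).
infinitesimal_radius <- function(eps, n) {
  # eps is documented as a fraction in (0, 0.5]; n is the sample size
  stopifnot(eps > 0, eps <= 0.5, n >= 1)
  eps * sqrt(n)
}

# 5% gross errors in a sample of size 100
r <- infinitesimal_radius(eps = 0.05, n = 100)
print(r)
```

For a fixed eps the radius grows with the sample size, so the same contamination fraction corresponds to a larger neighborhood in larger samples.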

If the amount of gross errors (contamination) is unknown, RMXE should be used, i.e., the radius-minimax estimator in the sense of Rieder et al. (2001, 2008), respectively Section 2.2 of Kohl (2005) is returned.

The OBRE, i.e., the optimal bias-robust (asymptotically linear) estimator (terminology due to Hampel et al. (1985)), expects an efficiency loss (at the ideal model) to be specified and then, according to an (asymptotic) Anscombe criterion, computes the bias bound achieving this efficiency loss.

The MBRE, i.e., the most bias-robust (asymptotically linear) estimator (terminology due to Hampel et al. (1985)), uses the influence curve with minimal possible bias bound, hence minimaxes bias on these neighborhoods (in an infinitesimal sense).

Finite-sample and higher order results suggest that the asymptotically optimal procedure is too liberal. Using fsCor the radius can be modified, as a rule enlarged, to obtain a more conservative estimate. In case of normal location and scale there is the function finiteSampleCorrection, which returns a finite-sample corrected (enlarged) radius based on the results of large Monte-Carlo studies.
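A minimal base-R sketch of the effect of fsCor follows. It assumes that fsCor enters as a multiplicative factor on the radius, which is what the argument description ("factor used to correct the neighborhood radius") suggests but which is not verified here against the package source. The helper corrected_radius and the value 1.2 are purely illustrative.

```r
# Assumption (labeled, not taken from the package source): the corrected
# radius is fsCor * eps * sqrt(n). corrected_radius is a hypothetical helper.
corrected_radius <- function(eps, n, fsCor = 1) {
  stopifnot(fsCor > 0, eps > 0, n >= 1)
  fsCor * eps * sqrt(n)
}

r_asy <- corrected_radius(eps = 0.05, n = 100)               # fsCor = 1: no correction
r_fs  <- corrected_radius(eps = 0.05, n = 100, fsCor = 1.2)  # illustrative enlargement
print(c(asymptotic = r_asy, corrected = r_fs))
```

A factor above 1 enlarges the radius and, under this assumption, yields the more conservative estimate mentioned above.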

The default value of argument useLast is set by the global option kStepUseLast which by default is set to FALSE. In case of general models useLast remains unchanged during the computations. However, if slot CallL2Fam of IC generates an object of class "L2GroupParamFamily" the value of useLast is changed to TRUE. Explicitly setting useLast to TRUE should be done with care as in this situation the influence curve is re-computed using the value of the one-step estimate which may take quite a long time depending on the model.

If useLast is set to TRUE the computation of asvar, asbias and IC is based on the k-step estimate.
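The current setting of the global option can be queried before calling one of the estimators. The sketch below uses getRobAStBaseOption, which appears in the usage section above, and guards the call so that it only runs when the RobAStBase package is installed.

```r
# Query the global default that feeds the useLast argument.
# Guarded so the snippet does not fail where RobAStBase is not installed.
if (requireNamespace("RobAStBase", quietly = TRUE)) {
  use_last <- RobAStBase::getRobAStBaseOption("kStepUseLast")
} else {
  use_last <- NA  # package not available; value unknown
  message("RobAStBase is not installed; cannot query kStepUseLast")
}
print(use_last)
```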

All these estimators are realized as wrappers to function roptest.

Timings for the steps run through in these estimators are available in attributes timings, and for the step of the kStepEstimator in kStepTimings.
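The timing attributes can be read with attr(). The following sketch assumes ROptEst is installed and uses an uncontaminated binomial sample chosen only for illustration; no particular timing values are implied.

```r
# Read the timing attributes described above; guarded so the snippet
# is skipped where ROptEst is not installed.
ran <- FALSE
if (requireNamespace("ROptEst", quietly = TRUE)) {
  library(ROptEst)
  set.seed(123)
  x <- rbinom(100, size = 25, prob = 0.25)   # illustrative sample
  est <- OMSEstimator(x, BinomFamily(size = 25), steps = 3,
                      withTimings = TRUE)
  timings <- attr(est, "timings")              # timings of the overall steps
  k_step_timings <- attr(est, "kStepTimings")  # timings inside kStepEstimator
  print(timings)
  print(k_step_timings)
  ran <- TRUE
}
```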

One may also use the arguments startCtrl, startICCtrl, and kStepCtrl of function robest. This allows for individual settings of E.argList, withEvalAsVar, and withMakeIC for the different steps. If any of the three arguments startCtrl, startICCtrl, and kStepCtrl is used, the respective attributes set in the corresponding argument are used and, if colliding with arguments directly passed to the estimator function, the directly passed ones are ignored.

Diagnostics on the involved integrations are available if argument diagnostic is TRUE. Then there are attributes diagnostic and kStepDiagnostic attached to the return value, which may be inspected and assessed through showDiagnostic and getDiagnostic.

References

Kohl, M. (2005) Numerical Contributions to the Asymptotic Theory of Robustness. Bayreuth: Dissertation. https://epub.uni-bayreuth.de/id/eprint/839/2/DissMKohl.pdf.

Kohl, M. and Ruckdeschel, P. (2010) R package distrMod: Object-Oriented Implementation of Probability Models. J. Statist. Softw. 35(10), 1--27. doi:10.18637/jss.v035.i10.

Kohl, M., Ruckdeschel, P., and Rieder, H. (2010) Infinitesimally Robust Estimation in General Smoothly Parametrized Models. Stat. Methods Appl. 19, 333--354. doi:10.1007/s10260-010-0133-0.

Rieder, H. (1994) Robust Asymptotic Statistics. New York: Springer. doi:10.1007/978-1-4684-0624-5.

Rieder, H., Kohl, M. and Ruckdeschel, P. (2008) The Costs of not Knowing the Radius. Statistical Methods and Applications 17(1), 13--40. doi:10.1007/s10260-007-0047-7.

Rieder, H., Kohl, M. and Ruckdeschel, P. (2001) The Costs of not Knowing the Radius. Appeared as discussion paper Nr. 81. SFB 373 (Quantification and Simulation of Economic Processes), Humboldt University, Berlin; also available under doi:10.18452/3638.

See Also

roptest, robest, roblox, L2ParamFamily-class, UncondNeighborhood-class, RiskType-class

Examples

## load the package; this also attaches its dependencies such as distrMod
library(ROptEst)

#############################
## 1. Binomial data
#############################
## generate a sample of contaminated data
set.seed(123)
ind <- rbinom(100, size=1, prob=0.05)
x <- rbinom(100, size=25, prob=(1-ind)*0.25 + ind*0.9)

## ML-estimate
MLE.bin <- MLEstimator(x, BinomFamily(size = 25))
## compute optimally robust estimators
OMSE.bin <- OMSEstimator(x, BinomFamily(size = 25), steps = 3)
MBRE.bin <- MBREstimator(x, BinomFamily(size = 25), steps = 3)
estimate(MLE.bin)
estimate(MBRE.bin)
estimate(OMSE.bin)

  ## to reduce time load at CRAN tests
RMXE.bin <- RMXEstimator(x, BinomFamily(size = 25), steps = 3)
OBRE.bin <- OBREstimator(x, BinomFamily(size = 25), steps = 3)
estimate(RMXE.bin)
estimate(OBRE.bin)

  ## to reduce time load at CRAN tests
#############################
## 2. Poisson data
#############################

## Example: Rutherford-Geiger (1910); cf. Feller (1968), Section VI.7 (a)
x <- c(rep(0, 57), rep(1, 203), rep(2, 383), rep(3, 525), rep(4, 532),
       rep(5, 408), rep(6, 273), rep(7, 139), rep(8, 45), rep(9, 27),
       rep(10, 10), rep(11, 4), rep(12, 0), rep(13, 1), rep(14, 1))

## ML-estimate
MLE.pois <- MLEstimator(x, PoisFamily())
OBRE.pois <- OBREstimator(x, PoisFamily(), steps = 3)
OMSE.pois <- OMSEstimator(x, PoisFamily(), steps = 3)
MBRE.pois <- MBREstimator(x, PoisFamily(), steps = 3)
RMXE.pois <- RMXEstimator(x, PoisFamily(), steps = 3)
estimate(MLE.pois)
estimate(OBRE.pois)
estimate(RMXE.pois)
estimate(MBRE.pois)
estimate(OMSE.pois)


 ## to reduce time load at CRAN tests
#############################
## 3. Normal (Gaussian) location and scale
#############################
## 24 determinations of copper in wholemeal flour
library(MASS)
data(chem)

MLE.n <- MLEstimator(chem, NormLocationScaleFamily())
MBRE.n <- MBREstimator(chem, NormLocationScaleFamily(), steps = 3)
OMSE.n <- OMSEstimator(chem, NormLocationScaleFamily(), steps = 3)
OBRE.n <- OBREstimator(chem, NormLocationScaleFamily(), steps = 3)
RMXE.n <- RMXEstimator(chem, NormLocationScaleFamily(), steps = 3)

estimate(MLE.n)
estimate(MBRE.n)
estimate(OMSE.n)
estimate(OBRE.n)
estimate(RMXE.n)
