Learn R Programming

profileModel (version 0.6.1)

profileModel: Get the profiles of arbitrary objectives for arbitrary `glm'-like models

Description

Calculates the profiles of arbitrary objectives (inference functions in the terminology of Lindsay and Qu, 2003) for the parameters of arbitrary glm-like models with linear predictor. It provides a variety of options such as profiling over a pre-specified grid, profiling until the profile of the objective reaches the values of a quantile, calculating the profile traces along with the profiled objectives, and others.

Usage

profileModel(fitted, gridsize = 20, stdn = 5, stepsize = 0.5,
             grid.bounds = NULL, quantile = NULL,
             objective = stop("'objective' is missing."),
             agreement = TRUE, verbose = TRUE, trace.prelim = FALSE,
             which = 1:length(coef(fitted)), profTraces = TRUE,
             zero.bound = 1e-08, scale = FALSE,
             stdErrors = NULL, ...)

prelim.profiling(fitted, quantile = qchisq(0.95, 1), objective = stop("'objective' is missing."), verbose = TRUE, which = 1:length(coef(fitted)), stepsize = 0.5, stdn = 5, agreement = TRUE, trace.prelim = FALSE, stdErrors = NULL, ...)

profiling(fitted, grid.bounds, gridsize = 20, verbose = TRUE, objective = stop("'objective' is missing."), agreement = TRUE, which = 1:length(coef(fitted)), profTraces = TRUE, zero.bound = 1e-08, ...)

Arguments

fitted

a glm-like fitted object with linear predictor (see Details for the methods that have to be supported by fitted).

which

which parameters should be profiled? Has to be a vector of integers for profiling and prelim.profiling but for profileModel it could also be a vector of parameter names. The default is 1:length(coef(fitted)), i.e. all the parameters estimated in fitted.

grid.bounds

a matrix of dimension length(which) by 2 or a 2*length(which) vector that specifies the range of values in which profiling takes place for each parameter. It has to be set for profiling and the default is NULL for profileModel

gridsize

The number of equidistant parameter values to be taken between the values specified in the entries of grid.bounds.

stepsize

a positive integer that is used in prelim.profiling to penalize the size of the steps taken to the left and to the right of the estimate. The default value is 0.5.

stdn

in profileModel, the number of estimated standard deviations to move left or right from the estimated parameter value, when both quantile and grid.bounds are NULL. In prelim.profiling, stdn/stepsize is the maximum number of steps that are taken to the left and to the right of the estimate. The default value of stdn is 5 (see Details).

quantile

a quantile, indicating the range that the profile must cover. The default value in profileModel is NULL and in prelim.profiling, qchisq(0.95,1) (see Details).

objective

the function to be profiled. It is a function of the restricted fitted object and other arguments (see objectives). It should be of class function for profiling and prelim.profiling but it could also be a character string to be matched for profileModel.

agreement

logical indicating whether the fitting method used for fitting agrees with the specified objective, i.e. whether the objective is minimized at coef(fitted). The default is TRUE.

verbose

logical. If TRUE (default) progress indicators are printed during the profiling progress.

trace.prelim

logical. If TRUE the preliminary iteration is traced. The default is FALSE.

profTraces

logical indicating whether the profile traces should be returned. The default is TRUE.

zero.bound

a small positive constant. The difference of the objective at the restricted fit from the objective at fitted takes value zero if it is smaller than zero.bound. zero.bound is only used when agreement=TRUE and the default value is 1e-08.

scale

logical. The default is FALSE. Currently has no effect. Only available in profileModel.

stdErrors

The vector estimated asymptotic standard errors reported from the fitting procedure. The default is NULL (see Details).

further arguments passed to the specified objective.

Value

profiling returns a list of profiles, with one named component for each parameter profiled. Each component of the list contains the profiled parameter values and the corresponding differences of the objective at the restricted fit from the objective at fitted. When profTraces=TRUE the corresponding profile traces are cbind'ed to each component of the list.

prelim.profiling returns a list with components intersects and grid.bounds.

profileModel returns an object of class "profileModel" that has the attribute includes.traces corresponding to the value of the profTraces argument. The "profileModel" object is a list of the following components:

profiles

the result of profiling.

fit

the fitted object that was passed to profileModel.

quantile

the quantile that was passed to profileModel.

gridsize

the gridsize that was passed to profileModel.

intersects

if quantile=NULL then intersects=NULL else intersects is as for prelim.profiling.

profiled.parameters

a vector of integers indicating which parameters were profiled.

profiled.objective

the profiled objective with any additional arguments passed through evaluated.

isNA

a logical vector indicating which of the parameters in which were NA in fitted.

agreement

the agreement that was passed to profileModel.

zero.bound

the zero.bound that was passed to profileModel.

grid.bounds

the grid bounds that were used for profiling.

call

the matched call.

Details

fitted has to be an object which supports the method coef and which has fitted$terms with the same meaning as, for example, in lm and glm (see also terms). coef(fitted) has to be a vector of coefficients with each component corresponding to a column of the model matrix returned by

mf <- model.frame(fitted$terms,data=eval(fitted$call$data)) ; model.matrix(fitted$terms,mf,contrasts = fitted$contrasts)

(or just model.matrix(fitted), for fitted objects that support the model.matrix method.)

Exception to this are objects returned by BTm of the BradleyTerry package, where some special handling of the required objects takes place.

Note that any or both of data and contrasts could be NULL. This depends whether the data argument has been supplied to the procedure and whether fitted$contrast exists.

The fitting procedure that resulted fitted has to support offset in formula. Also, fitted$call has to be the call that generated fitted.

If the fitting procedure that resulted fitted supports an etastart argument (see glm) and fitted$linear.predictor contains the estimated linear predictors then during profiling, the appropriate starting values are supplied to the fitting procedure. In this way, the iteration is accelerated and is more stable, numerically. However, it is not necessary that etastart is supported. In the latter case no starting values are supplied to the fitting procedure during profiling.

Support for a summary method is optional. summary is only used for obtaining the estimated asymptotic standard errors associated to the coefficients in fitted. If stdErrors=NULL the standard errors are taken to be summary(fitted)$coefficients[,2] which is the place where the estimated asymptotic standard errors usually are for glm-like objects. If this this is not the case then stdErrors should be set appropriately.

profiling is the workhorse function that does the basic operation of profiling objectives over a user-specified grid of values. For a given parameter \(\beta\), the restricted fit \(F_{\beta=b}\) is calculated by constraining \(\beta\) to a point \(b\) of the grid. Then the difference

$$D(F_{\beta=b}) = P(F_{\beta=b}) - P(F_0),$$

is calculated, where \(P\) is the objective specified by the user and \(G\) is the original fit (fitted). For convex objectives that are minimized at the estimates of \(G\) (see agreement), \(D(G)=0\).

prelim.profiling refers only to convex objectives and searches for and returns the grid bounds (grid.bounds) for each profiled parameter that should be used in order the profile to cover quantile. For a given parameter \(\beta\), prelim.profiling also checks whether such enclosure can be found and returns a logical matrix intersects of dimension length(which) by 2 that indicates if the profile covers the quantile to the left and to the right of the estimate in fitted. At step i of the search a value \(b_i\) is proposed for \(\beta\) and \(D(F_{\beta=b_i})\) is calculated. If \(D(F_{\beta=b_i})<q\), where \(q\) is quantile, the next proposed value is

$$b_{i+1} = b_{i} \pm (i+1) C \min(s,30)/|L| ,$$

where \(C\) is stepsize, \(s\) is the estimated asymptotic standard error of \(\beta\) from \(G\) and \(L\) is the slope of the line segment connecting the points \((b_i, D(F_{\beta=b_i}))\) and \((b_{i-1}, D(F_{\beta=b_{i-1}}))\). \(\pm\) is \(+\) if the search is on the right of the estimate of \(\beta\) and \(-\) on the left. If an increase of \(D\) is expected then the step slows down. If \(|L|<1\) then \(|L|\) is set to 1 and if \(|L|>500\) then \(|L|\) is set to 500. In this way the iteration is conservative by avoiding very small steps but not over-conservative by avoiding very large steps.

If the maximum number of steps stdn/stepsize (call this \(M\)) was taken and the quantile was not covered by the profile but the three last absolute slopes where positive then the iteration is restarted form \(b_{M-1}\) with \(2C\) instead of \(C\) in the step calculation. If the three last slopes were less than 1e-8 in absolute value then the iteration stops and it is considered that \(D\) has an asymptote at the corresponding direction (left or right). Note that when the latter takes place the iteration has already moved \(6 C\min(s,30)\) units on the scale of \(\beta\), since the first value of \(b\) were a slope of 1e-8 in absolute value was detected. Thus we could safely say that an asymptote has been detected and avoid calculation of \(F_{beta=b}\) for extremely large \(b\)'s.

Very small values of stepsize make prelim.profiling take very small steps with the effect of slowing down the execution time. Large values of stepsize are only recommended when the estimated asymptotic standard errors are very small in fitted.

profileModel is a wrapper function that collects and combines the capabilities of profiling and prelim.profiling by providing a unified interface for their functions, as well as appropriateness checks on the arguments. When both quantile and grid.bounds are NULL then profiling is called and profiling takes place for stdn estimated asymptotic standard errors on the left and on the right of the estimates in fitted. This could be used for taking a quick look of the profiles around the estimate. With only the quantile being NULL, profiling is performed on the the specified grid of values. When quantile is specified and grid.bounds is NULL, prelim.profiling is called and its result is passed to profiling. If both quantile and grid.bounds then grid.bounds prevails and profiling is performed on the specified grid.

References

Lindsay, B. G. and Qu, A. (2003). Inference functions and quadratic score tests. Statistical Science 18, 394--410.

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Chapman \& Hall/CRC.

See Also

confintModel, plot.profileModel.

Examples

Run this code
# NOT RUN {
## Begin Example 1
library(MASS)
m1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = Insurance, family = poisson)
# profile deviance +-5 estimated standard errors from the estimate
prof0 <- profileModel(m1, objective = "ordinaryDeviance")
# profile deviance over a grid of values
gridd <- rep(c(-1,1), length(coef(m1)))
prof1 <- profileModel(m1, grid.bounds = gridd,
                      objective = "ordinaryDeviance")
# profile deviance until the profile reaches qchisq(0.95,1)
prof2 <- profileModel(m1, quantile = qchisq(0.95,1) ,
                      objective = "ordinaryDeviance")
# plot the profiles of the deviance
plot(prof2)
# quite quadratic in shape. Just to make sure:
plot(prof2, signed = TRUE)
# Ok straight lines. So we expect first order asymptotics to work well;
# }
# NOT RUN {
# plot the profiles of the Rao score statistic
# profile Rao's score statistic
prof3 <- update(prof2, objective = "RaoScoreStatistic",
                X = model.matrix(m1))
plot(prof3)
# The 95% confidence intervals based on prof2 and prof3 and the simple Wald
# confidence intervals:
profConfint(prof2)
profConfint(prof3)
stdErrors <- coef(summary(m1))[,2]
coef(m1)+ qnorm(0.975) * cbind(-stdErrors,stdErrors)
# They are all quite similar in value. The result of a quadratic likelihood.
## End Example
# }
# NOT RUN {
## Begin Example 2: Monotone likelihood; data separation;
library(MASS)
y <- c(0, 0, 1, 0)
tots <- c(2, 2, 5, 2)
x1 <- c(1, 0, 1, 0)
x2 <- c(1, 1, 0, 0)
m2 <- glm(y/tots ~ x1 + x2, weights = tots,
          family = binomial)
prof <- profileModel(m2, quantile=qchisq(0.95,1),
                     objective = "ordinaryDeviance")
plot(prof)
profConfint(prof)
# profile.glm fails to detect the finite endpoints
confint(m2)
## End Example
# }
# NOT RUN {
## Begin Example 3: polr
library(MASS)
options(contrasts = c("contr.treatment", "contr.poly"))
house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
prof.plr0 <- profileModel(house.plr, objective = function(fm) fm$deviance)
plot(prof.plr0)
# do it with a quantile
prof.plr1 <- update(prof.plr0, quantile = qchisq(0.95, 1))
plot(prof.plr1)
## End Example
# }

Run the code above in your browser using DataLab