robustHD (version 0.8.1)

setupCoefPlot: Set up a coefficient plot of a sequence of regression models

Description

Extract the relevant information for a plot of the coefficients for a sequence of regression models, such as submodels along a robust or groupwise least angle regression sequence, or sparse least trimmed squares regression models for a grid of values for the penalty parameter.

Usage

setupCoefPlot(object, ...)

# S3 method for seqModel
setupCoefPlot(object, zeros = FALSE, labels = NULL, ...)

# S3 method for tslars
setupCoefPlot(object, p, ...)

# S3 method for sparseLTS
setupCoefPlot(
  object,
  fit = c("reweighted", "raw", "both"),
  zeros = FALSE,
  labels = NULL,
  ...
)

Value

An object inheriting from class "setupCoefPlot" with the following components:

coefficients

a data frame containing the following columns:

fit

the model fit for which the coefficient is computed (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).

lambda

the value of the penalty parameter for which the coefficient is computed (only returned for the "sparseLTS" method).

step

the step along the sequence for which the coefficient is computed.

df

the degrees of freedom of the submodel along the sequence for which the coefficient is computed.

coefficient

the value of the coefficient.

variable

a character string specifying to which variable the coefficient belongs.

abscissa

a character string specifying available options for what to plot on the \(x\)-axis.

lambda

a numeric vector giving the values of the penalty parameter (only returned for the "sparseLTS" method).

step

an integer vector containing the steps for which submodels along the sequence have been computed.

df

an integer vector containing the degrees of freedom of the submodels along the sequence (i.e., the number of estimated coefficients; only returned for the "seqModel" method).

includeLabels

a logical indicating whether information on labels for the variables should be included in the plot.

labels

a data frame containing the following columns (not returned if information on labels is suppressed):

fit

the model fit for which the coefficient is computed (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).

lambda

the smallest value of the penalty parameter (only returned for the "sparseLTS" method).

step

the last step along the sequence.

df

the degrees of freedom of the last submodel along the sequence.

coefficient

the value of the coefficient.

label

the label of the corresponding variable to be displayed in the plot.

facets

default faceting formula for the plots (only returned if both estimators are requested in the "sparseLTS" method).
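The components listed above can be inspected directly on the returned object. The following is a minimal sketch, not part of the package documentation: the simulated data and the object names `fit` and `setup` are illustrative assumptions, and only components documented in this section are accessed.

```r
library("robustHD")

# small simulated data set (illustrative assumption, not from the package)
set.seed(1234)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- c(x[, 1:3] %*% rep(1, 3) + 0.5 * rnorm(n))

# fit a robust LARS sequence and extract the plotting information
fit <- rlars(x, y, sMax = 5)
setup <- setupCoefPlot(fit)

names(setup)              # components of the returned object
head(setup$coefficients)  # data frame with one coefficient per row
setup$step                # steps for which submodels were computed
setup$df                  # degrees of freedom along the sequence
```

Because `coefficients` is an ordinary data frame, it can also be passed to other plotting or summary tools if `coefPlot` does not cover a particular need.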

Arguments

object

the model fit from which to extract information.

...

additional arguments to be passed down.

zeros

a logical indicating whether predictors that never enter the model and thus have zero coefficients should be included in the plot (TRUE) or omitted (FALSE, the default). This is useful if the number of predictors is much larger than the number of observations, in which case many coefficients remain zero throughout the sequence.

labels

an optional character vector containing labels for the predictors. Information on labels can be suppressed by setting this to NA.

p

an integer giving the lag length for which to extract information (the default is to use the optimal lag length).

fit

a character string specifying for which estimator to extract information. Possible values are "reweighted" (the default) for the reweighted fits, "raw" for the raw fits, or "both" for both estimators.
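As a rough illustration of the `zeros` and `labels` arguments described above, the sketch below calls the "seqModel" method with non-default settings. The simulated data and the object names (`fit`, `setupAll`, `setupNoLab`) are assumptions made for this example only.

```r
library("robustHD")

# small simulated data set (illustrative assumption, not from the package)
set.seed(1234)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- c(x[, 1:3] %*% rep(1, 3) + 0.5 * rnorm(n))

# robust LARS sequence
fit <- rlars(x, y, sMax = 5)

# keep predictors that never enter the model (zero coefficients throughout)
setupAll <- setupCoefPlot(fit, zeros = TRUE)

# suppress the information on variable labels
setupNoLab <- setupCoefPlot(fit, labels = NA)
setupNoLab$includeLabels  # indicates whether labels are included in the plot

# draw the coefficient plot from either setup object
coefPlot(setupAll)
```

With `zeros = FALSE` (the default), only predictors that enter the model at some step appear in the plot, which tends to keep it readable when there are many candidate predictors.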

Author

Andreas Alfons

See Also

coefPlot, rlars, grplars, rgrplars, tslarsP, rtslarsP, tslars, rtslars, sparseLTS

Examples

## generate data
# example is not high-dimensional to keep computation time low
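library("robustHD")  # provides rlars(), sparseLTS(), setupCoefPlot(), coefPlot()
library("mvtnorm")   # provides rmvnorm()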
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# extract information for plotting
setup <- setupCoefPlot(fitRlars)
coefPlot(setup)


## sparse LTS over a grid of values for lambda
# fit model
frac <- seq(0.2, 0.05, by = -0.05)
fitSparseLTS <- sparseLTS(x, y, lambda = frac, mode = "fraction")
# extract information for plotting
setup1 <- setupCoefPlot(fitSparseLTS)
coefPlot(setup1)
setup2 <- setupCoefPlot(fitSparseLTS, fit = "both")
coefPlot(setup2)
