
stabs (version 0.6-4)

plot.stabsel: Plot and Print Methods for Stability Selection

Description

Display results of stability selection.

Usage

# S3 method for stabsel
plot(x, main = deparse(x$call), type = c("maxsel", "paths"),
     xlab = NULL, ylab = NULL, col = NULL, ymargin = 10, np = sum(x$max > 0),
     labels = NULL, ...)
# S3 method for stabsel
print(x, decreasing = FALSE, print.all = TRUE, ...)

Arguments

x

object of class stabsel.

main

main title for the plot.

type

plot type; either stability paths ("paths") or a plot of the maximum selection frequency ("maxsel").

xlab, ylab

labels for the x- and y-axis of the plot. By default, sensible labels are chosen depending on the type of the plot.

col

a vector of colors; typically, one can specify a single color or one color for each variable. By default, colors depend on the maximal selection frequency of the variable and range from grey to red.

ymargin

(temporarily) specifies the y margin of the plot in lines (see argument "mar" of function par). This only affects the right margin for type = "paths" and the left margin for type = "maxsel". Margins explicitly specified by the user are kept and are not overwritten.

np

number of variables to plot for the maximum selection frequency plot (type = "maxsel"); the first np variables with highest selection frequency are plotted.

labels

variable labels for the plot; one label per variable / effect must be specified. By default, the names of x$max are used.

decreasing

logical. Should the selection frequencies be printed in descending order (TRUE) or in ascending order (FALSE)?

print.all

logical. Should all selection frequencies be displayed or only those that are greater than zero?

...

additional arguments to plot and print functions.

Value

An object of class stabsel with a special print method. The object has the following elements:

phat

selection probabilities.

selected

elements with maximal selection probability greater than cutoff.

max

maximum of selection probabilities.

cutoff

cutoff used.

q

average number of selected variables used.

PFER

per-family error rate.

sampling.type

the sampling type used for stability selection.

assumption

the assumptions made on the selection probabilities.

call

the call.

Details

This function implements the stability selection procedure by Meinshausen and Buehlmann (2010) and the improved error bounds by Shah and Samworth (2013).

Two of the three arguments cutoff, q and PFER must be specified. The per-family error rate (PFER), i.e., the expected number of false positives \(E(V)\), where \(V\) is the number of false positives, is bounded by the argument PFER.
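
To illustrate why fixing two of the three quantities determines the third, here is a minimal base-R sketch. It assumes the original bound of Meinshausen and Buehlmann (2010), \(E(V) \le q^2 / ((2 \cdot cutoff - 1) \, p)\) for p candidate variables and a cutoff in (0.5, 1]. The helper names mb_pfer, mb_q and mb_cutoff are hypothetical and not part of stabs; the improved bounds of Shah and Samworth (2013) are tighter and will generally give different values.

```r
## Sketch only: relations implied by the Meinshausen-Buehlmann (2010) bound
##   E(V) <= q^2 / ((2 * cutoff - 1) * p)
## The helper names below are hypothetical and not part of the stabs package.

## upper bound on the PFER for given cutoff, q and number of variables p
mb_pfer <- function(cutoff, q, p) {
    stopifnot(cutoff > 0.5, cutoff <= 1, q > 0, p > 0)
    q^2 / ((2 * cutoff - 1) * p)
}

## largest integer q that keeps the bound at or below PFER
mb_q <- function(cutoff, PFER, p) {
    stopifnot(cutoff > 0.5, cutoff <= 1, PFER > 0, p > 0)
    floor(sqrt(PFER * (2 * cutoff - 1) * p))
}

## smallest cutoff for which the bound equals PFER
mb_cutoff <- function(q, PFER, p) {
    stopifnot(q > 0, PFER > 0, p > 0)
    cutoff <- 0.5 * (1 + q^2 / (PFER * p))
    if (cutoff > 1)
        stop("no cutoff <= 1 satisfies the bound for these values of q and PFER")
    cutoff
}

## usage: nine candidate variables, cutoff = 0.75 and PFER = 1
p <- 9
q <- mb_q(cutoff = 0.75, PFER = 1, p = p)
mb_pfer(cutoff = 0.75, q = q, p = p)   # bound actually attained with this q
mb_cutoff(q = q, PFER = 1, p = p)      # cutoff that would exhaust PFER = 1
```

Because q is rounded down to an integer, the attained bound can lie below the requested PFER.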

Because controlling the PFER is more conservative than controlling the family-wise error rate (FWER), the procedure also controls the FWER: the probability of selecting at least one non-influential variable (or model component) is at most PFER.
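
The step from the PFER to the FWER can be made explicit with Markov's inequality, since \(V\) is a non-negative count. The second inequality holds whenever the assumptions behind the chosen error bound are met:

\[ P(V \ge 1) \le E(V) \le \mathrm{PFER} \]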

References

B. Hofner, L. Boccuto and M. Goeker (2015), Controlling false discoveries in high-dimensional situations: Boosting with stability selection. BMC Bioinformatics, 16:144. doi:10.1186/s12859-015-0575-3.

N. Meinshausen and P. Buehlmann (2010), Stability selection. Journal of the Royal Statistical Society, Series B, 72, 417--473.

R.D. Shah and R.J. Samworth (2013), Variable selection with error control: another look at stability selection. Journal of the Royal Statistical Society, Series B, 75, 55--80.

See Also

stabsel

Examples

# NOT RUN {
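  library("stabs")

  ## set seed first so that the simulated fallback data are reproducible, too
  set.seed(1234)

  if (require("TH.data")) {
      ## make data set available
      data("bodyfat", package = "TH.data")
  } else {
      ## simulate some data if TH.data is not available.
      ## Note that results are nonsense with this data.
      bodyfat <- matrix(rnorm(720), nrow = 72, ncol = 10)
  }

  ####################################################################
  ### using stability selection with Lasso methods:

  if (require("lars")) {
      ## column 2 is used as response, all other columns as predictors
      (stab.lasso <- stabsel(x = bodyfat[, -2], y = bodyfat[, 2],
                             fitfun = lars.lasso, cutoff = 0.75,
                             PFER = 1))
      ## save the graphical parameters so that they can be restored
      opar <- par(no.readonly = TRUE)
      ## stack the two plots vertically
      par(mfrow = c(2, 1))
      plot(stab.lasso, ymargin = 6)
      ## widen the right margin so that the labels of the paths fit
      par(mai = par("mai") * c(1, 1, 1, 2.7))
      plot(stab.lasso, type = "paths")
      ## restore the original graphical parameters
      par(opar)
  }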
# }
