Hmisc (version 5.0-1)

princmp: princmp

Description

Enhanced Output for Principal and Sparse Principal Components

Usage

princmp(
  formula,
  data = environment(formula),
  method = c("regular", "sparse"),
  k = min(5, p),
  kapprox = min(5, k),
  cor = TRUE,
  offset = 0.8,
  col = 1,
  adj = 0,
  scoef = TRUE,
  orig = TRUE,
  pl = TRUE,
  ylim = NULL,
  add = FALSE,
  sw = FALSE,
  nvmax = 5
)

Value

a k-column matrix of principal component scores, with NAs in the rows where the input data had an NA. If k=1, the result is a vector.
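
A minimal calling sketch, assuming the Hmisc package is installed. The data frame `d` and its columns `x1`, `x2`, `x3` are made up for illustration. Only the `formula`, `data`, and `k` arguments from the Usage section are used. Under the Value description above, `scores` would be expected to have 40 rows and 2 columns, though other Hmisc versions may return something different.

```r
# Hypothetical data; d, x1, x2 and x3 are illustrative names only.
set.seed(3)
d <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # One-sided formula (no left hand side), as the formula argument requires
  scores <- Hmisc::princmp(~ x1 + x2 + x3, data = d, k = 2)
  print(dim(scores))
} else {
  message("Hmisc is not installed; skipping the princmp call")
}
```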

Arguments

formula

a formula with no left hand side, or a numeric matrix

data

a data frame or table. By default variables come from the calling environment.

method

specifies whether regular or sparse principal components are computed

k

the number of components to plot, display, and return

kapprox

the number of components to approximate with stepwise regression when sw=TRUE

cor

set to FALSE to compute PCs on the original data scale, which is useful if all variables have the same units of measurement

offset

controls positioning of text labels for cumulative fraction of variance explained

col

color of plotted text

adj

angle for plotting text

scoef

set to FALSE to not print coefficients (loadings) of standardized variables

orig

set to FALSE to not show coefficients on the original scale

pl

set to FALSE to not make the scree plot

ylim

y-axis plotting limits, a 2-vector

add

set to TRUE to add to an existing plot

sw

set to TRUE to run stepwise regression PC prediction/approximation

nvmax

maximum number of predictors to allow in stepwise regression PC approximations
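
The effect of the cor argument can be illustrated with base R's princomp, which the Details section names as the function called for method='regular'. This is a sketch on simulated data, not output from princmp. The names `X`, `pc_cor`, `pc_cov`, and `prop` are made up here. When variables are on very different scales, analysis on the original scale is dominated by the variable with the largest variance, which is why cor=FALSE is mainly useful when all variables share the same units.

```r
# Two roughly uncorrelated variables on very different scales
set.seed(2)
X <- cbind(a = rnorm(100, sd = 1), b = rnorm(100, sd = 100))

pc_cor <- princomp(X, cor = TRUE)    # variables standardized first
pc_cov <- princomp(X, cor = FALSE)   # original data scale

# Helper: proportion of variance explained by each component
prop <- function(p) p$sdev^2 / sum(p$sdev^2)

print(round(prop(pc_cor), 3))  # variance shared between the two components
print(round(prop(pc_cov), 3))  # first component driven almost entirely by b
```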

Author

Frank Harrell

Details

Expands any categorical predictors into indicator variables, then calls princomp (when method='regular', the default) to compute ordinary principal components, or sPCAgrid in the pcaPP package (when method='sparse') to compute lasso-penalized sparse principal components. By default, observations with an NA on any variable in formula are removed, and all variables are then scaled by their standard deviations. Loadings of the standardized variables are printed unless scoef=FALSE, and loadings on the original data scale are also printed if orig=TRUE. If pl=TRUE, a scree plot is drawn, with text added to indicate the cumulative proportions of variance explained. If sw=TRUE, the regsubsets function in the leaps package is used to approximate the PCs by forward stepwise regression, with the original variables as individual predictors.
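
The steps described above for method='regular' can be sketched with base R alone. This is an illustration under stated assumptions, not the princmp source: the data frame `d`, the indicator matrix `X`, and the vector `cumvar` are made-up names, and model.matrix is used here as one way to build indicator variables, which may differ from how princmp does it internally.

```r
# Simulated data with one factor and one missing value
set.seed(1)
d <- data.frame(
  x1 = rnorm(50),
  x2 = rnorm(50),
  g  = factor(sample(c("a", "b", "c"), 50, replace = TRUE))
)
d$x1[3] <- NA

# Remove observations with an NA on any variable
dc <- d[complete.cases(d), ]

# Expand the factor into indicator columns; drop the intercept column
X <- model.matrix(~ x1 + x2 + g, data = dc)[, -1, drop = FALSE]

# Regular PCA on the correlation matrix, i.e. on standardized variables
pc <- princomp(X, cor = TRUE)

# Cumulative proportion of variance explained, as annotated on a scree plot
cumvar <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
print(round(cumvar, 3))

# Loadings of the standardized variables
print(unclass(pc$loadings))
```

Calling screeplot(pc) on the result would draw a basic scree plot; the cumulative-variance text labels that princmp adds are not reproduced in this sketch.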