
Rdimtools (version 1.0.4)

do.lasso: Least Absolute Shrinkage and Selection Operator

Description

LASSO is a widely used regularization scheme for linear regression that encourages sparsity in the coefficient vector. It can serve as a feature selection method: given the regularization parameter \(\lambda\), it solves $$\textrm{min}_{\beta} ~ \frac{1}{2}\|X\beta-y\|_2^2 + \lambda \|\beta\|_1$$ where \(y\) is the response, and then takes the indices of the estimated coefficients with the largest magnitudes as the meaningful features.
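The two steps in the description (solve the penalized problem, then keep the largest coefficients) can be sketched in base R. This is only an illustration under stated assumptions and not the solver Rdimtools uses internally. The helpers `soft_threshold` and `lasso_cd` are hypothetical names, the solver is plain coordinate descent, and building the projection from columns of the identity matrix is an assumption about how the selected indices map to a \((p\times ndim)\) basis.

```r
# Soft-thresholding operator: shrinks z toward zero by gamma
soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

# Hypothetical helper: coordinate descent for
#   min_beta 0.5 * ||X beta - y||_2^2 + lambda * ||beta||_1
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  p      <- ncol(X)
  beta   <- rep(0, p)
  col_ss <- colSums(X^2)          # squared norm of each column
  r      <- as.numeric(y)         # residual y - X %*% beta (beta starts at 0)
  for (it in seq_len(n_iter)) {
    for (j in seq_len(p)) {
      if (col_ss[j] == 0) next
      # inner product of column j with the residual that excludes column j
      rho    <- sum(X[, j] * r) + col_ss[j] * beta[j]
      new_bj <- soft_threshold(rho, lambda) / col_ss[j]
      r      <- r - X[, j] * (new_bj - beta[j])
      beta[j] <- new_bj
    }
  }
  beta
}

# Toy data: only columns 2 and 5 carry signal
set.seed(1)
n <- 50; p <- 6
X <- matrix(rnorm(n * p), n, p)
y <- 3 * X[, 2] - 2 * X[, 5] + rnorm(n, sd = 0.1)

beta <- lasso_cd(X, y, lambda = 1)

# Keep the ndim coefficients with the largest magnitude
ndim       <- 2
featidx    <- order(abs(beta), decreasing = TRUE)[seq_len(ndim)]
projection <- diag(p)[, featidx, drop = FALSE]  # assumed indicator-style basis
Y          <- X %*% projection                  # same as X[, featidx]
```

With this kind of projection, the embedding simply consists of the selected columns of `X`, which is why feature selection can be treated as a special case of linear dimension reduction.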

Usage

do.lasso(
  X,
  response,
  ndim = 2,
  preprocess = c("null", "center", "scale", "cscale", "whiten", "decorrelate"),
  ycenter = FALSE,
  lambda = 1
)

Arguments

X

an \((n\times p)\) matrix whose rows are observations and columns represent independent variables.

response

a length-\(n\) vector of response values.

ndim

an integer-valued target dimension.

preprocess

an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.

ycenter

a logical; TRUE to center the response variable, FALSE otherwise.

lambda

sparsity regularization parameter in \((0,\infty)\).
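To build intuition for how `lambda` controls sparsity, consider the special case where the columns of \(X\) are orthonormal. The LASSO objective then has a closed-form minimizer: each coefficient is the soft-thresholded inner product \(x_j^\top y\). The sketch below uses that special case only; the data and the helper `soft_threshold` are made up for illustration and say nothing about how `do.lasso` solves the general problem.

```r
# Soft-thresholding operator: shrinks z toward zero by gamma
soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

set.seed(42)
# Design matrix with orthonormal columns, obtained from a QR decomposition
Q <- qr.Q(qr(matrix(rnorm(100 * 5), 100, 5)))

# Noise-free response generated from known coefficients
beta_true <- c(4, -2, 1.5, 0.5, 0)
y <- as.numeric(Q %*% beta_true)

# With orthonormal columns, crossprod(Q, y) recovers the least-squares fit
z <- as.numeric(crossprod(Q, y))

# Count nonzero LASSO coefficients for increasing lambda
lambdas <- c(0.1, 1, 3)
nnz <- sapply(lambdas, function(l) sum(soft_threshold(z, l) != 0))
print(data.frame(lambda = lambdas, nonzero = nnz))
```

Larger values of `lambda` zero out more coefficients, so fewer features remain as candidates for selection.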

Value

a named list containing

Y

an \((n\times ndim)\) matrix whose rows are embedded observations.

featidx

a length-\(ndim\) vector of indices of the features with the highest scores.

trfinfo

a list containing information for out-of-sample prediction.

projection

a \((p\times ndim)\) matrix whose columns form the basis for projection.

References

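Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.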

Examples

# NOT RUN {
library(Rdimtools)
## generate a swiss roll with auxiliary dimensions,
## following the reference example from the LSIR paper.
set.seed(1)
n = 123
theta = runif(n)
h     = runif(n)
t     = (1+2*theta)*(3*pi/2)
X     = array(0,c(n,10))
X[,1] = t*cos(t)
X[,2] = 21*h
X[,3] = t*sin(t)
X[,4:10] = matrix(runif(7*n), nrow=n)

## corresponding response vector
y = sin(5*pi*theta)+(runif(n)*sqrt(0.1))

## try different regularization parameters
out1 = do.lasso(X, y, lambda=0.1)
out2 = do.lasso(X, y, lambda=1)
out3 = do.lasso(X, y, lambda=10)

## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, main="LASSO::lambda=0.1")
plot(out2$Y, main="LASSO::lambda=1")
plot(out3$Y, main="LASSO::lambda=10")
par(opar)

# }
