Learn R Programming

Matrix (version 0.999375-42)

lm.fit.sparse: Fitter Function for Sparse Linear Models

Description

The basic computing engine for sparse linear least squares regression.

Note that the exact interface (arguments, return value) currently is experimental, and is bound to change. Use at your own risk!

Usage

lm.fit.sparse(x, y, w = NULL, offset = NULL,
              method = c("qr", "cholesky"),
              tol = 1e-7, singular.ok = TRUE, order = NULL,
              transpose = FALSE)

Arguments

x
sparse design matrix of dimension n * p, i.e., an Robject of a class extending dsparseMatrix; typically the result of
y
vector of observations of length n, or a matrix with n rows.
w
vector of weights (length n) to be used in the fitting process. Weighted least squares is used with weights w, i.e., sum(w * e^2) is minimized.

Not yet implemented !

offset
numeric of length n). This can be used to specify an a priori known component to be included in the linear predictor during fitting.
method
a character string specifying the (factorization) method. Currently, "qr" or "cholesky".
tol
[for back-compatibility only; unused:] tolerance for the qr decomposition. Default is 1e-7.
singular.ok
[for back-compatibility only; unused:] logical. If FALSE, a singular model is an error.
order
integer or NULL, for method == "qr", will determine how the fill-reducing ordering (aka permutation) for the symbolic part is determined (in cs_amd()), with the options {0: natural}, {1:
transpose
logical; if true, use the transposed matrix t(x) instead of x.

Value

  • Either a single numeric vector or a list of four numeric vectors.

See Also

sparse.model.matrix; the non-sparse function in standard R's package stats: lm.fit().

Examples

Run this code
dd <- expand.grid(a = as.factor(1:3),
                  b = as.factor(1:4),
                  c = as.factor(1:2),
                  d= as.factor(1:8))
n <- nrow(dd <- dd[rep(seq_len(nrow(dd)), each = 10), ])
set.seed(17)
dM <- cbind(dd, x = round(rnorm(n), 1))
## randomly drop some
n <- nrow(dM <- dM[- sample(n, 50),])
dM <- within(dM, { A <- c(2,5,10)[a]
                   B <- c(-10,-1, 3:4)[b]
                   C <- c(-8,8)[c]
                   D <- c(10*(-5:-2), 20*c(0, 3:5))[d]
   Y <- A + B + A*B + C + D + A*D + C*x + rnorm(n)/10
   wts <- sample(1:10, n, replace=TRUE)
   rm(A,B,C,D)
})
str(dM) # 1870 x 7

X <- sparse.model.matrix( ~ (a+b+c+d)^2 + c*x, data = dM)
dim(X) # 1870 x 69
X[1:10, 1:20]

## For now, use  'Matrix:::'  --- TODO : export once interface is clear!

Xd <- as(X,"matrix")
system.time(fmDense <- lm.fit(Xd, y = dM[,"Y"]))
system.time( r1 <- Matrix:::lm.fit.sparse(X, y = dM[,"Y"]) ) # *is* faster
stopifnot(all.equal(r1, unname(fmDense$coeff), tol = 1e-12))
system.time(
     r2 <- Matrix:::lm.fit.sparse(X, y = dM[,"Y"], method = "chol") )
stopifnot(all.equal(r1, r2$coef, tol = 1e-12),
          all.equal(fmDense$residuals, r2$residuals, tol=1e-9)
         )
## with weights:
system.time(fmD.w <- with(dM, lm.wfit(Xd, Y, w = wts)))
system.time(fm.w1 <- with(dM, Matrix:::lm.fit.sparse(X, Y, w = wts)))
system.time(fm.w2 <- with(dM, Matrix:::lm.fit.sparse(X, Y, w = wts,
                                                     method = "chol") ))
stopifnot(all.equal(fm.w1, unname(fmD.w$coeff), tol = 1e-12),
          all.equal(fm.w2$coef, fm.w1, tol = 1e-12),
          all.equal(fmD.w$residuals, fm.w2$residuals, tol=1e-9)
          )

Run the code above in your browser using DataLab