gmodels (version 2.18.1.1)

fast.prcomp: Efficient computation of principal components and singular value decompositions.

Description

The standard prcomp and svd functions are very inefficient for wide matrices. fast.prcomp and fast.svd are modified versions that remain efficient even for very wide matrices.

Usage

fast.prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL)
fast.svd(x, nu = min(n, p), nv = min(n, p), ...)

Value

See the documentation for prcomp or svd.
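As a quick orientation, the sketch below lists the components returned by the base functions prcomp and svd. It assumes, as the text above implies, that fast.prcomp and fast.svd return objects with the same layout; the matrix m is made up for illustration.

```r
# Illustrative only: inspect the return values of the base functions
# that fast.prcomp and fast.svd are documented to mirror.
set.seed(1)
m <- matrix(rnorm(10 * 4), nrow = 10, ncol = 4)  # made-up test data

pr_names <- names(prcomp(m))  # sdev, rotation, center, scale, x
sv_names <- names(svd(m))     # d, u, v

print(pr_names)
print(sv_names)
```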

Arguments

x

data matrix

retx, center, scale., tol

See the documentation for prcomp.

nu, nv, ...

See the documentation for svd.

Author

Modifications by Gregory R. Warnes greg@warnes.net

Details

The current implementation of the function svd in S-Plus and R is much slower when operating on a matrix with a large number of columns than on the transpose of this matrix, which has a large number of rows. As a consequence, prcomp, which uses svd, is also very slow when applied to matrices with a large number of columns.

For R, the simple solution is to use La.svd instead of svd. A suitable patch to prcomp has been submitted. In the meantime, the function fast.prcomp has been provided as a short-term workaround.

For S-Plus, the solution is to replace the standard svd with a version that checks the dimensions of the matrix and performs the computation on the transposed matrix if it is wider than it is tall.

For R:

fast.prcomp

is a modified version of prcomp that calls La.svd instead of svd

fast.svd

is simply a wrapper around La.svd.
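The idea of computing principal components through La.svd can be sketched as follows. This is a minimal illustration, not the gmodels source: the helper name pca_via_la_svd is hypothetical, it always centers and never scales, and it omits the tol and retx handling that prcomp provides.

```r
# Hypothetical helper (not the gmodels implementation): principal
# components obtained from La.svd applied to the centered data matrix.
pca_via_la_svd <- function(x) {
  x <- scale(as.matrix(x), center = TRUE, scale = FALSE)
  s <- La.svd(x, nu = 0)  # La.svd returns d and vt (v already transposed)
  list(
    sdev     = s$d / sqrt(max(1, nrow(x) - 1)),  # component standard deviations
    rotation = t(s$vt),                          # columns are the loadings
    x        = x %*% t(s$vt)                     # scores of the observations
  )
}

set.seed(1)
m <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)  # made-up test data
p_sketch <- pca_via_la_svd(m)
p_ref    <- prcomp(m)

# Standard deviations should agree with prcomp up to rounding error
sdev_diff <- max(abs(p_sketch$sdev - p_ref$sdev))
print(sdev_diff)
```

Only the standard deviations are compared here, because the sign of each loading vector is arbitrary and may differ between implementations.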

For S-Plus:

fast.prcomp

is a modified version of prcomp that calls fast.svd instead of svd

fast.svd

checks the dimensions of the matrix. When it is wider than tall, it transposes the input matrix and calls svd. It then swaps u and v and returns the result. Otherwise, it just calls svd and returns the results unchanged.
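The transpose-and-swap approach described for S-Plus can be sketched in R as below. The function name svd_via_transpose is hypothetical and the code is an illustration of the technique under that assumption, not the package's own implementation; it relies on the identity that if t(x) = V D U', then x = U D V'.

```r
# Hypothetical helper (not the gmodels source): run svd on the transpose
# when the input is wider than tall, then swap u and v.
svd_via_transpose <- function(x) {
  x <- as.matrix(x)
  if (ncol(x) > nrow(x)) {
    s <- svd(t(x))                    # decompose the tall matrix t(x)
    list(d = s$d, u = s$v, v = s$u)   # swap factors so they describe x
  } else {
    svd(x)                            # already tall or square: unchanged
  }
}

set.seed(1)
wide <- matrix(rnorm(5 * 40), nrow = 5, ncol = 40)  # made-up wide matrix
s <- svd_via_transpose(wide)

# The factors should reconstruct the original matrix up to rounding error
recon   <- s$u %*% diag(s$d) %*% t(s$v)
max_err <- max(abs(wide - recon))
print(max_err)
```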

See Also

Examples

  library(gmodels)  # provides fast.svd and fast.prcomp

  # create a wide test matrix (50 rows, 2000 columns)
  set.seed(4943546)
  nr <- 50
  nc <- 2000
  x  <- matrix( rnorm( nr*nc), nrow=nr, ncol=nc )
  tx <- t(x)

  # SVD directly on matrix is SLOW:
  system.time( val.x <- svd(x)$u )

  # SVD on t(matrix) is FAST:
  system.time( val.tx <- svd(tx)$v )

  # and the results are equivalent:
  max( abs(val.x) - abs(val.tx) )

  # Time gap disappears using fast.svd:
  system.time( val.x <- fast.svd(x)$u )
  system.time( val.tx <- fast.svd(tx)$v )
  max( abs(val.x) - abs(val.tx) )


  library(stats)

  # prcomp directly on matrix is SLOW:
  system.time( pr.x <- prcomp(x) )

  # fast.prcomp is much faster:
  system.time( fast.pr.x <- fast.prcomp(x) )

  # and the results are equivalent
  max( pr.x$sdev - fast.pr.x$sdev )
  max( abs(pr.x$rotation[,1:49]) - abs(fast.pr.x$rotation[,1:49]) )
  max( abs(pr.x$x) - abs(fast.pr.x$x)  )

  # (except for the last and least significant component):
  max( abs(pr.x$rotation[,50]) - abs(fast.pr.x$rotation[,50]) )