Reduce the dimensionality of a DSM by linear projection of its row vectors into a lower-dimensional subspace. Various projection methods with different properties are available.
dsm.projection(model, n,
method = c("svd", "rsvd", "asvd", "ri", "ri+svd"),
oversampling = NA, q = 2, rate = .01, power = 1,
with.basis = FALSE, verbose = FALSE)
A numeric matrix with n columns (latent dimensions) and the same number of rows as the original DSM. Some SVD-based algorithms may discard poorly conditioned singular values, returning fewer than n columns.
If with.basis=TRUE and an orthogonal projection is used, the corresponding orthogonal basis \(B\) of the latent subspace is returned as an attribute "basis". \(B\) is column-orthogonal, hence \(B^T\) projects into latent coordinates and \(B B^T\) is an orthogonal subspace projection in the original coordinate system.
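The following minimal sketch (my own illustration, not part of the original documentation) checks these properties on the example data set DSM_GoodsMatrix, assuming that with the default power=1 the latent coordinates equal \(M B\):
library(wordspace)
M <- DSM_GoodsMatrix[, 1:3]                      # 240 nouns x 3 dimensions
S <- dsm.projection(M, 2, with.basis=TRUE)       # default method is "svd"
B <- attr(S, "basis")                            # basis of the latent subspace
round(crossprod(B), 6)                           # t(B) %*% B should be the identity matrix
all.equal(M %*% B, S, check.attributes=FALSE)    # M B should reproduce the latent coordinates (power=1)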
For orthogonal projections, the attribute "R2" contains a numeric vector specifying the proportion of the squared Frobenius norm of the original matrix captured by each of the latent dimensions. If the original matrix has been centered (so that an SVD projection is equivalent to PCA), this corresponds to the proportion of variance “explained” by each dimension.
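As a hedged sketch of how this can be read (my own example, assuming that for an SVD projection each latent dimension contributes the square of its singular value to the squared Frobenius norm):
library(wordspace)
M <- DSM_GoodsMatrix[, 1:3]
S <- dsm.projection(M, 2, method="svd")
attr(S, "R2")                    # proportion captured by each latent dimension
attr(S, "sigma")^2 / sum(M^2)    # should match: sigma_i^2 relative to the squared Frobenius norm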
For SVD-based projections, the attribute "sigma" contains the singular values corresponding to the latent dimensions. They can be used to adjust the power scaling exponent at a later time.
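For illustration, a small sketch (my own, under the assumption that the default projection corresponds to power=1, so dividing each column by its singular value yields the whitened power=0 coordinates):
library(wordspace)
M <- DSM_GoodsMatrix[, 1:3]
S1 <- dsm.projection(M, 2, power=1)                  # default scaling
S0 <- dsm.projection(M, 2, power=0)                  # whitened coordinates
S0.manual <- sweep(S1, 2, attr(S1, "sigma"), "/")    # rescale later using the stored singular values
all.equal(S0.manual, S0, check.attributes=FALSE)     # should be TRUE if the assumption holds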
model: either an object of class dsm, or a dense or sparse numeric matrix
method: projection method to use for dimensionality reduction (see the list of available algorithms below)
n: an integer specifying the number of target dimensions. Use n=NA to generate as many latent dimensions as possible (i.e. the minimum of the number of rows and columns of the DSM matrix); see the sketch after this argument list for an illustration.
oversampling: oversampling factor for the stochastic dimensionality reduction algorithms (rsvd, asvd, ri+svd). If unspecified, the default value is 2 for rsvd, 10 for asvd and 10 for ri+svd (subject to change).
q: number of power iterations in the randomized SVD algorithm (Halko et al. 2009 recommend q=1 or q=2)
rate: fill rate of the random projection vectors. Each random dimension has on average rate * ncol(model) nonzero components in the original space.
power: apply power scaling after an SVD-based projection, i.e. multiply each latent dimension by a suitable power of the corresponding singular value. The default power=1 corresponds to a regular orthogonal projection. For power \(> 1\), the first SVD dimensions, i.e. those capturing the main patterns of \(M\), are given more weight; for power \(< 1\), they are given less weight. The setting power=0 results in a full equalization of the dimensions and is also known as “whitening” in the PCA case.
with.basis: if TRUE, also return the orthogonal basis of the subspace as an attribute of the reduced matrix (not available for the random indexing methods)
verbose: if TRUE, some methods display progress messages during execution
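Below is the sketch referred to under the n argument (my own illustration): request all available latent dimensions with n=NA and inspect the cumulative R2 values to decide how many dimensions to keep.
library(wordspace)
M <- DSM_GoodsMatrix[, 1:3]
S <- dsm.projection(M, n=NA)    # up to min(nrow, ncol) = 3 latent dimensions
ncol(S)
cumsum(attr(S, "R2"))           # cumulative proportion of the squared Frobenius norm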
Stephanie Evert (https://purl.org/stephanie.evert)
The following dimensionality reduction algorithms can be selected with the method argument:
svd: singular value decomposition (SVD), using the efficient SVDLIBC algorithm (Berry 1992) from package sparsesvd if the input is a sparse matrix. If the DSM has been scored with scale="center", this method is equivalent to principal component analysis (PCA).
rsvd: randomized SVD (Halko et al. 2009, p. 9) based on a factorization of rank oversampling * n with q power iterations.
asvd: approximate SVD, which determines latent dimensions from a random sample of oversampling * n matrix rows. This heuristic algorithm is highly inaccurate and has been deprecated.
ri: random indexing (RI), i.e. a projection onto random basis vectors that are approximately orthogonal. Basis vectors are generated by setting a proportion of rate elements randomly to \(+1\) or \(-1\). Note that this does not correspond to a proper orthogonal projection, so the resulting coordinates in the reduced space should be used with caution.
ri+svd: RI to oversampling * n dimensions, followed by an SVD of the pre-reduced matrix to the final n dimensions. This is not a proper orthogonal projection because the RI basis vectors in the first step are only approximately orthogonal.
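The following sketch (my own, not taken from the package examples) calls several of these methods on a synthetic matrix; the closeness of the randomized approximation is only what one would typically expect, not a guarantee.
library(wordspace)
set.seed(42)
M <- matrix(rnorm(200 * 50), nrow=200,
            dimnames=list(paste0("row", 1:200), paste0("dim", 1:50)))
S.svd  <- dsm.projection(M, 10, method="svd")
S.rsvd <- dsm.projection(M, 10, method="rsvd", q=2)
sum(attr(S.svd, "R2"))      # proportion of the squared Frobenius norm captured by exact SVD
sum(attr(S.rsvd, "R2"))     # the randomized SVD approximation is usually close
S.ri <- dsm.projection(M, 10, method="ri", rate=0.1)   # approximate random projection
dim(S.ri)                   # same shape, but no orthogonality guarantees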
Berry, Michael W. (1992). Large scale sparse singular value computations. International Journal of Supercomputer Applications, 6, 13-49.
Halko, N., Martinsson, P. G., and Tropp, J. A. (2009). Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions. Technical Report 2009-05, ACM, California Institute of Technology.
rsvd for the implementation of randomized SVD, and sparsesvd for the SVDLIBC wrapper
# 240 English nouns in space with correlated dimensions "own", "buy" and "sell"
M <- DSM_GoodsMatrix[, 1:3]
# SVD projection into 2 latent dimensions
S <- dsm.projection(M, 2, with.basis=TRUE)
100 * attr(S, "R2") # dim 1 captures 86.4% of distances
round(attr(S, "basis"), 3) # dim 1 = commodity, dim 2 = owning vs. buying/selling
S[c("time", "goods", "house"), ] # some latent coordinates
if (FALSE) { # not run automatically: plot nouns in the 2-dimensional latent space
idx <- DSM_GoodsMatrix[, 4] > .85 # only show nouns on "fringe"
plot(S[idx, ], pch=20, col="red", xlab="commodity", ylab="own vs. buy/sell")
text(S[idx, ], rownames(S)[idx], pos=3)
}