Compute a low-rank matrix factorisation \( \min_{\mathbf A, \mathbf B} || (\mathbf X - \mathbf A \mathbf{B}^\top ) \circ \mathbf W ||_\mathrm F \) subject to weights \(\mathbf W\) (set to \(0\) where \(\mathbf X\) is not defined) and constraints on rows of \(\mathbf{A}, \mathbf{B}\).
Solve the weighted multivariate least squares problem \( \min_\mathbf{B} || (\mathbf X - \mathbf A \mathbf{B}^\top) \circ \mathbf W ||_\mathrm F \) subject to constraints on rows of \(\mathbf B\).
This is not a public interface. Subject to change without further notice. Please do not call from outside albatross.
cmf(
X, nfac = 1,
const = list(list(const = "nonneg"), list(const = "nonneg")),
start = c("svd", "random"), ctol = 1e-04, maxit = 10
)
# S3 method for cmf
fitted(object, ...)
wcmls(X, A, W, ..., struc = NULL)
A list of class cmf containing the \(\mathbf A, \mathbf B\) matrices.
The \(\mathbf B\) matrix solving the constrained weighted multivariate least squares problem.
A matrix reconstructed from its nfac-rank decomposition.
The matrix for a low-rank approximation.
The rank of the factorisation; the number of columns in matrices \(\mathbf A, \mathbf B\).
Constraints on the two matrices: a list of two lists of arguments to pass to wcmls when computing the corresponding matrix.
A cmf object to take the starting values from.
Alternatively, a string:
"svd": Compute a truncated SVD \( \mathbf X \approx \mathbf U \, \mathrm{diag}(\sigma_1, \dots, \sigma_k) \, \mathbf{V}^\top \). Use \( \mathbf A = \mathbf U \, \mathrm{diag}(\sqrt{\sigma_1}, \dots, \sqrt{\sigma_k}) \), \( \mathbf B = \mathbf V \, \mathrm{diag}(\sqrt{\sigma_1}, \dots, \sqrt{\sigma_k}) \) as the starting values.
"random": Use uniformly distributed nonnegative starting values rescaled to be of comparable norms.
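The SVD starting values described above can be sketched in base R. This is only an illustration on a small complete matrix, not the code used inside cmf; the variable names are made up for the example, and how undefined entries of \(\mathbf X\) are handled before the SVD is not covered here.

```r
# Illustrative sketch only: SVD-based starting values for a rank-k model.
set.seed(1)
X <- matrix(runif(20), 5, 4)  # small complete matrix (assumption: no NAs)
k <- 2                        # plays the role of nfac

s <- svd(X, nu = k, nv = k)   # keep the first k singular vectors
d <- sqrt(s$d[seq_len(k)])    # square roots of the k largest singular values

A <- s$u %*% diag(d, k)       # A = U diag(sqrt(sigma))
B <- s$v %*% diag(d, k)       # B = V diag(sqrt(sigma))

# A %*% t(B) reproduces the rank-k truncated SVD of X
Xk <- s$u %*% diag(s$d[seq_len(k)], k) %*% t(s$v)
```

Splitting each singular value evenly between the two factors gives matching columns of \(\mathbf A\) and \(\mathbf B\) the same norm.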
Given \(L = || (\mathbf X - \mathbf A \mathbf{B}^\top ) \circ \mathbf W ||_\mathrm F\), stop when \( \frac{|\Delta L|}{L} \le \mathtt{ctol} \).
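As a rough illustration of the stopping rule, the weighted loss and the relative-change test might be written as follows. The helper names wloss and converged are hypothetical, and dividing by the newer loss value is an assumption; the text above does not say which of the two values is used.

```r
# Weighted Frobenius-norm loss; W is 0 where X is not defined.
wloss <- function(X, A, B, W) {
  R <- (X - A %*% t(B)) * W
  R[W == 0] <- 0              # NA * 0 is NA in R, so zero those cells explicitly
  sqrt(sum(R^2))
}

# Relative-change test (hypothetical helper; the denominator is an assumption).
converged <- function(L_old, L_new, ctol = 1e-4) {
  abs(L_new - L_old) / L_new <= ctol
}

X <- matrix(c(1, 2, NA, 4), 2, 2)   # one undefined entry
W <- ifelse(is.na(X), 0, 1)         # weights: 0 where X is missing
A <- matrix(c(1, 2), 2, 1)
B <- matrix(c(1, 1), 2, 1)
L <- wloss(X, A, B, W)              # only the three defined cells contribute
```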
Iteration number limit.
An object of class cmf.
The predictor matrix in the weighted multivariate least squares problem.
The weights matrix.
Passed to cmls.
Ignored.
The CMLS package function cmls can solve constrained multivariate least squares problems of the form:
$$ \min_\mathbf{B} || \mathbf X - \mathbf A \mathbf B ||_\mathrm F = L(\mathbf X, \mathbf A, \mathbf B) $$
We use it to solve a weighted problem. Let \(\mathbf X, \mathbf W\) be \((m \times n)\) matrices, \(\mathbf A\) be an \((m \times k)\) matrix, \(\mathbf B\) be an \((n \times k)\) matrix, \(\mathbf{J}_{p,q}\) be a \((p \times q)\) matrix of ones:
$$ \min_\mathbf{B} || \mathbf W \circ (\mathbf X - \mathbf A \mathbf B^\top) ||_\mathrm F^2 = \min_\mathbf{B} \sum_{i,j} ( w_{i,j} x_{i,j} - w_{i,j} \mathbf{a}_{i,\cdot} \mathbf{b}_{j,\cdot}^\top )^2 = {} $$ $$ {} = \min_\mathbf{B} \sum_j || \mathbf{w}_{\cdot,j} \circ \mathbf{x}_{\cdot,j} - ( (\mathbf{w}_{\cdot,j} \mathbf{J}_{1,k}) \circ \mathbf A ) \mathbf{b}_{j,\cdot}^\top ||_\mathrm F^2 = \min_\mathbf{B} \sum_j L( \mathbf{w}_{\cdot,j} \circ \mathbf{x}_{\cdot,j}, (\mathbf{w}_{\cdot,j} \mathbf{J}_{1,k}) \circ \mathbf A, \mathbf{b}_{j,\cdot}^\top )^2 $$
Here, \(\mathbf{w}_{\cdot,j}\) and \(\mathbf{x}_{\cdot,j}\) are columns of \(\mathbf W\) and \(\mathbf X\), while \(\mathbf{a}_{i,\cdot}\) and \(\mathbf{b}_{j,\cdot}\) are rows of \(\mathbf A\) and \(\mathbf B\), respectively. Thus, in the weighted case, the \(\mathbf B\) matrix is determined row by row by calling the cmls function for the pre-processed \(\mathbf A\) matrix and the columns of \(\mathbf X\).
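The row-by-row scheme can be illustrated without any constraints. The sketch below uses base R's qr.solve as an unconstrained stand-in for cmls (an assumption made so that the example runs without the CMLS package), and drops zero-weight rows, which has the same effect as weighting them by \(0\).

```r
# Illustrative sketch: weighted least squares for B, one row at a time.
set.seed(42)
m <- 6; n <- 4; k <- 2
A <- matrix(runif(m * k), m, k)
B_true <- matrix(runif(n * k), n, k)
X <- A %*% t(B_true)               # noise-free data, so B is recoverable
W <- matrix(1, m, n)
W[1, 2] <- 0; X[1, 2] <- NA        # one undefined cell with zero weight

B <- matrix(NA_real_, n, k)
for (j in seq_len(n)) {
  w <- W[, j]
  keep <- w > 0                            # zero-weight rows contribute nothing
  Aw <- A[keep, , drop = FALSE] * w[keep]  # (w_j J_{1,k}) o A: scale rows of A
  xw <- X[keep, j] * w[keep]               # w_j o x_j
  B[j, ] <- qr.solve(Aw, xw)               # unconstrained stand-in for cmls
}
```

Because each column of \(\mathbf X\) has its own weights, the pre-processed \(\mathbf A\) differs from column to column, which is why the rows of \(\mathbf B\) are solved for separately.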
The problem we're actually interested in is a low-rank approximation of \(\mathbf X\). It doesn't have a unique solution, especially if the rank is more than \(1\), unless we apply constraints and some luck. We solve it by starting with (typically) SVD and refining the solution with alternating least squares until it satisfies the constraints. Each iteration solves the two subproblems \( \min_\mathbf{B} || (\mathbf X - \mathbf A \mathbf{B}^\top) \circ \mathbf W ||_\mathrm F \) and \( \min_\mathbf{A} || (\mathbf{X}^\top - \mathbf B \mathbf{A}^\top) \circ \mathbf{W}^\top ||_\mathrm F \).
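A minimal sketch of the alternating scheme, in base R and without constraints, might look as follows. It is not the cmf implementation: wls_rows and wloss are hypothetical helpers, qr.solve stands in for the constrained cmls solver, and filling undefined cells with \(0\) before the SVD is an assumption made only to get a starting point.

```r
# Unconstrained stand-in for wcmls: solve for B row by row.
wls_rows <- function(X, A, W) {
  B <- matrix(0, ncol(X), ncol(A))
  for (j in seq_len(ncol(X))) {
    keep <- W[, j] > 0
    w <- W[keep, j]
    B[j, ] <- qr.solve(A[keep, , drop = FALSE] * w, X[keep, j] * w)
  }
  B
}

# Weighted Frobenius-norm loss, skipping undefined cells.
wloss <- function(X, A, B, W) {
  R <- (X - A %*% t(B)) * W
  R[W == 0] <- 0
  sqrt(sum(R^2))
}

set.seed(7)
m <- 8; n <- 6; k <- 2
X <- matrix(runif(m * k), m, k) %*% t(matrix(runif(n * k), n, k))
X <- X + matrix(rnorm(m * n, sd = 0.01), m, n)   # small noise
W <- matrix(1, m, n); W[2, 3] <- 0; X[2, 3] <- NA

# SVD start (assumption: undefined cells are filled with 0 for the SVD only)
X0 <- X; X0[W == 0] <- 0
s <- svd(X0, nu = k, nv = k)
d <- sqrt(s$d[seq_len(k)])
A <- s$u %*% diag(d, k)
B <- s$v %*% diag(d, k)

ctol <- 1e-4; maxit <- 10
L0 <- L <- wloss(X, A, B, W)
for (it in seq_len(maxit)) {
  B <- wls_rows(X, A, W)           # min over B with A fixed
  A <- wls_rows(t(X), B, t(W))     # min over A with B fixed (transposed problem)
  L_new <- wloss(X, A, B, W)
  done <- abs(L - L_new) / L_new <= ctol
  L <- L_new
  if (done) break
}
```

Each half-step minimises the same weighted loss exactly for one factor, so the loss cannot increase from one iteration to the next.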
albatross:::.Rdreference('deJuan2014')
cmls; the ALS package.
data(feems)
z <- feemscatter(feems$a, rep(25, 4), 'omit')
str(zf <- albatross:::cmf(unclass(z)))
str(albatross:::fitted.cmf(zf))