Estimates a matrix X for which:
$$(A+\epsilon_A)X = B+\epsilon_B$$
minimizing \(\sum{\epsilon_A^2 + \epsilon_B^2}\), subject to
$$\sum_i{X_{i,j}}=1 \quad \forall j$$
$$X>0$$
The elements of \(\epsilon_A\) are zero where the corresponding elements of A are zero. A typically contains biomarker concentrations for several taxonomic groups, and B field measurements of the same biomarkers. X is then an estimate of the taxonomic composition of the field sample.
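As a rough illustration of the error-free part of this model, the base R sketch below builds a small A and X and derives the B they imply. All numbers and names (`m1`, `taxon1`, `s1`, ...) are made up for illustration, and the orientation (taxa in the columns of A, samples in the columns of B and X) is assumed from the argument descriptions.

```r
# Hypothetical biomarker matrix: 3 biomarkers (rows) x 2 taxonomic groups (columns)
A <- matrix(c(1.0, 0.5, 0.0,
              0.0, 0.3, 0.8),
            nrow = 3,
            dimnames = list(c("m1", "m2", "m3"), c("taxon1", "taxon2")))

# Hypothetical composition: 2 taxonomic groups (rows) x 2 samples (columns)
X <- matrix(c(0.25, 0.75,
              0.60, 0.40),
            nrow = 2,
            dimnames = list(c("taxon1", "taxon2"), c("s1", "s2")))

# Constraints on X: positive entries, each sample (column) sums to 1
stopifnot(all(X > 0), isTRUE(all.equal(unname(colSums(X)), c(1, 1))))

# Field data implied by A X = B when both error terms are zero
B <- A %*% X
B
```

In practice the direction is reversed: A and B are measured, and X (together with an adjusted A) is what gets estimated.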
tlsce(A, B, Wa=NULL, Wb=NULL, minA=NULL, maxA=NULL,
A_init=A, Xratios=TRUE, ...)
A: a matrix or data frame. If A contains biomarker data for taxonomic groups, the biomarkers have to be organized per row, and the taxonomic groups per column.
B: a matrix or data frame. If B contains biomarker field data, the biomarkers have to be organized per row, and the samples per column.
Wa: weighting of A, a matrix with the same dimensions as A. If Wa=NULL, Wa defaults to 1. This parameter can be used to give more importance to individual elements of A, or to A as a whole, relative to B. Weights are implemented as proportional to \(1/s\) (as opposed to \(1/s^2\)), with s the standard deviation of the error term.
Wb: weighting of B, a matrix with the same dimensions as B. If Wb=NULL, Wb defaults to 1. This parameter can be used to give more importance to individual elements of B, or to B as a whole, relative to A. Weights are implemented as proportional to \(1/s\) (as opposed to \(1/s^2\)), with s the standard deviation of the error term.
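The \(1/s\) convention can be sketched in base R as follows. The standard deviation matrices `sdA` and `sdB` are hypothetical inputs, not part of the package. The point is that a weight of \(1/s\) applied to a residual before squaring amounts to a factor \(1/s^2\) on the squared residual.

```r
# Hypothetical standard deviations of the error terms, same dimensions as A and B
sdA <- matrix(c(0.10, 0.05, 0.20, 0.10), nrow = 2)
sdB <- matrix(c(0.02, 0.04, 0.02, 0.04), nrow = 2)

# Weights proportional to 1/s (not 1/s^2)
Wa <- 1 / sdA
Wb <- 1 / sdB

# A residual multiplied by 1/s and then squared equals the squared residual over s^2
res <- 0.03
weighted_sq <- (res * Wa[1, 1])^2
stopifnot(isTRUE(all.equal(weighted_sq, res^2 / sdA[1, 1]^2)))

# With real data of matching dimensions these would be passed as:
# tlsce(A, B, Wa = Wa, Wb = Wb)
```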
minA: minimum values for A
maxA: maximum values for A
A_init: a matrix with the same structure as A. A general non-linear optimization routine (default nlminb) is used to minimize the sum of squared residuals of A versus the fitted matrix A_fit (see value). This optimization routine requires a set of starting values, by default the non-zero elements of A. This provides a good fit, but when in doubt about the convergence of the algorithm, one can provide different starting values for the optimization routine in A_init.
Xratios: TRUE or FALSE: do the columns of the matrix X sum to 1 (colSums(X) equal to 1)? This is, for example, the case in a compositional matrix, and applies only if A and B are both expressed relative to the unit of biomass. If Xratios=TRUE, A has pigment concentrations per biomass unit, B has pigment concentrations per biomass unit per sample, and X contains ratios of biomass unit per sample. If Xratios=FALSE, A has pigment concentrations per biomass unit, B has pigment concentrations per sample, and X has biomass units per sample.
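The two interpretations of X can be related with a short base R sketch. It assumes that the total biomass of a sample is the sum of the biomass of its taxonomic groups; the matrix `X_abs` is hypothetical, and the code is not taken from the package.

```r
# Hypothetical X in biomass units per sample (the Xratios = FALSE interpretation):
# 2 taxonomic groups (rows) x 2 samples (columns)
X_abs <- matrix(c(2, 6,
                  3, 1),
                nrow = 2,
                dimnames = list(c("taxon1", "taxon2"), c("s1", "s2")))

# Dividing each column by its total gives biomass ratios per sample
# (the Xratios = TRUE interpretation), whose columns sum to 1
X_rat <- sweep(X_abs, 2, colSums(X_abs), "/")

colSums(X_rat)
```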
...: arguments to be passed to lsei() or to modFit()
A list with the following elements:
Array with dimension c(ncol(A), ncol(B), iter) containing the species composition of each sample
Array with same dimension as A, containing the best-fit values of the input biomarker data per taxonomic group
Array with same dimension as B, containing the biomarker field data, corresponding to Afit
a vector of 3 values: the value of the minimised quadratic function at the solution, in this case \(\sum{((Afit-A)*Wa)^2 + (Bfit-B)^2}\), and the shares of this value attributed to A and to B
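The quadratic function can be evaluated by hand for a toy case, as in the base R sketch below. The matrices are hypothetical, the formula is taken as written above, and the split is reported here as absolute sums; whether the function itself returns the A and B parts as absolute values or as fractions is not stated here.

```r
# Hypothetical input and fitted matrices
A     <- matrix(c(1.0, 0.5, 0.0, 0.8), nrow = 2)
A_fit <- matrix(c(0.9, 0.6, 0.0, 0.8), nrow = 2)
B     <- matrix(c(0.50, 0.40), nrow = 2)
B_fit <- matrix(c(0.45, 0.40), nrow = 2)
Wa    <- 1   # default weighting of A

# Weighted squared deviations of A and squared deviations of B
ss_A <- sum(((A_fit - A) * Wa)^2)
ss_B <- sum((B_fit - B)^2)

# Total and the parts attributable to A and to B
SS <- c(total = ss_A + ss_B, A = ss_A, B = ss_B)
SS
```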
An integer code. '0' indicates successful convergence.
Instead of a linear least squares regression, in which the elements of A would be fixed, the function tlsce includes the non-zero elements of A in the least squares regression. This is similar to other total least squares regression methods (also called orthogonal regression), with the main difference that only non-zero elements of A contain an error term.
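One way to picture "only non-zero elements of A contain an error term" is to treat those elements as a parameter vector and keep the zeros fixed. The base R sketch below shows that bookkeeping only; `build_A` is a hypothetical helper, and this is not the package's implementation.

```r
# Hypothetical biomarker matrix with structural zeros
A <- matrix(c(1.0, 0.5, 0.0,
              0.0, 0.3, 0.8),
            nrow = 3)

# Positions that carry an error term, and their starting values
free <- A != 0
par0 <- A[free]

# Rebuild a candidate A from a parameter vector; zero elements stay exactly zero
build_A <- function(par, template, free) {
  out <- template
  out[free] <- par
  out
}

# Example: perturb the free elements by 10%
A_try <- build_A(par0 * 1.1, A, free)
A_try
```

An optimizer would then vary only the free elements, which is why the starting values default to the non-zero elements of A.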
Van den Meersche, K., K. Soetaert and J.J. Middelburg (2008) A Bayesian compositional estimator for microbial taxonomy based on biomarkers, Limnology and Oceanography Methods 6, 190-199
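library(BCE)  # provides tlsce() and the bceInput example data
# transpose so that biomarkers are in rows, taxonomic groups (A) or samples (B) in columns
A <- t(bceInput$Rat)
B <- t(bceInput$Dat)
tlsce(A, B)
## weighting Wa inversely proportional to A
tlsce(A, B, Wa = 1/A)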