A principal component analysis is done in the Aitchison geometry (i.e. on the clr-transform) of the simplex. Some gimmicks simplify the interpretation of the computed components as compositional perturbations.
# S3 method for acomp
princomp(x,...,scores=TRUE,center=attr(covmat,"center"),
covmat=var(x,robust=robust,giveCenter=TRUE),
robust=getOption("robust"))
# S3 method for princomp.acomp
print(x,...)
# S3 method for princomp.acomp
plot(x,y=NULL,..., npcs=min(10,length(x$sdev)),
type=c("screeplot","variance","biplot","loadings","relative"),
main=NULL,scale.sdev=1)
# S3 method for princomp.acomp
predict(object,newdata,...)
princomp gives an object of type c("princomp.acomp","princomp") with the following content:
the standard deviation of the principal components
the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class "loadings". The last eigenvector is removed since it should contain the irrelevant scaling.
the clr-transformed vector of means used to center the dataset
the acomp vector of means used to center the dataset
the scaling applied to each variable
number of observations
if scores = TRUE, the scores of the supplied data on the principal components. Scores are coordinates in a basis given by the principal components and thus not compositions
the matched call
not clearly understood
compositions that represent a perturbation with the vectors represented by the loadings of each of the factors
compositions that represent a perturbation with the inverse of the vectors represented by the loadings of each of the factors
predict returns a matrix of scores of the observations in the newdata dataset.
The other routines are mainly called for their side effect of plotting or printing and return the object x.
an acomp dataset (in princomp) or a result from princomp.acomp
not used
a logical indicating whether scores should be computed or not
the number of components to be drawn in the scree plot
type of the plot: "screeplot" is a lined screeplot, "variance" is a boxplot-like screeplot, "biplot" is a biplot, "loadings" displays the loadings as a barplot.acomp
the multiple of sigma to use when plotting the loadings
title of the plot
a fitted princomp.acomp object
another compositional dataset of class acomp
further arguments to pass to internally-called functions
provides the covariance matrix to be used for the principal component analysis
provides the center to be used for the computation of scores
gives the robustness type for the calculation of the covariance matrix. See robustnessInCompositions for details.
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
As a metric Euclidean space, the Aitchison simplex has its own
principal component analysis, which should be performed in terms of the
covariance matrix and not in terms of the meaningless correlation
matrix.
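The idea above can be sketched in base R without the package. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, and `clr_manual` is a hypothetical helper standing in for clr.

```r
# Minimal base-R sketch of a clr-based PCA (illustrative; not the package code).
set.seed(1)

# Hypothetical 3-part compositions: positive amounts closed to sum 1
raw <- matrix(exp(rnorm(60)), ncol = 3,
              dimnames = list(NULL, c("A", "B", "C")))
comp <- raw / rowSums(raw)

# Centered log-ratio transform: log of each part minus the row mean of logs
clr_manual <- function(x) {
  lx <- log(x)
  lx - rowMeans(lx)
}
z <- clr_manual(comp)

# Eigen-decomposition of the covariance (not correlation) matrix
eig <- eigen(cov(z), symmetric = TRUE)
sdev <- sqrt(pmax(eig$values, 0))

# Each clr row sums to zero, so the last eigenvalue is numerically zero
print(round(sdev, 6))
```

Because every clr-transformed row sums to zero, the covariance matrix is singular and the last component carries no information, which is why it is dropped from the result.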
To aid the interpretation, we added some extra functionality to a
normal princomp(clr(x)). First of all, the result contains as
additional information the compositional representation of the
returned vectors in the space of the data: the center as a composition
(Center), and the loadings in terms of a composition to perturb
with, either positively (Loadings) or negatively (DownLoadings). The Up- and
DownLoadings are normalized to the number of parts in the simplex,
and not to one, to simplify the interpretation. A value of about one
means no change in the specific component. To avoid confusion, the
meaningless last principal component is removed.
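A small base-R sketch of this normalization, assuming it amounts to an inverse clr rescaled to the number of parts. The loading vector `v` and the helper `to_composition` are hypothetical and only illustrate the idea; the package may compute these quantities differently.

```r
# Hypothetical clr loading vector for a 3-part composition (sums to zero)
v <- c(A = 0.5, B = -0.1, C = -0.4)

# Inverse clr: exponentiate and close, then rescale so the total equals
# the number of parts D instead of 1
to_composition <- function(clr_vec) {
  e <- exp(clr_vec)
  length(e) * e / sum(e)
}

up   <- to_composition(v)    # composition to perturb with "positively"
down <- to_composition(-v)   # composition to perturb with "negatively"

print(round(up, 3))
print(round(down, 3))

# A zero loading vector maps to all ones, i.e. no change in any part
print(to_composition(c(A = 0, B = 0, C = 0)))
```

With this scaling, parts above one are increased by the perturbation and parts below one are decreased, relative to the others.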
The plot routine provides screeplots (type = "s", type = "v"), biplots
(type = "b"), plots of the effect of loadings (type = "l") in
scale.sdev*sdev-spread, and loadings of pairwise (log-)ratios (type = "r").
The interpretation of a screeplot does not differ from ordinary
screeplots. It shows the eigenvalues of the covariance matrix, which
represent the portions of variance explained by the principal
components.
The interpretation of the biplot strongly differs from a classical one.
The relevant variables are not the arrows drawn (one for each component),
but rather the links (i.e., the differences) between two
arrow heads, which represent the log-ratio between the two
components represented by the arrows.
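Why the links carry the interpretation follows from the definition of the clr transform. The notation below (a composition x with D parts) is introduced here only for illustration:

```latex
\operatorname{clr}_i(x) = \ln x_i - \frac{1}{D}\sum_{k=1}^{D} \ln x_k
\quad\Longrightarrow\quad
\operatorname{clr}_i(x) - \operatorname{clr}_j(x) = \ln x_i - \ln x_j = \ln\frac{x_i}{x_j}
```

The common mean-log term cancels in the difference, so the link between the arrow heads of parts i and j depends only on the log-ratio of those two parts.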
The compositional loading plot is introduced with this
package. The loadings of all components can be seen as an orthogonal basis
in the space of clr-transformed data. These vectors are displayed by a barplot with
their corresponding composition. For a better
interpretation, the total of these compositions is set to the number of
parts in the composition, such that a portion of one means no
effect. This is similar to (but not exactly the same as) a zero loading in a real
principal component analysis.
The loadings plot can work in two different modes: if scale.sdev is set
to NA, it displays the composition being represented by the unit vector
of loadings in the clr-transformed space. If scale.sdev is numeric, we
use this composition scaled by the standard deviation of the respective
component.
The relative plot displays the relativeLoadings as a
barplot. The deviation from a unit bar shows the effect of each
principal component on the respective ratio.
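The effect on pairwise ratios can be sketched in base R as follows. This is an assumed form chosen for illustration, not the implementation of relativeLoadings: `v`, `sdev`, and the helper `ratio_effect` are hypothetical.

```r
# Hypothetical clr loading vector and component standard deviation
v <- c(A = 0.5, B = -0.1, C = -0.4)
sdev <- 0.8

# Factor by which the ratio part_i / part_j changes when moving
# `mult` standard deviations along the component (assumed form)
ratio_effect <- function(clr_vec, mult = 1, sdev = 1) {
  shift <- mult * sdev * clr_vec
  outer(exp(shift), exp(-shift))   # entry [i, j] = exp(shift_i - shift_j)
}

up   <- ratio_effect(v, mult =  1, sdev = sdev)   # one sigma upward
down <- ratio_effect(v, mult = -1, sdev = sdev)   # one sigma downward

print(round(up, 3))
print(round(down, 3))
```

An entry equal to one means the component leaves that ratio unchanged; the upward and downward factors for the same pair are reciprocals of each other.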
Aitchison, J., C. Barceló-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A concise guide to the algebraic geometric structure of the simplex, the sample space for compositional data analysis, Terra Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Aitchison, J. and M. Greenacre (2002) Biplots for Compositional Data, Journal of the Royal Statistical Society, Series C (Applied Statistics) 51 (4) 375-392
clr, acomp, relativeLoadings, princomp.aplus, princomp.rcomp, barplot.acomp, mean.acomp, var.acomp
data(SimulatedAmounts)
pc <- princomp(acomp(sa.lognormals5))
pc
summary(pc)
plot(pc) #plot(pc,type="screeplot")
plot(pc,type="v")
plot(pc,type="biplot")
plot(pc,choice=c(1,3),type="biplot")
plot(pc,type="loadings")
plot(pc,type="loadings",scale.sdev=-1) # Downward
plot(pc,type="relative",scale.sdev=NA) # The directions
plot(pc,type="relative",scale.sdev=1) # one sigma Upward
plot(pc,type="relative",scale.sdev=-1) # one sigma Downward
biplot(pc)
screeplot(pc)
loadings(pc)
relativeLoadings(pc,mult=FALSE)
relativeLoadings(pc)
relativeLoadings(pc,scale.sdev=1)
relativeLoadings(pc,scale.sdev=2)
pc$Loadings
pc$DownLoadings
barplot(pc$Loadings)
pc$sdev^2
p = predict(pc,sa.lognormals5)
cov(p)