shipunov (version 1.17.1)

Biarrows: Adds correlation arrows to the scatterplot

Description

Plots 'orig' variables as arrows on a 2D scatterplot of the 'deriv' variables

Usage

Biarrows(deriv, orig, coeffs=NULL, shrink=0.45, closer=0.9,
 pt.col="forestgreen", pt.cex=1, pt.pch=NA, tx=colnames(orig),
 tx.col="forestgreen", tx.cex=0.8, tx.font=1, tx.pos=NULL, tx.off=0.5, xpd=TRUE,
 ar.col="forestgreen", ar.len=0.05, shift="auto", ...)

Arguments

deriv

Data derived from, e.g., dimension reduction of 'orig'

orig

Original data

coeffs

(Optional) two-column matrix with proposed coordinates of arrow tips; row names must represent 'orig' variables

shrink

How to shrink arrows in relation to 'deriv' ranges, default is 45% (0.45)

closer

How much closer to the center the arrow tip is, relative to the text label; default is 0.9

pt.col

Color of points, default is "forestgreen"

pt.cex

Size of points, default is 1

pt.pch

Type of points, default is NA (no points)

tx

Text labels, default is 'colnames(orig)'

tx.col

Color of text labels, default is "forestgreen"

tx.cex

Size of text, default is 0.8

tx.font

Font of text, default is 1 (plain)

tx.pos

Position of text, default is NULL (in the center)

tx.off

Offset for text labels, default is 0.5 (works only if 'tx.pos' is not NULL)

xpd

Allow text to go outside of plotting region?

ar.col

Color of arrows, default is "forestgreen"

ar.len

Length of the edges of the arrow head (in inches)

shift

Shift from the center which is c(0, 0); default is "auto" which is colMeans(deriv)

...

Further arguments to arrows()

Author

Alexey Shipunov

Details

Biarrows() calculates correlations between two sets of variables that generally belong to the same data: more than one 'orig' variable and exactly two 'deriv' variables. These correlations might be understood as importances of the 'orig' variables. Biarrows() then scales the correlations to the 'deriv' ranges and adds text labels and arrows (and, optionally, points) to the scatterplot of the derived variables. These arrows represent the original variables in relation to the derived variables. The resulting plot may be seen as a biplot that simultaneously shows two sets of variables. In fact, it is possible to show three or more sets of variables (see examples).
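The correlate-then-scale idea can be illustrated with base R alone. The following is only a rough sketch and not the code of Biarrows(): the helper name sketch_arrows() is hypothetical, and the scaling rule (correlation multiplied by 'shrink' times the axis range, measured from the column means) is an assumption made for illustration.

```r
# Hypothetical helper, NOT part of 'shipunov': illustrates the general idea only
sketch_arrows <- function(deriv, orig, shrink = 0.45) {
  # keep only the first two derived variables
  deriv <- as.matrix(deriv)[, 1:2, drop = FALSE]
  # correlations of every 'orig' variable with both derived axes
  cc <- cor(as.matrix(orig), deriv, use = "pairwise.complete.obs")
  # arrows start at the column means of the derived data
  center <- colMeans(deriv, na.rm = TRUE)
  # full range of each derived axis
  rng <- apply(deriv, 2, function(x) diff(range(x, na.rm = TRUE)))
  # ASSUMED scaling: a correlation of 1 maps to 'shrink' times the axis range
  tips <- sweep(cc, 2, rng * shrink, "*")
  tips <- sweep(tips, 2, center, "+")
  list(cor = cc, center = center, tips = tips)
}

iris.p <- prcomp(iris[, -5], scale = TRUE)
sk <- sketch_arrows(iris.p$x, iris[, -5])

plot(iris.p$x[, 1:2])
arrows(sk$center[1], sk$center[2], sk$tips[, 1], sk$tips[, 2],
 length = 0.05, col = "forestgreen")
text(sk$tips, labels = rownames(sk$tips), col = "forestgreen", cex = 0.8)
```

The arrow positions drawn by this sketch will generally differ from those of Biarrows(), which also applies 'closer', 'shift' and the other arguments described above.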

This approach might work for data derived from (almost) any kind of dimension reduction. Biarrows() is also much more flexible than the standard biplot(). Please note, however, that Biarrows() is only a visualization, and numerical conclusions might not be justified.

If 'deriv' data contains more than 2 variables, the rest will be discarded. Both 'deriv' and 'orig' should be either data frames or matrices with column names and compatible dimensions, possibly with NAs.

Biarrows(dr, coeffs=...) allows using pre-calculated coefficients. In that case, 'orig' is ignored (except for its column names, which might instead be supplied separately as the 'tx' value), and 'coeffs' will be scaled. See the examples to better understand how this works.

To suppress arrows or text, use zero color. Points are suppressed by default.

Examples

iris.cmd <- cmdscale(dist(iris[, -5]))
plot(iris.cmd, xlab="Dim 1", ylab="Dim 2")
Biarrows(iris.cmd, iris[, -5])
title(main="MDS biplot with Biarrows()")

## ===

library(MASS)
iris.mds <- isoMDS(dist(unique(iris[, -5])))
plot(iris.mds$points, xlab="Dim 1", ylab="Dim 2")
Biarrows(iris.mds$points, unique(iris[, -5]))
title(main="Non-metric MDS biplot with Biarrows()")

## ===

library(MASS)
iris.smm <- sammon(dist(unique(iris[, -5])))
plot(iris.smm$points, xlab="Dim 1", ylab="Dim 2")
Biarrows(iris.smm$points, unique(iris[, -5]))
title(main="Sammon mapping biplot with Biarrows()")

## ===

iris.p <- prcomp(iris[, -5], scale=TRUE)
biplot(iris.p, xpd=TRUE, main="Original PCA biplot")
plot(iris.p$x)
Biarrows(iris.p$x, iris[, -5])
title(main="PCA biplot with Biarrows()")

## ===

plot(iris.p$x, xlab="PCA1", ylab="PCA2")
## how to use 'coeffs'
## they are also useful as surrogates of variable importances
(coeffs <- cor(iris[, -5], iris.p$x, method="spearman"))
Biarrows(iris.p$x, tx=rownames(coeffs), coeffs=coeffs)

## ===

plot(iris[, c(1, 3)])
Biarrows(iris[, c(1, 3)], iris.p$x)
title(main="\"Reversed biplot\"")

## ===

plot(iris[, c(1, 3)])
Biarrows(iris[, c(1, 3)], iris[, c(2, 4)])
title(main="Iris flowers: lengths vs. widths")

## ===

plot(iris.p$x)
Biarrows(iris.p$x[, 1:2], iris.p$x[, 1:2])
title(main="\"Self-biplot\" on PCA")

## ===

library(MASS)
iris.ldap <- predict(lda(Species ~ ., data=iris), iris[, -5])
plot(iris.ldap$x)
Biarrows(iris.ldap$x, iris[, -5])
Biarrows(iris.ldap$x, iris.p$x[, 1:2], shift=c(9, 2.5),
 shrink=0.95, lty=2, ar.col="darkgrey", tx.col="darkgrey")
title(main="Triplot: LDA, original variables and PCA axes")

## ===

iris.cl <- Classproj(iris[, -5], iris$Species)
plot(iris.cl$proj, col=iris$Species)
Biarrows(iris.cl$proj, iris[, -5])
title(main="Classproj biplot")
