Enhanced scatterplot matrices with univariate displays down the diagonal; spm is an abbreviation for scatterplotMatrix. This function just sets up a call to pairs with custom panel functions.
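The idea of calling pairs with custom panel functions can be sketched in base R alone. The block below is only an illustration of that mechanism, not the package's own code: the helper names panel_density and panel_points_lm are hypothetical, and the built-in iris data are used purely for convenience.

```r
# Hypothetical diagonal panel: draw a density estimate for one variable.
panel_density <- function(x, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))                      # restore the panel's coordinates
  d <- density(x, na.rm = TRUE)
  par(usr = c(usr[1:2], 0, 1.5 * max(d$y)))    # keep x range, rescale y for the curve
  lines(d)
}

# Hypothetical off-diagonal panel: points plus a least-squares line.
panel_points_lm <- function(x, y, ...) {
  points(x, y, ...)
  abline(lm(y ~ x), col = "red")
}

# pairs() calls diag.panel for each diagonal cell and panel for the others.
pairs(iris[, 1:4], diag.panel = panel_density, panel = panel_points_lm)
```

scatterplotMatrix supplies panels of this kind itself, selected through the diagonal, smoother, reg.line, and ellipse arguments described below.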
scatterplotMatrix(x, ...)
# S3 method for formula
scatterplotMatrix(formula, data=NULL, subset, labels, ...)
# S3 method for default
scatterplotMatrix(x, var.labels=colnames(x),
diagonal=c("density", "boxplot", "histogram", "oned", "qqplot", "none"),
adjust=1, nclass,
plot.points=TRUE, smoother=loessLine, smoother.args=list(), smooth, span,
spread = !by.groups, reg.line=lm,
transform=FALSE, family=c("bcPower", "yjPower"),
ellipse=FALSE, levels=c(.5, .95), robust=TRUE,
groups=NULL, by.groups=FALSE,
use=c("complete.obs", "pairwise.complete.obs"),
labels, id.method="mahal", id.n=0, id.cex=1, id.col=palette()[1], id.location="lr",
col=if (n.groups == 1) palette()[3:1] else rep(palette(), length=n.groups),
pch=1:n.groups, lwd=1, lty=1,
cex=par("cex"), cex.axis=par("cex.axis"), cex.labels=NULL,
cex.main=par("cex.main"),
legend.plot=length(levels(groups)) > 1, legend.pos=NULL, row1attop=TRUE, ...)
spm(x, ...)
x: a data matrix or numeric data frame.
formula: a one-sided “model” formula, of the form ~ x1 + x2 + ... + xk or ~ x1 + x2 + ... + xk | z, where z evaluates to a factor or other variable to divide the data into groups.
data: for scatterplotMatrix.formula, a data frame within which to evaluate the formula.
subset: expression defining a subset of observations.
labels, id.method, id.n, id.cex, id.col, id.location: arguments for the labeling of points. The default is id.n=0 for labeling no points. See showLabels for details of these arguments. If the plot uses different colors for groups, then the id.col argument is ignored and label colors are determined by the col argument.
var.labels: variable labels (for the diagonal of the plot).
diagonal: contents of the diagonal panels of the plot. If plotting by groups, a different univariate display (with the exception of "histogram") will be drawn for each group.
adjust: relative bandwidth for density estimate, passed to the density function.
nclass: number of bins for histogram, passed to the hist function.
plot.points: if TRUE, the points are plotted in each off-diagonal panel.
smoother: a function to draw a nonparametric-regression smooth; the default, as given in the usage above, is loessLine; an alternative is gamLine, which uses the gam function in the mgcv package. For this and other smoothers, see ScatterplotSmoothers. Setting this argument to something other than a function, e.g., FALSE, suppresses the smoother.
smoother.args: a list of named values to be passed to the smoother function; the specified elements of the list depend upon the smoother (see ScatterplotSmoothers).
smooth, span: these arguments are included for backwards compatibility: if smooth=TRUE then smoother is set to loessLine, and if span is specified, it is added to smoother.args.
spread: if TRUE, estimate the (square root of the) variance function. For loessLine and for gamLine, this is done by separately smoothing the squares of the positive and negative residuals from the mean fit, and then adding the square root of the fitted values to the mean fit. For quantregLine, fit the .25 and .75 quantiles with a quantile regression additive model. The default is TRUE if by.groups=FALSE and FALSE if by.groups=TRUE.
reg.line: if not FALSE, a line is plotted using the function given by this argument; e.g., using rlm in package MASS plots a robust-regression line.
transform: if TRUE, multivariate normalizing power transformations are computed with powerTransform, rounding the estimated powers to 'nice' values for plotting; if a vector of powers, one for each variable, these are applied prior to plotting. If there are groups and by.groups is TRUE, then the transformations are estimated conditional on the groups factor.
family: family of transformations to estimate: "bcPower" for the Box-Cox family or "yjPower" for the Yeo-Johnson family (see powerTransform).
ellipse: if TRUE, data-concentration ellipses are plotted in the off-diagonal panels.
levels: level or levels at which concentration ellipses are plotted; the default is c(.5, .95).
robust: if TRUE, use the cov.trob function in the MASS package to calculate the center and covariance matrix for the data ellipses.
groups: a factor or other variable dividing the data into groups; groups are plotted with different colors and plotting characters.
by.groups: if TRUE, regression lines are fit by groups.
use: if "complete.obs" (the default), cases with missing data are omitted; if "pairwise.complete.obs", all valid cases are used in each panel of the plot.
pch: plotting characters for points; default is the plotting characters in order (see par).
col: colors for lines and points; the default is taken from the color palette, with palette()[3] for linear regression lines, palette()[2] for nonparametric regression lines, and palette()[1] for points if there are no groups, and successive colors for the groups if there are groups.
lwd: width of linear-regression lines (default 1).
lty: type of linear-regression lines (default 1, solid line).
cex, cex.axis, cex.labels, cex.main: set sizes of various graphical elements (see par).
legend.plot: if TRUE, then a legend for the groups is plotted in the first diagonal cell.
legend.pos: position for the legend, specified as one of the keywords accepted by legend. If NULL, the default, the position will vary by the diagonal argument, e.g., "topright" for diagonal="density".
row1attop: if TRUE (the default), the first row is at the top, as in a matrix, as opposed to at the bottom, as in a graph (argument suggested by Richard Heiberger).
...: arguments to pass down.
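The difference between the two settings of use described above comes down to which rows count as valid. The following base-R sketch only illustrates that counting on made-up data; it is not taken from the function's implementation, and the object names d, n_complete, and n_pairwise are hypothetical.

```r
# Toy data with scattered missing values (invented for illustration).
d <- data.frame(
  a = c(1, 2, NA, 4, 5, 6),
  b = c(2, NA, 3, 5, 4, 7),
  c = c(NA, 1, 2, 3, 4, 5)
)

# "complete.obs": keep only rows with no missing value in any variable,
# so every panel is drawn from the same cases.
n_complete <- sum(complete.cases(d))

# "pairwise.complete.obs": for each panel, keep rows that are valid for
# the two variables shown in that panel.
var_pairs <- combn(names(d), 2, simplify = FALSE)
n_pairwise <- sapply(var_pairs, function(v) sum(complete.cases(d[, v])))
names(n_pairwise) <- sapply(var_pairs, paste, collapse = "-")

n_complete   # 3 rows are complete on all variables
n_pairwise   # each of the three pairs has 4 valid rows
```

With pairwise deletion each panel can therefore show more points, at the cost of different panels being based on different cases.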
NULL. This function is used for its side effect: producing a plot.
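Since nothing useful is returned, keeping the figure means drawing it on a file device. A minimal sketch, assuming the car package may or may not be installed (base pairs is used as a stand-in if it is not); the name out_file is hypothetical and iris is used only as convenient numeric data.

```r
# Open a PDF device, draw the matrix, then close the device to write the file.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 7, height = 7)
if (requireNamespace("car", quietly = TRUE)) {
  car::scatterplotMatrix(iris[, 1:4])   # default method on a numeric data frame
} else {
  pairs(iris[, 1:4])                    # fallback when car is unavailable
}
invisible(dev.off())

file.exists(out_file)
```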
Fox, J. and Weisberg, S. (2011) An R Companion to Applied Regression, Second Edition, Sage.
pairs, scatterplot, dataEllipse, powerTransform, bcPower, yjPower, cov.trob, showLabels, ScatterplotSmoothers.
scatterplotMatrix(~ income + education + prestige | type, data=Duncan)
scatterplotMatrix(~ income + education + prestige,
transform=TRUE, data=Duncan, smoother=loessLine)
scatterplotMatrix(~ income + education + prestige | type, smoother=FALSE,
by.groups=TRUE, transform=TRUE, data=Duncan)
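A further example, combining several of the arguments documented above (diagonal, smoother.args, ellipse, levels, robust). This is a sketch that assumes the installed car exposes the interface exactly as described on this page; other releases of the package may use different argument names, so the call is guarded by a check on the default method's formals. The name has_documented_api is hypothetical.

```r
# Check that car is installed and that its default method has the
# smoother.args argument documented on this page.
has_documented_api <- requireNamespace("car", quietly = TRUE) &&
  "smoother.args" %in% names(formals(
    getS3method("scatterplotMatrix", "default", envir = asNamespace("car"))
  ))

if (has_documented_api) {
  # Histograms on the diagonal, a wider loess span, and classical
  # (non-robust) 50% and 95% data ellipses in the off-diagonal panels.
  car::scatterplotMatrix(~ income + education + prestige, data = car::Duncan,
    diagonal = "histogram",
    smoother = car::loessLine, smoother.args = list(span = 0.9),
    ellipse = TRUE, levels = c(0.5, 0.95), robust = FALSE)
} else {
  message("car with the interface documented here is not available; example skipped")
}
```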