Plot smoothed estimates of x vs. y, handling missing data for lowess
or supsmu, and adding axis labels. Optionally suppresses plotting
extrapolated estimates. An optional group variable can be specified to
compute and plot the smooth curves by levels of group. When group is
present, the datadensity option will draw tick marks showing the
location of the raw x-values, separately for each curve. plsmo has an
option to plot connected points for raw data, with no smoothing. The
non-panel version of plsmo allows y to be a matrix, for which smoothing
is done separately over its columns. If both group and multi-column y
are used, the number of curves plotted is the product of the number of
groups and the number of y columns.
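
The group-by-column behavior described above can be illustrated with a small base-R sketch. This is not the Hmisc implementation: the simulated data, the names grp, Y, and curves, and the explicit loop are assumptions made for illustration, and only stats::lowess is used.

```r
# Base-R sketch (not Hmisc code): one lowess curve per group-by-column pair.
set.seed(1)
n   <- 200
x   <- runif(n, 20, 80)
grp <- factor(sample(c("a", "b"), n, replace = TRUE))
Y   <- cbind(y1 = x + rnorm(n, 0, 5), y2 = 100 - x + rnorm(n, 0, 5))
Y[sample(n, 10), "y1"] <- NA   # missing values are dropped curve by curve

curves <- list()
for (g in levels(grp)) {
  for (j in colnames(Y)) {
    # keep rows in this group with non-missing x and y for this column
    keep <- grp == g & !is.na(x) & !is.na(Y[, j])
    curves[[paste(g, j)]] <- lowess(x[keep], Y[keep, j], f = 2/3, iter = 3)
  }
}
length(curves)  # number of groups times number of y columns
```

Each element of curves holds the x and y coordinates of one smooth, so drawing them is a matter of calling lines() on each element with a different line type.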
method='intervals' is often used when y is binary, as it may be tricky
to specify a reasonable smoothing parameter to lowess or supsmu in this
case. The 'intervals' method uses the cut2 function to form intervals
of x containing a target of mobs observations. For each interval the
ifun function summarizes y, with the default being the mean
(proportions for binary y). The results are plotted as step functions,
with vertical discontinuities drawn with a saturation of 0.15 of the
original color. A plus sign is drawn at the mean x within each
interval. For this approach, the default x-range is the entire raw data
range, and trim and evaluate are ignored. For panel.plsmo it is best to
specify type='l' when using 'intervals'.
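
The idea behind method='intervals' can be sketched in base R. This is only an approximation under stated assumptions: quantile-based cut() stands in for cut2, the data are simulated, the names n_int, bin, prop, and xmean are hypothetical, and the faded vertical discontinuities are not reproduced.

```r
# Base-R sketch of the 'intervals' idea; cut() on quantiles stands in for cut2.
set.seed(2)
n    <- 300
x    <- runif(n, 0, 10)
y    <- rbinom(n, 1, plogis(-2 + 0.4 * x))   # binary response
mobs <- 30                                   # target observations per interval

n_int  <- max(1, floor(n / mobs))
breaks <- unique(quantile(x, probs = seq(0, 1, length.out = n_int + 1)))
bin    <- cut(x, breaks = breaks, include.lowest = TRUE)

prop  <- tapply(y, bin, mean)   # mean of binary y is a proportion
xmean <- tapply(x, bin, mean)   # position of the plus sign in each interval
lo    <- head(breaks, -1)       # left edge of each interval
hi    <- tail(breaks, -1)       # right edge of each interval

plot(range(x), c(0, 1), type = "n", xlab = "x", ylab = "Proportion")
segments(lo, prop, hi, prop)    # horizontal steps, one per interval
points(xmean, prop, pch = 3)    # plus sign at the mean x within each interval
```

Replacing mean with another summary function, such as median, in the tapply call for y mirrors what the ifun argument does.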
panel.plsmo is a trellis panel function for xyplot that uses plsmo and
its options to draw one or more nonparametric function estimates on
each panel. This has advantages over using xyplot with panel.xyplot and
panel.loess: (1) by default it will invoke labcurve to label the curves
where they are most separated, (2) the datadensity option will put rug
plots on each curve (instead of a single rug plot at the bottom of the
graph), and (3) when panel.plsmo invokes plsmo it can use the "super
smoother" (supsmu function) instead of lowess, or pass
method='intervals'. panel.plsmo senses when a group variable is
specified to xyplot so that it can invoke panel.superpose instead of
panel.xyplot. Using panel.plsmo through trellis has some advantages
over calling plsmo directly, in that conditioning variables are allowed
and trellis uses nicer fonts etc.
When a group variable is used, panel.plsmo creates a function Key in
the session frame that the user can invoke to draw a key for the
individual data point symbols used for the groups. By default, the key
is positioned at the upper right corner of the graph. If
Key(locator(1)) is specified, the key will appear so that its upper
left corner is at the coordinates of the mouse click.
For ggplot2 graphics the counterparts are stat_plsmo and histSpikeg.
plsmo(x, y, method=c("lowess","supsmu","raw","intervals"), xlab, ylab,
      add=FALSE, lty=1 : lc, col=par("col"), lwd=par("lwd"),
      iter=if(length(unique(y))>2) 3 else 0, bass=0, f=2/3, mobs=30, trim,
      fun, ifun=mean, group, prefix, xlim, ylim,
      label.curves=TRUE, datadensity=FALSE, scat1d.opts=NULL,
      lines.=TRUE, subset=TRUE,
      grid=FALSE, evaluate=NULL, ...)

#To use panel function:
#xyplot(formula=y ~ x | conditioningvars, groups,
#       panel=panel.plsmo, type='b',
#       label.curves=TRUE,
#       lwd = superpose.line$lwd,
#       lty = superpose.line$lty,
#       pch = superpose.symbol$pch,
#       cex = superpose.symbol$cex,
#       font = superpose.symbol$font,
#       col = NULL, scat1d.opts=NULL, ...)
x
vector of x-values, NAs allowed

y
vector or matrix of y-values, NAs allowed

method
"lowess" (the default), "supsmu", "raw" to not smooth at all, or "intervals" to use intervals (see above)

xlab
x-axis label, used only if add=FALSE. Defaults to label(x) or the argument name.

ylab
y-axis label, like xlab.

add
set to TRUE to call lines instead of plot. Assumes axes are already labeled.

lty
line type, default=1,2,3,..., corresponding to columns of y and group combinations

col
color for each curve, corresponding to group. Default is current par("col").

lwd
vector of line widths for the curves, corresponding to group. Default is current par("lwd"). lwd can also be specified as an element of label.curves if label.curves is a list.

iter
iter parameter if method="lowess", default=0 if y is binary, and 3 otherwise.

bass
bass parameter if method="supsmu", default=0.

f
passed to the lowess function, for method="lowess"

mobs
for method='intervals', the target number of observations per interval

trim
only plots smoothed estimates between the trim and 1-trim quantiles of x. Default is to use the 10th smallest to 10th largest x in the group if the number of observations in the group exceeds 200 (0 otherwise). Specify trim=0 to plot over the entire range.

fun
after computing the smoothed estimates, if fun is given the y-values are transformed by fun()

ifun
a summary statistic function to apply to the y-variable for method='intervals'. Default is mean.

group
a variable, either a factor vector or one that will be converted to factor by plsmo, that is used to stratify the data so that separate smooths may be computed

prefix
a character string to appear in front of group labels. The presence of prefix ensures that labcurve will be called even when add=TRUE.

xlim
a vector of 2 x-axis limits. Default is observed range.

ylim
a vector of 2 y-axis limits. Default is observed range.

label.curves
set to FALSE to prevent labcurve from being called to label multiple curves corresponding to groups. Set to a list to pass options to labcurve. lty and col are passed to labcurve automatically.

datadensity
set to TRUE to draw tick marks on each curve, using x-coordinates of the raw data x values. This is done using scat1d.

scat1d.opts
a list of options to hand to scat1d

lines.
set to FALSE to suppress smoothed curves from being drawn. This can make sense if datadensity=TRUE.

subset
a logical or integer vector specifying a subset to use for processing, with respect to all variables being analyzed

grid
set to TRUE if the R grid package drew the current plot

evaluate
number of points to keep from smoother. If specified, an equally-spaced grid of evaluate x values will be obtained from the smoother using linear interpolation. This will keep from plotting an enormous number of points if the dataset contains a very large number of unique x values.

...
optional arguments that are passed to scat1d, or optional parameters to pass to plsmo from panel.plsmo. See optional arguments for plsmo above.

type
set to 'p' to have panel.plsmo plot points (and not call plsmo), 'l' to call plsmo and not plot points, or use the default 'b' to plot both.

lwd, lty, pch, cex, font, col
vectors of graphical parameters corresponding to the groups (scalars if group is absent). By default, the parameters set up by trellis will be used.
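
What the iter, trim, evaluate, and fun arguments control can be sketched with base R alone. This is a rough illustration, not plsmo's code: the order of the steps, the explicit trim of 0.05, the names sm, grid_x, est, and logit_est, and the clamping before qlogis are all assumptions.

```r
# Base-R sketch of what iter, trim, evaluate, and fun control (not Hmisc code).
set.seed(3)
n <- 500
x <- rnorm(n, 50, 15)
y <- rbinom(n, 1, plogis((x - 50) / 15))   # binary response

# iter: no robustness iterations when y is binary, 3 otherwise
iter <- if (length(unique(y)) > 2) 3 else 0
sm   <- lowess(x, y, f = 2/3, iter = iter)

# trim: keep only estimates between the trim and 1 - trim quantiles of x
trim <- 0.05
xq   <- quantile(x, c(trim, 1 - trim))
keep <- sm$x >= xq[1] & sm$x <= xq[2]

# evaluate: linear interpolation onto an equally spaced grid of x values
evaluate <- 50
grid_x   <- seq(min(sm$x[keep]), max(sm$x[keep]), length.out = evaluate)
est      <- approx(sm$x[keep], sm$y[keep], xout = grid_x)$y

# fun: transform the smoothed estimates, here to the logit scale;
# the clamp away from 0 and 1 is an addition so that qlogis stays finite
logit_est <- qlogis(pmin(pmax(est, 0.001), 0.999))
```

The grid keeps the number of plotted points fixed at evaluate regardless of how many unique x values the data contain.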
plsmo returns a list of curves (x and y coordinates) that was passed to
labcurve. As side effects, plsmo draws plots, and panel.plsmo creates
the Key function in the session frame.
lowess, supsmu, label, quantile, labcurve, scat1d, xyplot,
panel.superpose, panel.xyplot, stat_plsmo, histSpikeg
# NOT RUN {
set.seed(1)
x <- 1:100
y <- x + runif(100, -10, 10)
plsmo(x, y, "supsmu", xlab="Time of Entry")
#Use label(y) or "y" for ylab
plsmo(x, y, add=TRUE, lty=2)
#Add lowess smooth to existing plot, with different line type
age <- rnorm(500, 50, 15)
survival.time <- rexp(500)
sex <- sample(c('female','male'), 500, TRUE)
race <- sample(c('black','non-black'), 500, TRUE)
plsmo(age, survival.time < 1, fun=qlogis, group=sex) # plot logit by sex
#Bivariate Y
sbp <- 120 + (age - 50)/10 + rnorm(500, 0, 8) + 5 * (sex == 'male')
dbp <- 80 + (age - 50)/10 + rnorm(500, 0, 8) - 5 * (sex == 'male')
Y <- cbind(sbp, dbp)
plsmo(age, Y)
plsmo(age, Y, group=sex)
#Plot points and smooth trend line using trellis
# (add type='l' to suppress points or type='p' to suppress trend lines)
require(lattice)
xyplot(survival.time ~ age, panel=panel.plsmo)
#Do this for multiple panels
xyplot(survival.time ~ age | sex, panel=panel.plsmo)
#Repeat this using equal sample size intervals (n=25 each) summarized by
#the median, then a proportion (mean of binary y)
xyplot(survival.time ~ age | sex, panel=panel.plsmo, type='l',
method='intervals', mobs=25, ifun=median)
ybinary <- ifelse(runif(length(sex)) < 0.5, 1, 0)
xyplot(ybinary ~ age, groups=sex, panel=panel.plsmo, type='l',
method='intervals', mobs=75, ifun=mean, xlim=c(0, 120))
#Do this for subgroups of points on each panel, show the data
#density on each curve, and draw a key at the default location
xyplot(survival.time ~ age | sex, groups=race, panel=panel.plsmo,
datadensity=TRUE)
Key()
#Use wtd.loess.noiter to do a fast weighted smooth
plot(x, y)
lines(wtd.loess.noiter(x, y))
lines(wtd.loess.noiter(x, y, weights=c(rep(1,50), 100, rep(1,49))), col=2)
points(51, y[51], pch=18) # show overly weighted point
#Try to duplicate this smooth by replicating 51st observation 100 times
lines(wtd.loess.noiter(c(x,rep(x[51],99)),c(y,rep(y[51],99)),
type='ordered all'), col=3)
#Note: These two don't agree exactly
# }