Plot the distribution of the predicted values for each class.
Can be used for earth models, but also for models built by lm, glm, lda, etc.
plotd(object, hist = FALSE, type = NULL, nresponse = NULL, dichot = FALSE,
trace = FALSE, xlim = NULL, ylim = NULL, jitter = FALSE, main=NULL,
xlab = "Predicted Value", ylab = if(hist) "Count" else "Density",
lty = 1, col = c("gray70", 1, "lightblue", "brown", "pink", 2, 3, 4),
fill = if(hist) col[1] else 0,
breaks = "Sturges", labels = FALSE,
kernel = "gaussian", adjust = 1, zero.line = FALSE,
legend = TRUE, legend.names = NULL, legend.pos = NULL,
cex.legend = .8, legend.bg = "white", legend.extra = FALSE,
vline.col = 0, vline.thresh = .5, vline.lty = 1, vline.lwd = 1,
err.thresh = vline.thresh, err.col = 0, err.border = 0, err.lwd = 1,
xaxt = "s", yaxt = "s", xaxis.cex = 1, sd.thresh = 0.01, ...)
To start off, look at the arguments object, hist, and type.
For predict methods with multiple column responses, see the nresponse argument.
For factor responses with more than two levels, see the dichot argument.
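As a hedged illustration of using plotd on a model that was not built by earth, the sketch below fits a logistic regression with glm on the etitanic data shipped with earth. It assumes the earth package is installed and that plotd can recover the observed response from the glm object; the names have.earth and glm.mod are only illustrative.

```r
# Sketch only: requires the earth package (for plotd and the etitanic data).
have.earth <- requireNamespace("earth", quietly = TRUE)
if (have.earth) {
    library(earth)
    data(etitanic)
    # A plain logistic regression, not an earth model.
    glm.mod <- glm(sex ~ ., data = etitanic, family = binomial)
    plotd(glm.mod)               # per-class densities of the predicted values
    plotd(glm.mod, hist = TRUE)  # the same predictions as per-class histograms
}
```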
object: Model object. Typically a model which predicts a class or a class discriminant.
hist: FALSE (default) to call density internally.
TRUE to call hist internally.
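To make the difference between the two settings of hist concrete, here is a rough base R approximation of the idea: split the predicted values by observed class, then summarize each group with either density or hist. This is a sketch on the built-in mtcars data, not plotd's actual implementation; the model and all variable names are assumptions made for illustration.

```r
# Fit a simple binary classifier on a built-in dataset.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
pred <- predict(fit, type = "response")   # predicted probabilities
by.class <- split(pred, mtcars$am)        # one group per observed class

# hist = FALSE analogue: one kernel density estimate per class.
dens <- lapply(by.class, density, kernel = "gaussian", adjust = 1)

# hist = TRUE analogue: one histogram of counts per class, with shared breaks.
brk <- seq(0, 1, by = 0.1)
hists <- lapply(by.class, hist, breaks = brk, plot = FALSE)

# Overlay the two class densities on one plot.
plot(dens[[1]], xlim = c(0, 1),
     ylim = range(0, dens[[1]]$y, dens[[2]]$y),
     main = "", xlab = "Predicted Value")
lines(dens[[2]], lty = 2)
```

The densities are on a "Density" scale and the histograms on a "Count" scale, which matches the default ylab switching between the two.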
type: Type parameter passed to predict.
For allowed values see the predict method for your object (such as predict.earth).
By default, plotd tries to automatically select a suitable value for the model in question.
(This is "response" for all objects except rpart models, where "vector" is used. The choices will often be inappropriate.)
Typically you would set hist=TRUE when type="class".
nresponse: Which column to use when predict returns multiple columns.
This can be a column index or column name (which may be abbreviated; partial matching is used).
The default is NULL, meaning use all columns of the predicted response.
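The two ways of naming a column can be sketched with base R alone. The matrix below is a made-up stand-in for a multi-column prediction, and pmatch is used here only to illustrate partial matching; plotd's own matching code may behave differently in edge cases.

```r
# Hypothetical multi-column prediction: one column of scores per class.
pred <- matrix(c(0.7, 0.2, 0.1,
                 0.1, 0.6, 0.3),
               nrow = 2, byrow = TRUE,
               dimnames = list(NULL, c("setosa", "versicolor", "virginica")))

# Select a column by index ...
by.index <- pred[, 2]

# ... or by an abbreviated name, resolved with base R partial matching.
icol <- pmatch("vers", colnames(pred))
by.name <- pred[, icol]
```

An abbreviation has to be unambiguous: with these column names, "v" alone matches two columns, so pmatch returns NA for it.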
dichot: Dichotomize the predicted response.
This argument is ignored except for models where the observed response is a factor with more than two levels and the predicted response is a numeric vector.
The default FALSE separates the response into a group for each factor level.
With dichot=TRUE the response is separated into just two groups: the first level of the factor versus the remaining levels.
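The grouping described for dichot=TRUE, first level versus the rest, can be sketched in base R on the built-in iris data. This only illustrates the grouping rule; the label "other" is an assumption and not necessarily the one plotd shows.

```r
obs <- iris$Species        # observed response: a factor with three levels
first <- levels(obs)[1]    # the first level, here "setosa"

# Collapse to two groups: the first level versus all remaining levels.
# "other" is an illustrative label, not necessarily the one plotd uses.
two.groups <- factor(ifelse(obs == first, first, "other"),
                     levels = c(first, "other"))
table(two.groups)
```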
trace: Default FALSE.
Use TRUE or 1 to trace plotd, which is useful to see how plotd partitions the predicted response into classes.
Use 2 for more details.
xlim: Limits of the x axis.
The default NULL means determine these limits automatically, else specify c(xmin,xmax).
ylim: Limits of the y axis.
The default NULL means determine these limits automatically, else specify c(ymin,ymax).
jitter: Jitter the histograms or densities horizontally to minimize overplotting.
Default FALSE.
Specify TRUE to automatically calculate the jitter, else specify a numeric jitter value.
main: Main title. Values:
"string"   string
""         no title
NULL       (default) generate a title from the call.
xlab: x axis label.
Default is "Predicted Value".
ylab: y axis label.
Default is if(hist) "Count" else "Density".
lty: Per class line types for the plotted lines. Default is 1 (which gets recycled for all lines).
col: Per class line colors. The first few colors of the default are intended to be easily distinguishable on both color displays and monochrome printers.
fill: Fill color for the plot for the first class.
For hist=FALSE, the default is 0, i.e., no fill.
For hist=TRUE, the default is the first element in the col argument.
breaks: Passed to hist.
Only used if hist=TRUE.
Default is "Sturges".
When type="class", setting breaks to a low number can be used to widen the histogram bars.
labels: TRUE to draw counts on the hist plot.
Only used if hist=TRUE.
Default is FALSE.
kernel: Passed to density.
Only used if hist=FALSE.
Default is "gaussian".
adjust: Passed to density.
Only used if hist=FALSE.
Default is 1.
zero.line: Passed to plot.density.
Only used if hist=FALSE.
Default is FALSE.
legend: TRUE (default) to draw a legend, else FALSE.
legend.names: Class names in legend.
The default NULL means determine these automatically.
legend.pos: Position of the legend.
The default NULL means position the legend automatically, else specify c(x,y).
cex.legend: cex for legend.
Default is .8.
legend.bg: bg color for legend.
Default is "white".
legend.extra: Show (in the legend) the number of occurrences of each class.
Default is FALSE.
vline.thresh: Horizontal position of optional vertical line.
Default is 0.5.
The vertical line is intended to indicate class separation.
If you use this, don't forget to set vline.col.
vline.col: Color of vertical line. Default is 0, meaning no vertical line.
vline.lty: Line type of vertical line.
Default is 1.
vline.lwd: Line width of vertical line.
Default is 1.
err.thresh: x axis value specifying the error shading threshold.
See err.col.
Default is vline.thresh.
err.col: Specify up to three colors to shade the "error areas" of the density plot.
The default is 0, meaning no error shading.
This argument is ignored unless hist=FALSE.
If there are more than two classes, err.col uses only the first two.
This argument is best explained by running an example:
    library(earth)
    data(etitanic)
    earth.mod <- earth(survived ~ ., data=etitanic)
    plotd(earth.mod, vline.col=1, err.col=c(2,3,4))
The three areas are (i) the error area to the left of the threshold, (ii) the error area to the right of the threshold, and (iii) the reducible error area.
If fewer than three values are specified, plotd re-uses values in a sensible manner.
Use values of 0 to skip areas.
Disjoint regions are not handled well by the current implementation.
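The error areas on either side of the threshold have a simple numeric counterpart: the number of observations that fall on the wrong side. The base R sketch below counts them for a logistic regression on the built-in mtcars data. It is an illustration under assumptions, not plotd's shading code: the model and variable names are made up, class 1 is assumed to be the class that should lie to the right of the threshold, and the reducible error area is not computed.

```r
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
pred <- predict(fit, type = "response")
thresh <- 0.5   # plays the role of err.thresh

# Class 1 observations whose prediction falls to the left of the threshold.
err.left <- sum(mtcars$am == 1 & pred < thresh)

# Class 0 observations whose prediction falls to the right of the threshold.
err.right <- sum(mtcars$am == 0 & pred >= thresh)

# Overall misclassification rate at this threshold.
error.rate <- (err.left + err.right) / nrow(mtcars)
```

Moving thresh trades one kind of error against the other, which is what changing err.thresh shows visually on the density plot.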
err.border: Borders around the error shading.
Default is 0, meaning no borders, else specify up to three colors.
err.lwd: Line widths of borders of the error shading.
Default is 1, else specify up to three line widths.
xaxt: Default is "s".
Use xaxt="n" for no x axis.
yaxt: Default is "s".
Use yaxt="n" for no y axis.
xaxis.cex: Only used if hist=TRUE and type="class".
Specify size of class labels drawn on the x axis.
Default is 1.
sd.thresh: Minimum acceptable standard deviation for a density.
Default is 0.01.
Densities with a standard deviation less than sd.thresh will not be plotted (a warning will be issued and the legend will say "not plotted").
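The check described for sd.thresh can be sketched in base R as follows. The data and names are invented for illustration, and the sketch emits a message where plotd issues a warning; plotd's internal check may differ in detail.

```r
# Hypothetical predicted values for two classes; class "b" has no spread.
by.class <- list(a = c(0.2, 0.5, 0.9), b = c(0.7, 0.7, 0.7))
sd.thresh <- 0.01

# A class is plottable only if its standard deviation reaches the threshold.
plottable <- vapply(by.class, function(x) sd(x) >= sd.thresh, logical(1))

# Estimate a density for plottable classes, skip the others with a note.
dens <- lapply(names(by.class), function(nm) {
    if (!plottable[[nm]]) {
        message("class ", nm, ": standard deviation below sd.thresh, not plotted")
        return(NULL)
    }
    density(by.class[[nm]])
})
```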
...: Extra arguments passed to the predict method for the object.
See also:
density, plot.density
hist, plot.histogram
earth, plot.earth
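if (require(earth)) {
    # Save the graphics settings so they can be restored afterwards.
    old.par <- par(no.readonly=TRUE)
    par(mfrow=c(2,2), mar=c(4, 3, 1.7, 0.5), mgp=c(1.6, 0.6, 0), cex = 0.8)
    data(etitanic)
    mod <- earth(survived ~ ., data=etitanic, degree=2, glm=list(family=binomial))
    # Per-class densities of the predicted values.
    plotd(mod)
    # Per-class histograms, with the legend placed explicitly.
    plotd(mod, hist=TRUE, legend.pos=c(.25,220))
    # Histogram of predicted classes, with counts drawn on the bars.
    plotd(mod, hist=TRUE, type="class", labels=TRUE, xlab="", xaxis.cex=.8)
    par(old.par)
}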