Predicted values and intervals based on a fitted model object.
# S3 method for splm
predict(
object,
newdata,
se.fit = FALSE,
scale = NULL,
df = Inf,
interval = c("none", "confidence", "prediction"),
level = 0.95,
type = c("response", "terms"),
local,
terms = NULL,
na.action = na.fail,
...
)
# S3 method for spautor
predict(
object,
newdata,
se.fit = FALSE,
scale = NULL,
df = Inf,
interval = c("none", "confidence", "prediction"),
level = 0.95,
type = c("response", "terms"),
local,
terms = NULL,
na.action = na.fail,
...
)
# S3 method for splm_list
predict(
object,
newdata,
se.fit = FALSE,
scale = NULL,
df = Inf,
interval = c("none", "confidence", "prediction"),
level = 0.95,
type = c("response", "terms"),
local,
terms = NULL,
na.action = na.fail,
...
)
# S3 method for spautor_list
predict(
object,
newdata,
se.fit = FALSE,
scale = NULL,
df = Inf,
interval = c("none", "confidence", "prediction"),
level = 0.95,
type = c("response", "terms"),
local,
terms = NULL,
na.action = na.fail,
...
)
# S3 method for splmRF
predict(object, newdata, local, ...)
# S3 method for spautorRF
predict(object, newdata, local, ...)
# S3 method for splmRF_list
predict(object, newdata, local, ...)
# S3 method for spautorRF_list
predict(object, newdata, local, ...)
# S3 method for spglm
predict(
object,
newdata,
type = c("link", "response", "terms"),
se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
dispersion = NULL,
terms = NULL,
local,
var_correct = TRUE,
newdata_size,
na.action = na.fail,
...
)
# S3 method for spgautor
predict(
object,
newdata,
type = c("link", "response", "terms"),
se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
dispersion = NULL,
terms = NULL,
local,
var_correct = TRUE,
newdata_size,
na.action = na.fail,
...
)
# S3 method for spglm_list
predict(
object,
newdata,
type = c("link", "response", "terms"),
se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
dispersion = NULL,
terms = NULL,
local,
var_correct = TRUE,
newdata_size,
na.action = na.fail,
...
)
# S3 method for spgautor_list
predict(
object,
newdata,
type = c("link", "response", "terms"),
se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
dispersion = NULL,
terms = NULL,
local,
var_correct = TRUE,
newdata_size,
na.action = na.fail,
...
)
For splm or spautor objects, if se.fit is FALSE, predict() returns a vector of predictions, or a matrix of predictions with column names fit, lwr, and upr if interval is "confidence" or "prediction". If se.fit is TRUE, a list with the following components is returned:
fit: vector or matrix as above
se.fit: standard error of each fit
For splm_list or spautor_list objects, a list that contains relevant quantities for each list element.
For splmRF or spautorRF objects, a vector of predictions. For splmRF_list or spautorRF_list objects, a list that contains relevant quantities for each list element.
A fitted model object.
A data frame or sf object in which to look for variables with which to predict. If a data frame, newdata must contain all variables used by formula(object) and all variables representing coordinates. If an sf object, newdata must contain all variables used by formula(object), and coordinates are obtained from the geometry of newdata. If omitted, missing data from the fitted model object are used.
A logical indicating whether standard errors are returned. The default is FALSE.
A numeric constant by which to scale the regular standard errors and intervals. Similar to, but slightly different from, scale for stats::predict.lm(), because predictions from a spatial model may have different residual variances for each observation in newdata. The default is NULL, which returns the regular standard errors and intervals.
Degrees of freedom to use for confidence or prediction intervals (ignored if scale is not specified). The default is Inf.
Type of interval calculation. The default is "none". Other options are "confidence" (for confidence intervals) and "prediction" (for prediction intervals). When interval is "none" or "prediction", predictions are returned (and when requested, their corresponding uncertainties). When interval is "confidence", mean estimates are returned (and when requested, their corresponding uncertainties). This "none" behavior differs from that of lm(), as lm() returns confidence uncertainties (in .$se.fit).
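The fit, lwr, and upr layout can be illustrated with a small base R sketch. It assumes symmetric intervals of the form fit plus or minus a critical value times a standard error, with a normal critical value (the df = Inf case). The numbers are made up, and this is not spmodel's internal code. Under this assumption, confidence and prediction intervals differ only in which standard error is supplied: that of the estimated mean or that of the prediction.

```r
# Hypothetical point estimates and standard errors (not spmodel output)
fit <- c(10.2, 8.7, 12.1)
se <- c(1.5, 0.9, 2.0)
level <- 0.95

# Normal critical value; assumed to correspond to df = Inf
crit <- qnorm(1 - (1 - level) / 2)

# Matrix using the column names documented for predict()
out <- cbind(fit = fit, lwr = fit - crit * se, upr = fit + crit * se)
out
```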
Tolerance/confidence level. The default is 0.95.
The prediction type, either on the response scale, link scale (only for spglm() or spgautor() model objects), or terms scale.
An optional logical or list controlling the big data approximation. If omitted, local is set to TRUE or FALSE based on the observed data sample size (i.e., the sample size of the fitted model object): if the sample size exceeds 10,000, local is set to TRUE; otherwise it is set to FALSE. This default behavior occurs because the main computational burden of the big data approximation depends almost exclusively on the observed data sample size, not on the number of predictions desired (which we feel is not intuitive at first glance). If local is FALSE, no big data approximation is implemented. If a list is provided, the following arguments detail the big data approximation:
method: The big data approximation method. If method = "all", all observations are used and size is ignored. If method = "distance", the size data observations closest (in terms of Euclidean distance) to the observation requiring prediction are used. If method = "covariance", the size data observations with the highest covariance with the observation requiring prediction are used. If random effects and partition factors are not used in estimation and the spatial covariance function is monotone decreasing, "distance" and "covariance" are equivalent. The default is "covariance". Only used with models fit using splm() or spglm().
size: The number of data observations to use when method is "distance" or "covariance". The default is 100. Only used with models fit using splm() or spglm().
parallel: If TRUE, parallel processing via the parallel package is automatically used. This can significantly speed up computations even when method = "all" (i.e., no big data approximation is used), as predictions are spread out over multiple cores. The default is FALSE.
ncores: If parallel = TRUE, the number of cores to parallelize over. The default is the number of available cores on your machine.
When local is a list, at least one list element must be provided to initialize default arguments for the other list elements. If local is TRUE, defaults for local are chosen such that local is transformed into list(size = 100, method = "covariance", parallel = FALSE).
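The stated equivalence of the "distance" and "covariance" methods under a monotone decreasing covariance can be checked with a short base R sketch. The coordinates, the prediction site, and the exponential covariance parameters (de, range_par) are all hypothetical. The code only mimics the neighbor selection idea and is not taken from spmodel.

```r
set.seed(1)
n <- 200
obs <- cbind(runif(n), runif(n))  # hypothetical observed coordinates
new_site <- c(0.5, 0.5)           # hypothetical prediction location
size <- 10                        # number of neighbors to keep

# Euclidean distance from the prediction site to every observation
h <- sqrt((obs[, 1] - new_site[1])^2 + (obs[, 2] - new_site[2])^2)

# Exponential covariance, monotone decreasing in distance
# (de and range_par are made-up parameter values)
de <- 2
range_par <- 0.3
cov0 <- de * exp(-h / range_par)

# "distance": the size closest observations
near_by_distance <- order(h)[seq_len(size)]
# "covariance": the size observations with the highest covariance
near_by_covariance <- order(cov0, decreasing = TRUE)[seq_len(size)]

identical(near_by_distance, near_by_covariance)
```

With random effects or partition factors, covariance is no longer a function of distance alone, so the two selections can differ.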
If type is "terms", the terms to be returned, specified via either numeric position or name. By default, all terms are included.
Missing (NA) values in newdata cause an error and should be removed before proceeding.
Other arguments. Only used for models fit using splmRF() or spautorRF(), where ... indicates other arguments to ranger::predict.ranger().
The dispersion assumed when computing the prediction standard errors for spglm() or spgautor() model objects when family is "nbinomial", "beta", "Gamma", or "inverse.gaussian". If omitted, the model object dispersion parameter is used.
A logical indicating whether to return the corrected prediction variances when predicting via models fit using spglm() or spgautor(). The default is TRUE.
The size value for each observation in newdata used when predicting for the binomial family.
For splm and spautor objects, the (empirical) best linear unbiased predictions (i.e., Kriging predictions) at each site are returned when interval is "none" or "prediction", alongside standard errors. Prediction intervals are also returned if interval is "prediction". When interval is "confidence", the estimated mean is returned alongside standard errors and confidence intervals for the mean. For splm_list and spautor_list objects, predictions and associated intervals and standard errors are returned for each list element.
For splmRF or spautorRF objects, random forest spatial residual model predictions are computed by combining the random forest prediction with the (empirical) best linear unbiased prediction for the residual. Fox et al. (2020) call this approach random forest regression Kriging. For splmRF_list or spautorRF_list objects, predictions are returned for each list element.
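For readers unfamiliar with Kriging, the best linear unbiased prediction can be sketched in base R. The sketch uses simulated data and treats the covariance parameters (de, range_par) as known, whereas spmodel estimates them from the data (hence "empirical"). The helper krige() and every other name are hypothetical. The standard textbook formulas are used and may differ in detail from spmodel's implementation.

```r
set.seed(42)
n <- 30
coords <- cbind(runif(n), runif(n))  # hypothetical observed sites
X <- cbind(1, coords[, 1])           # design matrix: intercept + one covariate
de <- 1                              # assumed-known partial sill
range_par <- 0.4                     # assumed-known range parameter

# Exponential covariance with no nugget, so Kriging interpolates exactly
expcov <- function(h) de * exp(-h / range_par)
Sigma <- expcov(as.matrix(dist(coords)))

# Simulate a response with spatially correlated errors
y <- drop(X %*% c(2, 1) + t(chol(Sigma)) %*% rnorm(n))

krige <- function(s0, x0) {
  # Covariance between the new site and each observed site
  h0 <- sqrt((coords[, 1] - s0[1])^2 + (coords[, 2] - s0[2])^2)
  c0 <- expcov(h0)
  Sig_inv_X <- solve(Sigma, X)
  cov_beta <- solve(t(X) %*% Sig_inv_X)    # (X' Sigma^-1 X)^-1
  beta <- cov_beta %*% t(Sig_inv_X) %*% y  # generalized least squares estimate
  lambda <- solve(Sigma, c0)               # Sigma^-1 c0
  # Mean estimate plus the predicted spatial residual
  fit <- drop(x0 %*% beta + t(lambda) %*% (y - X %*% beta))
  # Prediction variance, including the cost of estimating beta
  d <- x0 - drop(t(X) %*% lambda)
  pred_var <- de - sum(c0 * lambda) + drop(t(d) %*% cov_beta %*% d)
  c(fit = fit, se = sqrt(max(pred_var, 0)))
}

s_new <- c(0.5, 0.5)
pred_new <- krige(s_new, x0 = c(1, s_new[1]))
pred_new
```

Because there is no nugget in this sketch, calling krige() at an observed site reproduces the observed value with a standard error of essentially zero.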
Fox, E.W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PloS one, 15(3), e0229509.
library(spmodel)

# Fit a spatial linear model, then predict at the sites in sulfate_preds
spmod <- splm(sulfate ~ 1,
  data = sulfate,
  spcov_type = "exponential", xcoord = x, ycoord = y
)
predict(spmod, sulfate_preds)
predict(spmod, sulfate_preds, interval = "prediction")
augment(spmod, newdata = sulfate_preds, interval = "prediction")
# Random forest spatial residual model
sulfate$var <- rnorm(NROW(sulfate)) # add noise variable
sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable
sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential")
predict(sprfmod, sulfate_preds)

# Spatial generalized linear model
spgmod <- spglm(presence ~ elev * strat,
  family = "binomial",
  data = moose,
  spcov_type = "exponential"
)
predict(spgmod, moose_preds)
predict(spgmod, moose_preds, interval = "prediction")
augment(spgmod, newdata = moose_preds, interval = "prediction")