Learn R Programming

rbart (version 1.0)

predict.rbart: Drawing Posterior Predictive Realizations for rbart models.

Description

The function predict.rbart() is the main function for drawing posterior predictive realizations at new inputs using a fitted model stored in a rbart object returned from rbart().

Usage

# S3 method for rbart
predict(
object,
x.test=object$x.train,
tc=1,
fmean=mean(object$y.train),
q.lower=0.025,
q.upper=0.975,...)

Arguments

object

Object of type rbart from a previous call to rbart()

x.test

New input settings in the form of an npred x p matrix at which to construct predictions. Defaults to the training inputs.

tc

Number of OpenMP threads to use for parallel computing.

fmean

Mean-centering vector for the training data. Defaults to the value used by rbart() when fitting the model. Usually should be left to the default.

q.lower

Lower quantile to return.

q.upper

Upper quantile to return.

...

not used.

Value

mdraws

Posterior realizations of the mean function, \(f({\bf x})\) stored in an ndpost x npred matrix, where ndpost is the number of kept MCMC draws in the rbart run.

sdraws

Posterior realizations of the standard deviation function, \(s({\bf x})\) stored in an ndpost x npred matrix, where ndpost is the number of kept MCMC draws in the rbart run.

mmean

Posterior predictive mean of \(f({\bf x})\).

smean

Posterior predictive mean of the standard deviation, \(s({\bf x})\).

msd

Posterior standard deviation of the mean, \(f({\bf x})\).

ssd

Posterior standard deviation of the standard devation, \(s({\bf x})\).

m.5

Posterior median of the mean function realizations, \(f({\bf x})\).

m.lower

Posterior q.lower quantile of the mean function realizations.

m.upper

Posterior q.upper quantile of the mean function realizations.

s.5

Posterior median of the standard deviation function realizations, \(s({\bf x})\).

s.lower

Posterior q.lower quantile of the standard deviation function realizations.

s.upper

Posterior q.upper quantile of the standard deviation function realizations.

q.lower

Lower quantile used in constructing the above.

q.upper

Upper quantile used in constructing the above.

Details

predict.rbart() is the main function for calculating posterior predictions and uncertainties once a model has been fit by rbart().

Returns an object of type rbart with the following entries.

References

Chipman, Hugh A., George, Edward I., and McCulloch, Robert E. (1998) Bayesian CART model search. Journal of the American Statistical Association, 93, 935--948.

Chipman, Hugh A., George, Edward I., and McCulloch, Robert E. (2010) BART: Bayesian additive regression trees. The Annals of Applied Statistics, 4, 266--298.

Pratola, Matthew T. (2016) Efficient Metropolis Hastings proposal mechanisms for Bayesian regression tree models. Bayesian analysis, 11, 885--911.

Pratola, Matthew T., Chipman, Hugh A., George, Edward I. and McCulloch, Robert E. (2017) Heteroscedastic BART Using Multiplicative Regression Trees. arXiv preprint, arXiv:1709.07542, 1--20.

See Also

rbart

Examples

Run this code
# NOT RUN {
##################################################
## This is just a stub (runs fast) example for testing.
##  For more realistic examples, please see:
##   (i) the vignette at www.rob-mcculloch.org
##   (ii) the example simulated data (see ?simdat)
##        and the longer run in ?rbartonsimd, 
##        where a saved run of rbart is run on simdat is plotted. 
##################################################

##simulate data
set.seed(99)

# train data
n=500 #train data sample size
p=1 #just one x
x = matrix(sort(runif(n*p)),ncol=p) #iid uniform x values
fx = 4*(x[,1]^2) #quadratric function f
sx = .2*exp(2*x[,1]) # exponential function s
y = fx + sx*rnorm(n) # y = f(x) + s(x) Z

#test data (the p added to the variable names is for predict)
np=500 #test data sample size
xp = matrix(sort(runif(np*p)),ncol=p)
fxp = 4*(xp[,1]^2)
sxp = .2*exp(2*xp[,1])
yp = fxp + sxp*rnorm(np)

##run rbart MCMC
# The number of interations is kept small to make example run,
##!!!!  REAL APPLICATIONS MAY NEED LONGER RUNS !!!!
#   nskip: burn in draws,
#   ndpost:kept draws,
#   nadapt: initial draws to tune MCMC,
#   numcut: number of cutpoints used for each x
#   k: bigger k gives smoother f (default is 2)
set.seed(19)
res = rbart(x,y,nskip=10,ndpost=20,nadapt=0,numcut=1000,k=5) #again, this is way too short a run!!!
## now predict to get inference
resp = predict(res,x.test=xp)

##check out of sample fit
cat("out of sample cor(f,fhat) is ",cor(fxp,resp$mmean),"\n")
cat("out of sample cor(s,shat) is ",cor(sxp,resp$smean),"\n")

##plot estimated vs. true
##plot the data
plot(xp,yp,cex.axis=1.5,cex.lab=1.5)
lines(xp,fxp,col="blue")
lines(xp,fx+2*sxp,col="blue",lty=2)
lines(xp,fxp-2*sxp,col="blue",lty=2)

## add the fit
lines(xp,resp$mmean) #estimate of f
lines(xp,resp$mmean+2*resp$smean) #estimate of sd
lines(xp,resp$mmean-2*resp$smean) #estimate of sd
# }

Run the code above in your browser using DataLab