PivotalR (version 0.1.18.5)

predict.dt.madlib: Compute the predictions of the model produced by madlib.rpart

Description

This is a wrapper for MADlib's decision tree predict function. It accepts the result of madlib.rpart, which is a representation of a decision tree, and computes the predictions for new data sets.

Usage

# S3 method for dt.madlib
predict(object, newdata, type = c("response", "prob"),
    ...)

Arguments

object

A dt.madlib object, which is the result of madlib.rpart.

newdata

A db.obj object, which contains the data used for prediction. If it is not given, the data set used to train the model is used.

type

A string, default is "response". For regression trees, this generates the fitted values. For classification trees, this generates the predicted class values. Classification trees also accept the extra option "prob", which computes the probability of each class.

...

Other arguments. Not implemented yet.

Value

A db.obj object, which wraps a table that contains the predicted values and also a valid ID column. For type='response', the predicted column holds the fitted values (regression tree) or the predicted classes (classification tree). For type='prob', there is one column for each class, which contains the probabilities for that class.
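The difference between the two values of type can be sketched as follows. This is a minimal illustration rather than part of the package: the helper name compare_prediction_types and its arguments cid and n are hypothetical, and it assumes an open connection created with db.connect and the abalone data set shipped with PivotalR. It only uses calls that appear in the Examples section plus lk, which previews the first rows of a db.obj in R.

```r
library(PivotalR)

# Hypothetical helper (not part of PivotalR): fit a classification tree and
# return a local preview of both prediction types.
# `cid` is assumed to be a connection ID returned by db.connect().
compare_prediction_types <- function(cid, n = 10) {
  # Copy the bundled abalone data into the database and mark the ID column
  x <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)
  key(x) <- "id"

  # Classification tree: the response is the logical expression rings < 10
  fit <- madlib.rpart(rings < 10 ~ length + diameter + height + whole + shell,
                      data = x, parms = list(split = "gini"),
                      control = list(cp = 0.005))

  # type = "response": ID column plus one column of predicted classes
  classes <- predict(fit, x, type = "response")

  # type = "prob": ID column plus one probability column per class
  probs <- predict(fit, x, type = "prob")

  # Both results are db.obj objects; lk() pulls the first n rows into R
  list(classes = lk(classes, n), probs = lk(probs, n))
}
```

Calling compare_prediction_types(cid) on a live connection should return two local data frames that can be compared side by side; the exact column names depend on the MADlib version and are not assumed here.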

References

[1] Documentation of decision tree in MADlib 1.6, https://madlib.apache.org/docs/latest/

See Also

madlib.lm, madlib.glm, madlib.rpart, madlib.summary, madlib.arima, madlib.elnet are all MADlib wrapper functions.

predict.lm.madlib, predict.logregr.madlib, predict.elnet.madlib, predict.arima.css.madlib are all predict functions related to MADlib wrapper functions.

Examples

# NOT RUN {
## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)

x <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)

key(x) <- "id"
fit <- madlib.rpart(rings < 10 ~ length + diameter + height + whole + shell,
       data=x, parms = list(split='gini'), control = list(cp=0.005))

## predicted classes for each row of x
predict(fit, x, type = "response")

db.disconnect(cid)
# }
