bigrfc(x, y, ntrees = 50L, varselect = NULL, varnlevels = NULL, nsplitvar = round(sqrt(ifelse(is.null(varselect), ncol(x), length(varselect)))), maxeslevels = 11L, nrandsplit = 1023L, maxndsize = 1L, yclasswts = NULL, printerrfreq = 10L, printclserr = TRUE, cachepath = tempdir(), trace = 0L)
x: A big.matrix, matrix or data.frame of predictor variables. If a matrix or data.frame is specified, it will be converted into a big.matrix for computation.

ntrees: The number of trees to grow. Default: 50.

varselect: The columns of x to use. If not specified, all variables will be used.

varnlevels: The number of levels of each variable in use. Used only when x does not contain levels information (i.e. x is a matrix or big.matrix). If x is a data.frame, varnlevels will be inferred from x. If x is not a data.frame and varnlevels is NULL, all variables will be treated as numeric. If all columns of x are used, varnlevels should have as many elements as there are columns of x. If varselect is specified, varnlevels and varselect should be of the same length.

nsplitvar: Default: if varselect is specified, the rounded square root of the number of variables specified; otherwise, the rounded square root of the number of columns of x.

yclasswts: Class weights, or NULL if all classes should be weighted equally.

printclserr: TRUE for error estimates for individual classes to be printed, in addition to the overall error estimates. Default: TRUE.

cachepath: The folder used for disk caching. If NULL, the big.matrix objects will be created in memory with no disk caching, which is suitable for small data sets. If caching is used, some of the cached files can be reused in other methods like varimp, shortening method initialization time. To reuse the cached files in this manner, use a folder other than tempdir(), as the operating system may automatically delete any cache files in tempdir(). Default: tempdir().

trace: 0 for no verbose output. 1 to print verbose output on growing of trees. 2 to print more verbose output on processing of individual nodes. Default: 0. Due to the way %dopar% handles the output of the tree-growing iterations, you may not see the verbose output in some GUIs like RStudio. For best results, run R from the command line in order to see all the verbose output.

Value: An object of class "bigcforest" containing the specified number of trees, which are objects of class "bigctree".
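The relationship between varselect, varnlevels and the default nsplitvar can be checked in base R before calling bigrfc. The sketch below is illustrative only: the data frame and its column names are hypothetical, and encoding numeric variables as 0 levels (which is what nlevels() returns for non-factors) is an assumption of this sketch, not something stated here. The nsplitvar expression is copied from the usage line.

```r
# Hypothetical stand-in for the predictors: two numeric columns, two factors.
x <- data.frame(
  price  = c(15.9, 33.9, 29.1, 37.7),
  mpg    = c(25, 18, 20, 19),
  origin = factor(c("USA", "non-USA", "USA", "non-USA")),
  drive  = factor(c("Front", "Rear", "4WD", "Front"))
)

# Use only columns 2 to 4.
varselect <- 2:4

# One entry per selected variable. nlevels() returns 0 for non-factors;
# treating 0 as "numeric" is an assumption of this sketch.
varnlevels <- vapply(x[varselect], nlevels, integer(1))

# Length rule: as many elements as selected variables,
# or as many as there are columns of x when varselect is NULL.
n_used <- if (is.null(varselect)) ncol(x) else length(varselect)
stopifnot(length(varnlevels) == n_used)

# Default nsplitvar, using the expression from the usage line.
nsplitvar <- round(sqrt(ifelse(is.null(varselect), ncol(x), length(varselect))))
```

With three selected variables, the default works out to round(sqrt(3)), i.e. 2 variables.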
Breiman, L. & Cutler, A. (n.d.). Random Forests. Retrieved from http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.
randomForest
cforest
# Classify cars in the Cars93 data set by type (Compact, Large,
# Midsize, Small, Sporty, or Van).
# Load the package and data.
library(bigrf)
data(Cars93, package="MASS")
x <- Cars93
y <- Cars93$Type
# Select variables with which to train model.
vars <- c(4:22)
# Run model, grow 30 trees.
forest <- bigrfc(x, y, ntrees=30L, varselect=vars, cachepath=NULL)
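If the cached files are meant to be reused later, a folder outside tempdir() can be passed as cachepath. The following is a minimal sketch under stated assumptions: the folder name bigrf_cache is hypothetical, and the model call is guarded so that it only runs when the bigrf and MASS packages are available.

```r
# Hypothetical cache location; any folder outside tempdir() would do.
cache_dir <- file.path(getwd(), "bigrf_cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Run the same model as in the example, but with disk caching enabled.
if (requireNamespace("bigrf", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  library(bigrf)
  data(Cars93, package = "MASS")
  forest <- bigrfc(Cars93, Cars93$Type, ntrees = 30L,
                   varselect = 4:22, cachepath = cache_dir)
}
```

Creating the folder up front keeps the choice of location explicit and separate from the model call.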