bigrf (version 0.1-12)

proximities-methods: Compute Proximity Matrix

Description

Compute the proximity matrix for a random forest, restricted to the nnearest most proximate examples for each training example.
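
To make the idea of proximity concrete, here is a minimal base R sketch of the usual definition from Breiman and Cutler's notes: the proximity of two examples is the fraction of trees in which they land in the same terminal node. It is not bigrf's implementation. The node assignments in nodes are made up, and the way the nnearest examples are picked (including each example itself) is an assumption for illustration only.

```r
# Hypothetical terminal-node assignments: rows are examples, columns are trees.
# Entry [i, k] is the ID of the terminal node that example i reaches in tree k.
nodes <- matrix(c(1L, 2L, 1L,
                  1L, 2L, 3L,
                  2L, 1L, 3L,
                  2L, 1L, 1L),
                nrow = 4, byrow = TRUE)

n <- nrow(nodes)
ntree <- ncol(nodes)

# For every pair of examples, count the trees in which they share a node,
# then divide by the number of trees.
prox <- matrix(0, n, n)
for (tree in seq_len(ntree)) {
  prox <- prox + outer(nodes[, tree], nodes[, tree], "==")
}
prox <- prox / ntree

# Assumed illustration of nnearest: for each example, keep the indices of
# the nnearest examples with the highest proximity (the example itself included).
nnearest <- 2L
nearest <- t(apply(prox, 1, function(p) {
  order(p, decreasing = TRUE)[seq_len(nnearest)]
}))

print(round(prox, 2))
print(nearest)
```

With a full n-by-n matrix the cost grows quadratically in the number of examples, which is why restricting each row to a small number of neighbours can help on large data sets.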

Usage

proximities(forest, nnearest=forest@nexamples, cachepath=tempdir(), trace=0L)

Arguments

forest
A random forest of class "bigcforest".
nnearest
The number of most proximate examples for which to compute proximity measures for each training example. Setting this to a smaller number will speed up computation of scaling co-ordinates. Default: forest@nexamples.
cachepath
Path to the folder where the proximity matrix can be stored. If NULL, the big.matrix objects are created in memory with no disk caching, which is suitable for small data sets. To reuse the cached files, choose a folder other than tempdir(), as the operating system may automatically delete any cache files in tempdir(). Default: tempdir().
trace
0 for no verbose output. 1 to print verbose output. 2 to print even more verbose output on processing of each tree and example. Default: 0.

Value

An object of class "bigrfprox" containing the proximity matrix.

Methods

signature(forest = "bigcforest")
Compute the proximity matrix for a classification random forest.

References

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.

Breiman, L. & Cutler, A. (n.d.). Random Forests. Retrieved from http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.

Examples

# Classify cars in the Cars93 data set by type (Compact, Large,
# Midsize, Small, Sporty, or Van).

# Load data.
data(Cars93, package="MASS")
x <- Cars93
y <- Cars93$Type

# Select variables with which to train model.
vars <- 4:22

# Run model, grow 30 trees.
forest <- bigrfc(x, y, ntree=30L, varselect=vars, cachepath=NULL)

# Calculate proximity matrix.
prox <- proximities(forest, cachepath=NULL)
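
The nnearest argument from the Usage section can be tried with a small variation on the example. This is a sketch, assuming the bigrf package is installed; the value 10 and the names n_nearest and prox_small are chosen here for illustration and do not come from the package documentation.

```r
# Number of most proximate examples to keep per training example
# (illustrative value, not a recommendation).
n_nearest <- 10L

# Run only if bigrf is available, since the package may not be installed.
if (requireNamespace("bigrf", quietly = TRUE)) {
  library(bigrf)

  # Same data and model as in the example above.
  data(Cars93, package = "MASS")
  x <- Cars93
  y <- Cars93$Type
  forest <- bigrfc(x, y, ntree = 30L, varselect = 4:22, cachepath = NULL)

  # Compute proximities for the n_nearest most proximate examples only,
  # keeping everything in memory and printing basic progress output.
  prox_small <- proximities(forest, nnearest = n_nearest,
                            cachepath = NULL, trace = 1L)
}
```

To keep the result between sessions, cachepath could instead point to a folder other than tempdir(), as described under Arguments.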
