expectation_convex: Convex expectation

Description

Generates an expectation hypervolume corresponding to the convex hull (polytope) that minimally encloses the data.

Usage

expectation_convex(input, point.density = NULL, num.samples = NULL,
                 num.points.on.hull = NULL, check.memory = TRUE,
                 verbose = TRUE, use.random = FALSE, method =
                 "hitandrun", chunksize = 1000)

Arguments

input

An m x n matrix or data frame, where m is the number of observations and n is the dimensionality.

point.density

The point density of the output expectation. If NULL, defaults to v / num.samples, where v is the volume of the hypersphere.

num.samples

The number of points in the output expectation. If NULL, defaults to 10^(3+sqrt(n)), where n is the dimensionality of the input. num.samples takes priority over point.density; the two cannot both be specified.

num.points.on.hull

Number of points of the input used to calculate the convex hull. Larger values are more accurate but may lead to slower runtimes. If NULL, defaults to using all of the data (most accurate).

check.memory

If TRUE, reports the expected number of convex hull simplices required for the calculation and stops before any further memory is allocated. Also warns if the dimensionality is high.

verbose

If TRUE, prints diagnostic progress messages.

use.random

If TRUE and the input is of class Hypervolume, sets boundaries based on the @RandomPoints slot; otherwise uses @Data.

method

One of "rejection" (rejection sampling) or "hitandrun" (adaptive hit-and-run Monte Carlo sampling).

chunksize

Number of random points to process per internal step. Larger values may have better performance on machines with large amounts of free memory. Changing this parameter does not change the output of the function; only how this output is internally assembled.
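The default for num.samples grows quickly with dimensionality. The following is a minimal sketch of the formula as stated above, not the package's internal code; default_num_samples is a hypothetical helper name, and n is assumed to be the number of columns of the input.

```r
# Hypothetical helper: the documented default 10^(3 + sqrt(n)),
# where n is assumed to be the number of columns of the input.
default_num_samples <- function(n) {
  10^(3 + sqrt(n))
}

# Tabulate the default sample count for inputs of 1 to 7 dimensions.
dims <- 1:7
defaults <- data.frame(n = dims, num.samples = default_num_samples(dims))
print(defaults)
```

Passing num.samples explicitly avoids this growth when a coarser expectation is acceptable.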

Value

A Hypervolume-class object corresponding to the expectation hypervolume.

Details

The rejection sampling algorithm generates random points within a hyperbox enclosing the data, then tests each point in turn for membership in the convex polytope using a dot product test. It becomes exponentially less efficient as dimensionality increases.

The hit-and-run sampling algorithm generates a Markov chain of samples that eventually converges to the true distribution of points within the convex polytope. It performs better in high dimensions but may converge slowly, and it is also slow when the number of simplices on the convex polytope is large.
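Both samplers can be illustrated in two dimensions with base R alone. This is a sketch under stated assumptions, not the package's implementation: the hull comes from grDevices::chull(), the dot product test is written as one half-plane inequality per hull edge, and the hit-and-run step is the plain (non-adaptive) variant. The names inside() and hit_and_run() are hypothetical.

```r
set.seed(1)
pts <- as.matrix(iris[, 1:2])            # two columns of the iris data

# Hull vertices and the edge from each vertex to the next one.
hull <- pts[chull(pts), , drop = FALSE]
k <- nrow(hull)
edge <- hull[c(2:k, 1), , drop = FALSE] - hull

# One half-plane per edge: normal . x <= offset for points inside.
normal <- cbind(-edge[, 2], edge[, 1])
offset <- rowSums(normal * hull)
centre <- colMeans(hull)                 # interior reference point
flip <- as.vector(normal %*% centre) > offset
normal[flip, ] <- -normal[flip, ]        # orient every normal outward
offset[flip] <- -offset[flip]

# Dot product test: TRUE for rows of x that satisfy every half-plane.
inside <- function(x, tol = 1e-9) {
  proj <- x %*% t(normal)                # rows: points, columns: edges
  rowSums(sweep(proj, 2, offset) > tol) == 0
}

# Rejection sampling: draw in the bounding box, keep points inside the hull.
n_try <- 5000
lo <- apply(pts, 2, min)
hi <- apply(pts, 2, max)
cand <- cbind(runif(n_try, lo[1], hi[1]), runif(n_try, lo[2], hi[2]))
keep <- inside(cand)
samples_rej <- cand[keep, , drop = FALSE]
acceptance <- mean(keep)                 # fraction of the box inside the hull

# Plain hit-and-run: pick a random direction, find the chord through the
# current point, then jump to a uniform position along that chord.
hit_and_run <- function(start, n_steps) {
  out <- matrix(NA_real_, n_steps, length(start))
  x <- start
  for (i in seq_len(n_steps)) {
    d <- rnorm(length(x))
    d <- d / sqrt(sum(d^2))
    num <- offset - as.vector(normal %*% x)  # slack to each edge
    den <- as.vector(normal %*% d)
    t_hi <- min((num / den)[den > 0])        # nearest edge ahead
    t_lo <- max((num / den)[den < 0])        # nearest edge behind
    x <- x + runif(1, t_lo, t_hi) * d
    out[i, ] <- x
  }
  out
}
samples_har <- hit_and_run(centre, 2000)
```

Every hit-and-run step yields a point inside the hull, whereas rejection sampling discards a share of its candidates; the price is that successive hit-and-run samples are correlated, which is why convergence can be slow.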

Both algorithms may become impractically slow at around 6 or 7 dimensions and above.
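For the rejection method, the slowdown can be made concrete with a simple stand-in shape. The sketch below computes the fraction of a hypercube occupied by its inscribed ball, which is the acceptance rate rejection sampling would achieve for that shape. The ball is an assumption chosen because its volume has a closed form; it is not what the function computes for a convex hull, and ball_fraction is a hypothetical name.

```r
# Volume of the n-ball of radius 1 divided by the volume of the
# enclosing cube of side 2: pi^(n/2) / gamma(n/2 + 1) / 2^n.
ball_fraction <- function(n) {
  pi^(n / 2) / gamma(n / 2 + 1) / 2^n
}

dims <- 1:10
fractions <- ball_fraction(dims)
print(data.frame(n = dims, accepted = signif(fractions, 3)))
```

The accepted fraction falls steadily with n, so ever more candidate points must be drawn per retained sample. Hit-and-run avoids this waste but, as noted above, slows down for a different reason: the number of simplices on the polytope.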

Examples


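# NOT RUN {
library(hypervolume)  # package that provides expectation_convex()
data(iris)
# Convex expectation built from the first three iris measurements
e_convex <- expectation_convex(iris[, 1:3], check.memory = FALSE)
# }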
