JOUSBoost (version 2.1.0)

friedman_data: Simulate data from the Friedman model

Description

Simulate draws from a Bernoulli distribution over c(-1,1), where the log-odds are defined by $$\log\frac{p(y=1|x)}{p(y=-1|x)} = \gamma (1 - x_1 + x_2 - \cdots + x_6)(x_1 + x_2 + \cdots + x_6)$$ with \(\gamma\) given by the gamma argument, and \(x\) is distributed as \(N(0, I_{d \times d})\). See Friedman et al. (2000).
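To make the model concrete, the following is a minimal base R sketch of how such draws could be generated. It is not the JOUSBoost source: the helper name simulate_friedman is hypothetical, the elided terms in the first factor are assumed to alternate in sign (1 - x_1 + x_2 - x_3 + x_4 - x_5 + x_6), and d is assumed to be at least 6 because the formula uses the first six coordinates.

```r
# Hypothetical helper (not part of JOUSBoost): simulate from the model
# described above using base R only.
simulate_friedman <- function(n = 500, d = 10, gamma = 10) {
  stopifnot(d >= 6)  # assumption: the formula needs x_1, ..., x_6

  # Predictors: n independent draws from N(0, I_d)
  X <- matrix(rnorm(n * d), nrow = n, ncol = d)
  X6 <- X[, 1:6, drop = FALSE]

  # First factor: 1 - x_1 + x_2 - x_3 + x_4 - x_5 + x_6 (assumed sign pattern)
  alternating <- 1 + as.vector(X6 %*% c(-1, 1, -1, 1, -1, 1))
  # Second factor: x_1 + x_2 + ... + x_6
  total <- rowSums(X6)

  log_odds <- gamma * alternating * total
  p <- plogis(log_odds)             # p(y = 1 | x), inverse logit of the log-odds
  y <- ifelse(runif(n) < p, 1, -1)  # Bernoulli draws coded as -1 / 1

  list(y = y, X = X, p = p)
}

set.seed(111)
sim <- simulate_friedman(n = 500, gamma = 0.5)
table(sim$y)
```

The returned list mirrors the y, X and p components documented under Value, so the sketch can stand in for friedman_data when experimenting with the model itself.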

Usage

friedman_data(n = 500, d = 10, gamma = 10)

Arguments

n

Number of points to simulate.

d

The dimension of the predictor variable \(x\).

gamma

A parameter controlling the Bayes error, with higher values of gamma corresponding to lower error rates.
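The effect of gamma on the Bayes error can be checked numerically. The sketch below is an illustration under stated assumptions rather than package code: the helper bayes_error is hypothetical, it assumes the alternating sign pattern 1 - x_1 + x_2 - x_3 + x_4 - x_5 + x_6 in the first factor, and it estimates the Bayes error as the Monte Carlo average of min(p, 1 - p) over simulated predictors.

```r
# Hypothetical helper (not part of JOUSBoost): Monte Carlo estimate of the
# Bayes error, E[min(p, 1 - p)], for a given gamma and predictor matrix X.
bayes_error <- function(gamma, X) {
  X6 <- X[, 1:6, drop = FALSE]
  alternating <- 1 + as.vector(X6 %*% c(-1, 1, -1, 1, -1, 1))
  log_odds <- gamma * alternating * rowSums(X6)
  p <- plogis(log_odds)   # p(y = 1 | x)
  mean(pmin(p, 1 - p))    # error rate of the Bayes-optimal classifier
}

set.seed(1)
X <- matrix(rnorm(10000 * 10), nrow = 10000, ncol = 10)

# Evaluate several gamma values on the same predictors
gammas <- c(0.5, 2, 10)
errs <- sapply(gammas, bayes_error, X = X)
names(errs) <- gammas
print(round(errs, 3))
```

Because a larger gamma scales up the magnitude of the log-odds at every point, min(p, 1 - p) shrinks pointwise, so the estimates decrease as gamma grows.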

Value

Returns a list with the following components:

y

Vector of simulated responses, each taking a value in c(-1,1).

X

An \(n \times d\) matrix of simulated predictors.

p

The true conditional probability \(p(y=1|x)\).

References

Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting (with discussion), Annals of Statistics 28: 337-407.

Examples

library(JOUSBoost)

# Simulate 500 points with the default d = 10 and a small gamma (higher Bayes error)
set.seed(111)
dat = friedman_data(n = 500, gamma = 0.5)