CORElearn (version 1.51.2)

regDataGen: Artificial data for testing regression algorithms

Description

The generator produces regression data with 4 discrete and 7 numeric attributes.

Usage

regDataGen(noInst, t1=0.8, t2=0.5, noise=0.1)

Arguments

noInst

Number of instances to generate.

t1, t2

Parameters controlling the shape of the distribution.

noise

Parameter controlling the amount of noise. With noise=0 there is no noise; with noise=1 the signal and the noise have the same level.

Value

Returns a data.frame with noInst rows and 11 columns. The ranges of values of the attributes and the response are:

a1

0,1

a2

a,b,c,d

a3

0,1 (irrelevant)

a4

a,b,c,d (irrelevant)

x1

numeric (gaussian with different sd for each class)

x2

numeric (gaussian with different sd for each class)

x3

numeric (gaussian, irrelevant)

x4

numeric from [0,1]

x5

numeric from [0,1]

x6

numeric from [0,1]

response

numeric
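
The column layout listed above can be checked directly on a generated sample. The following is a minimal sketch, assuming the CORElearn package is installed; the guard skips the call otherwise, and the sample size of 50 is an arbitrary choice for illustration.

```r
# Inspect the structure of a small generated data set.
# Assumption: CORElearn is installed; the block does nothing if it is not.
if (requireNamespace("CORElearn", quietly = TRUE)) {
  regData <- CORElearn::regDataGen(noInst = 50)

  print(dim(regData))        # rows and columns of the data.frame
  str(regData)               # column names and types
  print(table(regData$a2))   # level counts of one discrete attribute
}
```

str() is a quick way to confirm which columns are factors and which are numeric before passing the data to a model.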

Details

The response variable is derived from x4, x5, x6 using two different functions. The choice depends on a hidden variable, which determines whether the response value follows a linear dependency \(f=x_4-2x_5+3x_6\) or a nonlinear one \(f=\cos(4\pi x_4)(2x_5-3x_6)\).
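
The two response functions can be written out in base R to see how differently they behave on the same inputs. This is only an illustrative sketch: response_sketch and its use_linear argument are hypothetical names, and the way regDataGen draws its hidden variable is not described here, so the indicator is supplied by hand.

```r
# The two response functions quoted in the text.
f_linear    <- function(x4, x5, x6) x4 - 2 * x5 + 3 * x6
f_nonlinear <- function(x4, x5, x6) cos(4 * pi * x4) * (2 * x5 - 3 * x6)

# Hypothetical helper: choose a function per row from a logical indicator
# that stands in for the hidden variable.
response_sketch <- function(x4, x5, x6, use_linear) {
  ifelse(use_linear, f_linear(x4, x5, x6), f_nonlinear(x4, x5, x6))
}

# Same inputs, different hidden state.
x4 <- c(0.5, 0.5)
x5 <- c(0.25, 0.25)
x6 <- c(0.75, 0.75)
res <- response_sketch(x4, x5, x6, use_linear = c(TRUE, FALSE))
print(res)
```

For these inputs the linear branch gives 2.25 and the nonlinear branch gives -1.75, so a model that ignores the hidden variable has to fit two quite different surfaces at once.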

Attributes a1, a2, x1, x2 carry some information about the hidden variable, depending on parameters t1 and t2. At the extreme values t1=0.5 and t2=1 they carry no information. Conversely, if t1=0 or t1=1, each of the attributes a1, a2 carries full information, and if t2=0, each of x1, x2 carries full information about the hidden variable.

The attributes x4, x5, x6 are available with a noise level depending on parameter noise. If noise=0, there is no noise. If noise=1, then the level of the signal and noise are the same.
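
To get a feel for the parameters, one can generate an easy and a hard variant of the data and compare summaries. The sketch below assumes CORElearn is installed and uses only the arguments shown under Usage; the variable names and the specific parameter values are illustrative choices, not recommendations from the package authors.

```r
# Compare two parameter settings of the generator.
# Assumption: CORElearn is installed; the block does nothing if it is not.
if (requireNamespace("CORElearn", quietly = TRUE)) {
  # Informative a1, a2, x1, x2 and noise-free x4, x5, x6.
  easyData <- CORElearn::regDataGen(noInst = 100, t1 = 1, t2 = 0, noise = 0)

  # Uninformative a1, a2, x1, x2 and noise at the level of the signal.
  hardData <- CORElearn::regDataGen(noInst = 100, t1 = 0.5, t2 = 1, noise = 1)

  print(summary(easyData[, c("x4", "x5", "x6")]))
  print(summary(hardData[, c("x4", "x5", "x6")]))
}
```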

See Also

classDataGen, ordDataGen, CoreModel

Examples

library(CORElearn)

# prepare a regression data set
regData <- regDataGen(noInst=200)

# build a regression tree similar to CART
modelRT <- CoreModel(response ~ ., regData, model="regTree", modelTypeReg=1)
print(modelRT)

destroyModels(modelRT) # clean up