hyperSMURF (version 2.0)

imbalanced.data.generator: Synthetic imbalanced data generator

Description

Generates a synthetic data set with a configurable number of minority and majority class examples. All features of the majority class follow a Gaussian distribution with mean=0 and sd=1. Of the n.features features, n.inf.features are informative: for the minority class they follow a Gaussian distribution centered at 1 with standard deviation sd.
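The sampling scheme described above can be sketched in base R. This is an illustrative re-implementation under stated assumptions, not the hyperSMURF source: the name sketch.imbalanced.data is hypothetical, and the choices that the informative features occupy the first columns, that the remaining minority features are drawn from N(0, 1), that minority rows come first, and that a seed of 0 simply leaves the random number generator untouched are all assumptions made for the example.

```r
# Hypothetical sketch for illustration only; not the hyperSMURF implementation.
sketch.imbalanced.data <- function(n.pos = 100, n.neg = 2000,
                                   n.features = 10, n.inf.features = 2,
                                   sd = 1, seed = 0) {
  stopifnot(n.inf.features <= n.features)

  # Assumption: a nonzero seed makes the draw reproducible;
  # a seed of 0 leaves the current RNG state as it is.
  if (seed != 0) set.seed(seed)

  # Majority class: every feature is drawn from N(0, 1).
  neg <- matrix(rnorm(n.neg * n.features), nrow = n.neg, ncol = n.features)

  # Minority class: start from N(0, 1) on every feature (assumption) ...
  pos <- matrix(rnorm(n.pos * n.features), nrow = n.pos, ncol = n.features)

  # ... then redraw the informative features from N(1, sd).
  # Assumption: the informative features are the first n.inf.features columns.
  if (n.inf.features > 0) {
    pos[, seq_len(n.inf.features)] <-
      rnorm(n.pos * n.inf.features, mean = 1, sd = sd)
  }

  # Assumption: minority rows are stacked before majority rows.
  list(data = rbind(pos, neg),
       labels = factor(c(rep(1, n.pos), rep(0, n.neg))))
}

d <- sketch.imbalanced.data(n.pos = 10, n.neg = 200, n.features = 6,
                            n.inf.features = 2, sd = 1, seed = 1)
dim(d$data)      # number of rows and columns of the generated matrix
table(d$labels)  # number of examples per class
```

With a small sd the two classes separate cleanly on the informative features; with a larger sd the minority distribution widens and overlaps more with the majority class, which makes the classification problem harder.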

Usage

imbalanced.data.generator(n.pos=100, n.neg=2000, 
   n.features=10, n.inf.features=2, sd=1, seed=0)

Arguments

n.pos

number of positive (minority class) examples (def. 100)

n.neg

number of negative (majority class) examples (def. 2000)

n.features

total number of features (def. 10)

n.inf.features

number of informative features (def. 2)

sd

standard deviation of the informative features (def. 1)

seed

initialization seed for the random number generator. If 0 (def.), the current clock time is used.

Value

A list with two elements:

data

the matrix of synthetic data, with n.pos+n.neg rows and n.features columns

labels

a factor with the labels of the examples: 1 for the minority class and 0 for the majority class

Examples

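library(hyperSMURF)
# Generate 10 minority and 200 majority examples with 6 features, 2 of them informative
d <- imbalanced.data.generator(n.pos=10, n.neg=200, n.features=6, n.inf.features=2, sd=1)
dim(d$data)      # rows = n.pos + n.neg, columns = n.features
table(d$labels)  # number of examples per class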
