semiArtificial (version 2.4.1)

semiArtificial-package: Generation and evaluation of semi-artificial data

Description

The package semiArtificial contains methods to generate and evaluate semi-artificial data sets. Different data generators take a data set as input, learn its properties using machine learning algorithms, and generate new data with the same properties.

Details

The package currently includes the following data generators:

  • an RBF network-based generator using the rbfDDA model from the RSNNS package,

  • a generator using a density tree forest for unsupervised data,

  • a generator using a random forest for classification and regression.
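
The generators listed above can be used roughly as follows. This is a minimal sketch, not an official example: it assumes the semiArtificial package and its dependencies are installed, and it uses the built-in iris data set, an arbitrary seed, a 50/50 split, 10 trees, and 200 generated instances purely for illustration. The names irisTrain, irisTest, irisGenerator, and irisNew are illustrative.

```r
# Sketch only: assumes semiArtificial (and its dependencies) is installed.
library(semiArtificial)

set.seed(12345)  # arbitrary seed, for reproducibility

# Split iris into a part to learn from and a held-out part.
trainIdx  <- sample(seq_len(nrow(iris)), size = nrow(iris) / 2)
irisTrain <- iris[trainIdx, ]
irisTest  <- iris[-trainIdx, ]

# Learn a tree ensemble generator for the classification problem Species ~ .
irisGenerator <- treeEnsemble(Species ~ ., irisTrain, noTrees = 10)

# Draw 200 semi-artificial instances from the learned generator.
irisNew <- newdata(irisGenerator, size = 200)
head(irisNew)
```

An RBF-based generator could presumably be substituted by calling rbfDataGen(Species ~ ., irisTrain) in place of treeEnsemble; that variant additionally relies on the RSNNS package.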

Data evaluation support tools include:

  • statistical evaluation: mean, median, standard deviation, skewness, kurtosis, medcouple, L/RMC,

  • evaluation based on clustering using Adjusted Rand Index (ARI) and Fowlkes-Mallows index (FM),

  • evaluation based on prediction with a model, e.g., random forests.
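
The three kinds of evaluation listed above might be applied to generated data as sketched below. This is a hedged illustration assuming the semiArtificial package and its dependencies are installed; the iris data, the seed, the split, and all variable names are illustrative choices. The results are only inspected with str(), since the exact structure of the returned objects is best checked in the help pages of the individual functions.

```r
# Sketch only: assumes semiArtificial (and its dependencies) is installed.
library(semiArtificial)

set.seed(12345)  # arbitrary seed, for reproducibility

# Learn a generator on one half of iris and generate new data.
trainIdx  <- sample(seq_len(nrow(iris)), size = nrow(iris) / 2)
irisTrain <- iris[trainIdx, ]
irisTest  <- iris[-trainIdx, ]
irisGenerator <- treeEnsemble(Species ~ ., irisTrain, noTrees = 10)
irisNew <- newdata(irisGenerator, size = 200)

# Statistical evaluation: compare attribute statistics of the two data sets.
sim <- dataSimilarity(irisTest, irisNew)
str(sim)

# Clustering-based evaluation: compare clusterings of the two data sets.
clust <- dsClustCompare(irisTest, irisNew)
str(clust)

# Prediction-based evaluation: compare models built on original and new data.
perf <- performanceCompare(irisTest, irisNew, Species ~ .)
str(perf)
```

Comparing the generated data against the held-out half (irisTest) rather than the half the generator was learned from is a design choice of this sketch; it avoids judging the generator on the instances it was fitted to.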

Further software and development versions are available at http://lkm.fri.uni-lj.si/rmarko/software/.

References

Marko Robnik-Sikonja: Not enough data? Generate it! Technical Report, University of Ljubljana, Faculty of Computer and Information Science, 2014.

Other references are available from http://lkm.fri.uni-lj.si/rmarko/papers/

See Also

rbfDataGen, treeEnsemble, newdata, dataSimilarity, dsClustCompare, performanceCompare.