The package semiArtificial contains methods to generate and evaluate semi-artificial data sets. The data generators take a data set as input, learn its properties using machine learning algorithms, and generate new data with the same properties.
The package currently includes the following data generators:
an RBF network based generator using the rbfDDA model from the RSNNS package,
a generator using a density tree forest for unsupervised data,
a generator using a random forest for classification and regression.
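As a minimal sketch of how the generators listed above might be used, the following builds both an RBF based and a tree ensemble based generator on the built-in iris data and draws new instances from each. It assumes that semiArtificial and its dependencies (including RSNNS) are installed; the seed, noTrees = 10, and size = 200 are illustrative choices, not recommended settings.

```r
library(semiArtificial)

set.seed(1)  # illustrative seed, only to make the sketch repeatable

# RBF network based generator (uses the rbfDDA model from RSNNS)
irisRbf <- rbfDataGen(Species ~ ., iris)
irisNewRbf <- newdata(irisRbf, size = 200)

# Tree ensemble based generator; a small forest keeps the sketch fast
irisTree <- treeEnsemble(Species ~ ., iris, noTrees = 10)
irisNewTree <- newdata(irisTree, size = 200)

# Inspect the first few generated instances
head(irisNewRbf)
head(irisNewTree)
```

The same newdata call is used for both generator types, so the choice of generator can be changed without touching the code that consumes the generated data.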
Data evaluation support tools include:
statistical evaluation: mean, median, standard deviation, skewness, kurtosis, medcouple, L/RMC,
evaluation based on clustering using Adjusted Rand Index (ARI) and Fowlkes-Mallows index (FM),
evaluation based on prediction with a model, e.g., random forests.
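The three kinds of evaluation listed above can be sketched as follows, comparing the original iris data with data generated from it. This is an illustration under stated assumptions: the package and its dependencies are installed, the functions are called with their default settings, and the seed and sample size are arbitrary.

```r
library(semiArtificial)

set.seed(1)  # illustrative seed

# Generate semi-artificial data to compare against the original
irisGenerator <- rbfDataGen(Species ~ ., iris)
irisNew <- newdata(irisGenerator, size = 200)

# Statistical evaluation of attribute distributions
statSim <- dataSimilarity(iris, irisNew)

# Evaluation based on clustering (e.g., ARI and FM)
clustSim <- dsClustCompare(iris, irisNew)

# Evaluation based on prediction with a model
perfSim <- performanceCompare(iris, irisNew, Species ~ .)

str(statSim)
str(clustSim)
str(perfSim)
```

Using str to inspect the results avoids relying on any particular component names in the returned objects.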
Further software and development versions are available at http://lkm.fri.uni-lj.si/rmarko/software/.
Marko Robnik-Sikonja: Not enough data? Generate it! Technical Report, University of Ljubljana, Faculty of Computer and Information Science, 2014.
Other references are available from http://lkm.fri.uni-lj.si/rmarko/papers/
rbfDataGen, treeEnsemble, newdata, dataSimilarity, dsClustCompare, performanceCompare.