semiArtificial (version 2.4.1)

Generator of Semi-Artificial Data

Description

Contains methods to generate and evaluate semi-artificial data sets. Based on a given data set, different methods learn data properties using machine learning algorithms and generate new data with the same properties.

The package currently includes the following data generators:
i) an RBF network based generator using rbfDDA() from package 'RSNNS',
ii) a Random Forest based generator for both classification and regression problems,
iii) a density forest based generator for unsupervised data.

Data evaluation support tools include:
a) single attribute based statistical evaluation: mean, median, standard deviation, skewness, kurtosis, medcouple, L/RMC, KS test, Hellinger distance,
b) evaluation based on clustering using Adjusted Rand Index (ARI) and FM,
c) evaluation based on classification performance with various learning models, e.g., random forests.
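
To make one of the single-attribute measures listed above concrete, here is a minimal base R sketch of a Hellinger distance between two numeric samples. It is an illustration only and not necessarily how the package computes it: the helper name `hellinger`, the `bins` parameter, and the choice of equal-width shared bins are all assumptions made for this example.

```r
# Hypothetical helper (not part of semiArtificial): Hellinger distance between
# two numeric samples, estimated on a shared set of equal-width bins.
# Assumes the pooled sample contains at least two distinct values.
hellinger <- function(x, y, bins = 10) {
  # Common break points so both samples are binned identically
  breaks <- seq(min(x, y), max(x, y), length.out = bins + 1)

  # Relative frequency of each bin; all.inside keeps the maximum in the last bin
  bin_freq <- function(v) {
    idx <- findInterval(v, breaks, rightmost.closed = TRUE, all.inside = TRUE)
    tabulate(idx, nbins = bins) / length(v)
  }
  p <- bin_freq(x)
  q <- bin_freq(y)

  # H(p, q) = sqrt( 1/2 * sum( (sqrt(p_i) - sqrt(q_i))^2 ) ), in [0, 1]
  sqrt(sum((sqrt(p) - sqrt(q))^2) / 2)
}

# Usage: compare one attribute of two halves of the iris data
set.seed(1)
half <- sample(seq_len(nrow(iris)), size = nrow(iris) / 2)
hellinger(iris$Sepal.Length[half], iris$Sepal.Length[-half])
```

A value of 0 means the two binned distributions coincide, and 1 means they share no bins. The result depends on the number of bins, so values are only comparable when the binning is the same.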

Install

install.packages('semiArtificial')

Monthly Downloads

297

Version

2.4.1

License

GPL-3

Last Published

September 23rd, 2021

Functions in semiArtificial (2.4.1)

newdata

Generate semi-artificial data using a generator

treeEnsemble

A data generator based on forest

rbfDataGen

A data generator based on RBF network

semiArtificial-package

Generation and evaluation of semi-artificial data

dataSimilarity

Evaluate statistical similarity of two data sets

dsClustCompare

Evaluate clustering similarity of two data sets

performanceCompare

Evaluate similarity of two data sets based on predictive performance

cleanData

Rejection of new instances based on their distance to existing instances
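
The functions listed above are typically combined into a generate-then-evaluate workflow. The following is a rough sketch of that workflow on the built-in iris data, assuming the package is installed. The argument values (`noTrees = 10`, `size = 200`) and the 50/50 split are illustrative choices, not recommendations, and the whole block is wrapped in a `requireNamespace()` guard so that it does nothing when the package is missing.

```r
# Sketch only: runs the workflow if semiArtificial is installed, otherwise skips it.
if (requireNamespace("semiArtificial", quietly = TRUE)) {
  library(semiArtificial)

  # Split iris into a training half (to fit the generator) and a held-out half
  set.seed(12345)
  train <- sample(seq_len(nrow(iris)), size = nrow(iris) / 2)
  irisTrain <- iris[train, ]
  irisTest <- iris[-train, ]

  # Fit a forest based generator for the classification problem Species ~ .
  irisGenerator <- treeEnsemble(Species ~ ., irisTrain, noTrees = 10)

  # Draw 200 semi-artificial instances from the fitted generator
  irisNew <- newdata(irisGenerator, size = 200)

  # Compare per-attribute statistics of held-out and generated data
  print(dataSimilarity(irisTest, irisNew))
}
```

Swapping `treeEnsemble()` for `rbfDataGen()` should give the RBF network based variant of the same workflow, and `dsClustCompare()` or `performanceCompare()` can be used in place of `dataSimilarity()` for the clustering based and performance based evaluations. Consult each function's help page for the exact arguments.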