sdcMicro (version 5.7.8)

dataGen: Fast generation of synthetic data

Description

Fast generation of (primitive) synthetic multivariate normal data.

Usage

dataGen(obj, ...)

Value

the generated synthetic data.

Arguments

obj

an object of class sdcMicroObj or a data.frame

...

see possible arguments below

n:

number of observations in the generated data; defaults to 200

use:

how to compute covariances when the data contain missing values; see the argument use of cov. The default is 'everything'; the other possible choices are 'all.obs', 'complete.obs', 'na.or.complete' and 'pairwise.complete.obs'.
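
The effect of the use choices can be previewed with base R's cov alone. The sketch below is illustrative only: it introduces one missing value into a copy of mtcars (an assumption made for demonstration, not part of the package examples) and compares three of the options. It does not call dataGen itself.

```r
# Illustration using base R only; the NA is introduced artificially.
x <- mtcars[, 4:6]          # columns hp, drat, wt
x[1, "hp"] <- NA            # make one value missing

# 'everything': any covariance involving a column with an NA is NA
cov_everything <- cov(x, use = "everything")

# 'complete.obs': rows containing any NA are dropped before computing
cov_complete <- cov(x, use = "complete.obs")

# 'pairwise.complete.obs': each pair of columns uses all rows that are
# complete for that pair
cov_pairwise <- cov(x, use = "pairwise.complete.obs")

print(cov_everything)
print(cov_complete)
print(cov_pairwise)
```

With 'everything', the NA propagates into every entry that involves hp, so a covariance matrix computed this way may not be usable for a Cholesky decomposition when the input has missing values.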

Author

Matthias Templ

Details

Uses the Cholesky decomposition to generate synthetic data with approximately the same means and covariances as the original data. For details, see the reference below.
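
The general idea of Cholesky-based generation can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name gen_cholesky is hypothetical, and the exact algorithm used by dataGen (described in the reference) may differ in its details.

```r
# Hypothetical helper, NOT part of sdcMicro: draws n multivariate normal
# rows whose population mean and covariance equal the sample estimates of x.
gen_cholesky <- function(x, n = 200, use = "everything") {
  x <- as.matrix(x)
  mu <- colMeans(x, na.rm = TRUE)   # column means of the original data
  sigma <- cov(x, use = use)        # sample covariance matrix

  # chol() returns the upper-triangular factor U with t(U) %*% U == sigma
  U <- chol(sigma)

  # independent standard normal draws, one row per synthetic observation
  z <- matrix(rnorm(n * ncol(x)), nrow = n)

  # z %*% U has covariance t(U) %*% U = sigma; then shift each column by mu
  synth <- sweep(z %*% U, 2, mu, "+")
  colnames(synth) <- colnames(x)
  as.data.frame(synth)
}

set.seed(1)
synth <- gen_cholesky(mtcars[, 4:6], n = 200)

# The two covariance matrices should be similar, but not identical,
# because synth is a random sample
print(cov(mtcars[, 4:6]))
print(cov(synth))
```

Because the rows are random draws, the means and covariances of the output match those of the input only approximately, which is consistent with the "approximately the same" wording above.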

References

Mateo-Sanz, Martinez-Balleste, Domingo-Ferrer. Fast Generation of Accurate Synthetic Microdata. International Workshop on Privacy in Statistical Databases PSD 2004: Privacy in Statistical Databases, pp 298-306.

See Also

sdcMicroObj-class, shuffle

Examples

data(mtcars)
# \donttest{
cov(mtcars[,4:6])
cov(dataGen(mtcars[,4:6]))
pairs(mtcars[,4:6])
pairs(dataGen(mtcars[,4:6]))

## for objects of class sdcMicroObj:
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'),
  numVars=c('expend','income','savings'), w='sampling_weight')
sdc <- dataGen(sdc)
# }