synthpop (version 1.9-0)

syn.satcat: Synthesis from a saturated model based on all combinations of the predictor variables.

Description

Synthesises one variable (y) from all possible combinations of its predictors (x). A bootstrap sample is drawn from the original values of y within each unique combination of xp (the synthesised values of the grouping variables). Note that only combinations of predictor variable levels that appear in the original data can appear in the synthetic data. The related method syn.catall overcomes this by adding a small prior probability to all zero cells in the cross-tabulation of the original data that are not structural zeros, but it has the limitation of requiring a complete cross-tabulation of all the variables.
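The within-cell bootstrap described above can be illustrated with a minimal base R sketch. It is not the synthpop source code; the toy data and the names cell_x, cell_xp, pools and y_syn are assumptions made for illustration only.

```r
# Illustrative base R sketch only; not the synthpop implementation.
set.seed(1)

# Toy original data: two categorical predictors and one outcome.
x <- data.frame(
  sex   = c("F", "F", "F", "M", "M", "M"),
  agegr = c("young", "young", "old", "young", "old", "old")
)
y <- c("a", "b", "c", "d", "e", "e")

# Toy synthetic predictors; every row is a combination that occurs in x.
xp <- data.frame(
  sex   = c("F", "M", "M", "F"),
  agegr = c("young", "old", "young", "old")
)

# Label each row by its combination of predictor levels, e.g. "F|young".
cell_x  <- do.call(paste, c(unname(as.list(x)),  sep = "|"))
cell_xp <- do.call(paste, c(unname(as.list(xp)), sep = "|"))

# Pool of original y values observed in each cell.
pools <- split(y, cell_x)

# For each synthetic row, draw one y value at random from the matching pool.
y_syn <- vapply(cell_xp, function(cell) {
  pool <- pools[[cell]]
  pool[sample.int(length(pool), 1)]
}, character(1), USE.NAMES = FALSE)

y_syn
```

Cells whose pool holds a single distinct value always return that value, so only cells with mixed y values contribute randomness in this sketch.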

Usage

syn.satcat(y, x, xp, proper = FALSE, ...)

Value

A list with two components:

res

a data frame of dimension k x p containing the synthesised data.

fit

the cross-tabulation of the original predictor variables.

Arguments

y

an original data vector of length n for the satcat variable.

x

a matrix (n x p) with the original predictor variables for y.

xp

a matrix (k x p) with synthetic values of x.

proper

if proper = TRUE, x and y are replaced with a bootstrap sample before synthesis, thus effectively sampling from the posterior distribution of the model given the data.

...

additional parameters.
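As a rough illustration of the proper = TRUE behaviour described for that argument, the sketch below resamples the rows of x and y jointly, with replacement, before any synthesis step. The toy data and the names idx, x_boot and y_boot are hypothetical, and the internal details of synthpop may differ.

```r
# Illustrative sketch of a joint bootstrap of x and y; not synthpop internals.
set.seed(2)

x <- data.frame(
  sex   = c("F", "F", "M", "M"),
  agegr = c("young", "old", "young", "old")
)
y <- c("a", "b", "c", "d")
n <- length(y)

# Draw n row indices with replacement and apply them to both x and y,
# so each resampled y value stays paired with its own predictor row.
idx    <- sample.int(n, n, replace = TRUE)
x_boot <- x[idx, , drop = FALSE]
y_boot <- y[idx]
```

Using one shared index vector is what keeps the predictor rows and outcome values aligned after resampling.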

Details

The variables in x are intended to be categorical (factor) variables. If y is also a categorical variable, syn.satcat gives the same results as fitting a saturated polychotomous regression model, but is usually much faster. syn.satcat fails with an error message if previous syntheses have generated a combination of variables in xp that was not present in x. Using the syn.catall method for grouped variables can overcome this.
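To see in advance whether this failure could arise, one option is to compare the predictor combinations in xp with those in x. The helper below, unseen_combinations, is hypothetical and not part of synthpop; it assumes x and xp have the same columns in the same order.

```r
# Hypothetical helper (not part of synthpop): return the distinct predictor
# combinations that occur in xp but never in x.
unseen_combinations <- function(x, xp) {
  x  <- as.data.frame(x)
  xp <- as.data.frame(xp)
  key_x  <- do.call(paste, c(unname(as.list(x)),  sep = "|"))
  key_xp <- do.call(paste, c(unname(as.list(xp)), sep = "|"))
  unique(xp[!(key_xp %in% key_x), , drop = FALSE])
}

# Toy data: the combination (M, young) is absent from the original x.
x <- data.frame(
  sex   = c("F", "F", "M"),
  agegr = c("young", "old", "old")
)
xp <- data.frame(
  sex   = c("F", "M", "M"),
  agegr = c("young", "young", "young")
)

missing_cells <- unseen_combinations(x, xp)
missing_cells
```

An empty result means every combination in xp has at least one matching row in x, which is the condition the Details text requires.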

Examples

library(synthpop)

# Keep four categorical variables from the SD2011 data set
ods <- SD2011[, c("region", "sex", "agegr", "placesize")]

# Request the satcat method throughout
s1 <- syn(ods, method = "satcat", seed = 7856)
# Mix methods, using satcat for the third variable only
s2 <- syn(ods, method = c("sample", "cart", "satcat", "cart"), seed = 7856)

if (FALSE) {
  ### mostly fails because previous synthesis has produced
  ### combinations not found in the original data
  s3 <- syn(ods, method = c("sample", "cart", "cart", "satcat"), seed = 7856)
}
