synthpop (version 1.9-0)

replicated.uniques: Replications in synthetic data

Description

Determines which units have a combination of values for the variables in keys that falls into each of the following categories:

1) unique in original data

2) unique in the synthetic data set(s)

3) unique in synthetic data and present, but not necessarily unique, in original

4) unique in synthetic and unique in original.

For each of 3) and 4), results are returned that identify the rows in the synthetic data containing that type of unique. This function is called by sdc, where there are options to include each type of unique.
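
The categories above can be illustrated with a small base R sketch. It is not synthpop's implementation: the data frames orig and synth, the helpers key_of() and is_unique(), and the result names type3 and type4 are all hypothetical and exist only for this illustration.

```r
# Toy data (made up for illustration; not synthpop output).
orig  <- data.frame(sex = c("F", "F", "M", "M", "M"),
                    age = c(30, 30, 41, 52, 52))
synth <- data.frame(sex = c("F", "M", "M", "F", "M"),
                    age = c(30, 41, 52, 65, 52))
keys <- c("sex", "age")

# Collapse the key variables into one string per row.
# Assumes the separator "|" does not occur in the key values.
key_of <- function(df, keys) do.call(paste, c(df[keys], sep = "|"))

# TRUE for rows whose key occurs exactly once in the vector.
is_unique <- function(k) !(duplicated(k) | duplicated(k, fromLast = TRUE))

ko <- key_of(orig, keys)
ks <- key_of(synth, keys)

orig_unique_keys <- ko[is_unique(ko)]   # category 1: unique in original
synth_unique     <- is_unique(ks)       # category 2: unique in synthetic

# Category 3: unique in synthetic and present anywhere in the original.
type3 <- synth_unique & ks %in% ko
# Category 4: unique in synthetic and also unique in the original.
type4 <- synth_unique & ks %in% orig_unique_keys

which(type3)  # synthetic rows in category 3
which(type4)  # synthetic rows in category 4
```

In this toy example the logical vectors type3 and type4 have one entry per synthetic row, which is the same shape as the row indicators described under Value.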

Usage

replicated.uniques(object, data, keys = names(data))

# S3 method for repuniq.synds
print(x, ...)

Value

A list of class "repuniq.synds" with the following components:

m

number of synthetic data sets in object (object$m)

n

number of rows in data (object$n)

k

number of rows in the synthetic data set(s) in object (object$k)

res_tab

Table or list of tables with numbers and percentages of uniques

synU.rm

A vector of object$k TRUE/FALSE values, where a TRUE value identifies a unit that is unique in the synthetic data and present in the original

repU.rm

A vector of object$k TRUE/FALSE values, where a TRUE value identifies a replicated unique

Arguments

object

an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn() and it includes object$m synthesised data set(s).

data

the original observed data set.

keys

Variables to be used as quasi-identifiers to check for unique combinations.

...

additional parameters

x

an object of class repuniq.synds; a result of a call to replicated.uniques().

See Also

sdc

Examples

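library(synthpop)
# First 1000 rows and six variables of the SD2011 data shipped with synthpop
ods <- SD2011[1:1000, c("sex", "age", "region", "edu", "marital", "smoke")]
s1 <- syn(ods, m = 2)  # create two synthetic data sets
replicated.uniques(s1, ods, keys = c("sex", "age", "region"))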