Learn R Programming

sdcMicro (version 5.7.8)

recordSwap_cpp: Targeted Record Swapping

Description

Applies targeted record swapping on micro data set, see ?recordSwap for details.
NOTE: This is an internal function called by the R-function recordSwap(). It's only purpose is to include the C++-function recordSwap() using Rcpp.

Usage

recordSwap_cpp(
  data,
  hid,
  hierarchy,
  similar_cpp,
  swaprate,
  risk,
  risk_threshold,
  k_anonymity,
  risk_variables,
  carry_along,
  log_file_name,
  seed = 123456L
)

Value

Returns data set with swapped records.

Arguments

data

micro data set containing only integer values. A data.frame or data.table from R needs to be transposed beforehand so that data.size() ~ number of records - data.[0].size ~ number of varaibles per record. NOTE: data has to be ordered by hid beforehand.

hid

column index in data which refers to the household identifier.

hierarchy

column indices of variables in data which refers to the geographic hierarchy in the micro data set. For instance county > municipality > district.

similar_cpp

List where each entry corresponds to column indices of variables in data which should be considered when swapping households.

swaprate

double between 0 and 1 defining the proportion of households which should be swapped, see details for more explanations

risk

vector of vectors containing risks of each individual in each hierarchy level.

risk_threshold

double indicating risk threshold above every household needs to be swapped.

k_anonymity

integer defining the threshold of high risk households (k-anonymity). This is used as k_anonymity <= counts.

risk_variables

column indices of variables in data which will be considered for estimating the risk.

carry_along

integer vector indicating additional variables to swap besides to hierarchy variables. These variables do not interfere with the procedure of finding a record to swap with or calculating risk. This parameter is only used at the end of the procedure when swapping the hierarchies.

log_file_name

character, path for writing a log file. The log file contains a list of household IDs (`hid`) which could not have been swapped and is only created if any such households exist.

seed

integer defining the seed for the random number generator, for reproducibility.