Applies targeted record swapping to a micro data set; see ?recordSwap
for details.
NOTE: This is an internal function called by the R function recordSwap().
Its only purpose is to expose the C++ function recordSwap() via Rcpp.
recordSwap_cpp(
data,
hid,
hierarchy,
similar_cpp,
swaprate,
risk,
risk_threshold,
k_anonymity,
risk_variables,
carry_along,
log_file_name,
seed = 123456L
)
Returns the data set with swapped records.
data: micro data set containing only integer values. A data.frame or data.table from R needs to be transposed beforehand so that data.size() is the number of records and data[0].size() is the number of variables per record. NOTE: data has to be ordered by hid beforehand.
hid: column index in data which refers to the household identifier.
hierarchy: column indices of variables in data which refer to the geographic hierarchy in the micro data set, for instance county > municipality > district.
similar_cpp: list where each entry corresponds to column indices of variables in data which should be considered when swapping households.
swaprate: double between 0 and 1 defining the proportion of households which should be swapped; see the details in ?recordSwap for further explanation.
risk: vector of vectors containing the risk of each individual at each hierarchy level.
risk_threshold: double indicating the risk threshold above which every household needs to be swapped.
k_anonymity: integer defining the threshold of high-risk households (k-anonymity). This is used as k_anonymity <= counts.
risk_variables: column indices of variables in data which will be considered for estimating the risk.
carry_along: integer vector indicating additional variables to swap besides the hierarchy variables. These variables do not interfere with the procedure of finding a record to swap with or with calculating risk. This parameter is only used at the end of the procedure when swapping the hierarchies.
log_file_name: character, path for writing a log file. The log file contains a list of household IDs (`hid`) which could not be swapped and is only created if any such households exist.
seed: integer defining the seed for the random number generator, for reproducibility.
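
The layout required for data (one inner vector per record, ordered by hid) can be illustrated with a minimal C++ sketch. The helpers to_record_major() and sort_by_hid() are hypothetical names introduced only for this illustration and are not part of the package. The sketch also assumes that hid is a 0-based column index on the C++ side, which is not stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the package): turns a column-major table,
// i.e. one inner vector per variable as in an R data.frame, into the
// record-major layout described above, where records.size() is the number of
// records and records[0].size() is the number of variables per record.
std::vector<std::vector<int>> to_record_major(
    const std::vector<std::vector<int>>& columns) {
  std::vector<std::vector<int>> records;
  if (columns.empty()) return records;

  const std::size_t n_records = columns[0].size();
  records.assign(n_records, std::vector<int>(columns.size()));

  for (std::size_t j = 0; j < columns.size(); ++j) {
    // Every variable must supply exactly one value per record.
    assert(columns[j].size() == n_records);
    for (std::size_t i = 0; i < n_records; ++i) {
      records[i][j] = columns[j][i];
    }
  }
  return records;
}

// Hypothetical helper: orders the records by the household identifier.
// `hid` is assumed to be a 0-based column index (an assumption of this sketch).
// std::stable_sort keeps members of the same household in their original order.
void sort_by_hid(std::vector<std::vector<int>>& records, std::size_t hid) {
  std::stable_sort(records.begin(), records.end(),
                   [hid](const std::vector<int>& a, const std::vector<int>& b) {
                     return a[hid] < b[hid];
                   });
}
```

Calling to_record_major() on the columns of a table and then sort_by_hid() on the result would produce input in the shape described for data. From R, the same preparation is normally handled by the wrapper recordSwap(), so this sketch is only meant to make the layout concrete.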