Get differences between two data.frames
compareDiff(
newData,
oldData,
referenceVars = intersect(colnames(newData), colnames(oldData)),
changeableVars = NULL
)
Object of class 'diff.data', i.e. a data.frame with columns:
'Comparison type': type of difference between the old and new data, either:
'Change': records present both in new and old data, based on the reference variables, but with difference(s) in changeable vars
'Addition': records with reference variables present in new but not in old data
'Removal': records with reference variables present in old but not in new data
'Version': 'Previous' or 'Current', depending on whether the record represents content from the old or the new data, respectively
the columns listed in referenceVars
the columns listed in changeableVars
newData: data.frame object representing the new data
oldData: data.frame object representing the old data
referenceVars: character vector of the columns in the data that are used as reference for the comparison. If not specified, all columns present in both newData and oldData are considered.
changeableVars: character vector of the columns in the data for which you want to assess the change, e.g. variables that might have changed from the old to the new data. If not specified, only 'Addition' and 'Removal' are detected.
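As an illustration of how the arguments fit together, here is a minimal sketch of a call on two small, made-up data.frames. It assumes that compareDiff() is provided by the clinUtils package, which is not stated on this page; the column names id, visit and value and the example values are hypothetical.

```r
# Assumption: compareDiff() comes from the 'clinUtils' package.
library(clinUtils)

# Hypothetical old and new versions of the same dataset
oldData <- data.frame(
  id = c("A", "B", "C"),
  visit = c("Week 1", "Week 1", "Week 1"),
  value = c(10, 20, 30),
  stringsAsFactors = FALSE
)
newData <- data.frame(
  id = c("A", "B", "D"),
  visit = c("Week 1", "Week 1", "Week 1"),
  value = c(10, 25, 40),
  stringsAsFactors = FALSE
)

# 'id' and 'visit' identify a record; 'value' is the variable that may change
diffData <- compareDiff(
  newData = newData,
  oldData = oldData,
  referenceVars = c("id", "visit"),
  changeableVars = "value"
)
print(diffData)
```

Going by the definitions above, record B would be expected to be reported as a 'Change', C as a 'Removal' and D as an 'Addition', while the unchanged record A should not be flagged.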
To identify the differences between datasets, the following steps are followed:
removal of records identical between the old and new dataset (will be considered as 'Identical' later on)
records with a reference value present in the old dataset but not in the new dataset are considered 'Removal'
records with a reference value present in the new dataset but not in the old dataset are considered 'Addition'
records with a reference value present in both the new and old dataset that, after filtering out identical records, differ in the changeable variables are considered 'Change'
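The steps above can be sketched in base R as follows. This is only an illustration of the described logic, not the actual implementation of compareDiff(): the helper sketchDiff(), the internal makeKey() function and the example data are hypothetical, and records are matched through pasted text keys, which ignores column types.

```r
# Hypothetical helper mirroring the four steps; not the package code.
sketchDiff <- function(newData, oldData, referenceVars, changeableVars = NULL) {
  vars <- c(referenceVars, changeableVars)

  # One text key per record, built from the selected columns
  makeKey <- function(data, cols)
    do.call(paste, c(unname(as.list(data[cols])), sep = "\r"))

  # Step 1: drop records that are identical in the old and new data
  newKeep <- newData[!makeKey(newData, vars) %in% makeKey(oldData, vars), vars, drop = FALSE]
  oldKeep <- oldData[!makeKey(oldData, vars) %in% makeKey(newData, vars), vars, drop = FALSE]

  # Steps 2 and 4: a remaining old record is a 'Removal' if its reference
  # value is absent from the new data, otherwise a 'Change'
  typeOld <- ifelse(
    makeKey(oldKeep, referenceVars) %in% makeKey(newData, referenceVars),
    "Change", "Removal"
  )
  # Steps 3 and 4: a remaining new record is an 'Addition' if its reference
  # value is absent from the old data, otherwise a 'Change'
  typeNew <- ifelse(
    makeKey(newKeep, referenceVars) %in% makeKey(oldData, referenceVars),
    "Change", "Addition"
  )

  out <- rbind(
    data.frame(type = typeNew, version = rep("Current", nrow(newKeep)),
               newKeep, stringsAsFactors = FALSE),
    data.frame(type = typeOld, version = rep("Previous", nrow(oldKeep)),
               oldKeep, stringsAsFactors = FALSE)
  )
  colnames(out)[1:2] <- c("Comparison type", "Version")
  rownames(out) <- NULL
  out
}

# Made-up example data: A is unchanged, B changes, C is removed, D is added
oldData <- data.frame(id = c("A", "B", "C"), value = c(10, 20, 30), stringsAsFactors = FALSE)
newData <- data.frame(id = c("A", "B", "D"), value = c(10, 25, 40), stringsAsFactors = FALSE)

res <- sketchDiff(newData, oldData, referenceVars = "id", changeableVars = "value")
print(res)
```

In this sketch, leaving changeableVars as NULL means that step 1 already removes every record whose reference values occur in both datasets, so only 'Addition' and 'Removal' can remain, which matches the behaviour described for that argument.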
Laure Cougnaud