comparedf
functionControl tolerance definitions for the comparedf
function.
comparedf.control(
tol.logical = "none",
tol.num = c("absolute", "percent", "pct"),
tol.num.val = sqrt(.Machine$double.eps),
int.as.num = FALSE,
tol.char = c("none", "trim", "case", "both"),
tol.factor = c("none", "levels", "labels"),
factor.as.char = FALSE,
tol.date = "absolute",
tol.date.val = 0,
tol.other = "none",
tol.vars = "none",
max.print.vars = NA,
max.print.obs = NA,
max.print.diffs.per.var = 10,
max.print.diffs = 50,
max.print.attrs = NA,
...,
max.print.diff = 10
)
A function or one of the shortcut character strings or a list thereof, denoting the tolerance function to use for a given data type. See "details", below.
Numeric; maximum value of differences allowed in numerics (fed to the function given in tol.num
).
Logical; should integers be coerced to numeric before comparison? Default FALSE.
Logical; should factors be coerced to character before comparison? Default FALSE.
Numeric; maximum value of differences allowed in dates (fed to the function given in tol.date
).
Either "none"
(the default), denoting that variable names are to be matched as-is,
a named vector manually specifying variable names to compare (where the names correspond to columns of
x
and the values correspond to columns of y
), or a
character vector denoting equivalence classes for characters in the variable names. See "details", below.
Integer denoting maximum number of variables to report in the "variables not shared" and "variables not compared"
output. NA
will print all differences.
Integer denoting maximum number of not-shared observations to report. NA
will print all differences.
Integers denoting the maximum number of differences to report for each variable or overall.
NA
will print all differences for each variable or overall.
Integers denoting the maximum number of non-identical attributes to report.NA
will print all differences.
Other arguments (not in use at this time).
Deprecated.
A list containing the necessary parameters for the comparedf
function.
The following character strings are accepted:
tol.logical = "none"
: compare logicals exactly as they are.
tol.num = "absolute"
: compare absolute differences in numerics.
tol.num = "percent"
, tol.num = "pct"
compare percent differences in numerics.
tol.char = "none"
: compare character strings exactly as they are.
tol.char = "trim"
: left-justify and trim all trailing white space.
tol.char = "case"
: allow differences in upper/lower case.
tol.char = "both"
: combine "trim"
and "case"
.
tol.factor = "none"
: match both character labels and numeric levels.
tol.factor = "levels"
: match only the numeric levels.
tol.factor = "labels"
: match only the labels.
tol.date = "absolute"
: compare absolute differences in dates.
tol.other = "none"
: expect objects of other classes to be exactly identical.
A list with names mapped to x
can be used to specify tolerances by variable. One unnamed element is supported
as the default.
tol.vars
: If not set to "none"
(the default) or a named vector,
the tol.vars
argument is a character vector denoting equivalence classes
for the characters in the variable names. A single character in this vector means to replace that character
with ""
. All other strings in this vector are split by character and replaced by the first character in the string.
E.g., a character vector c("._", "aA", " ")
would denote that the dot and underscore are equivalent (to be translated to a dot),
that "a" and "A" are equivalent (to be translated to "a"), and that spaces should be removed.
The special character string "case"
in this vector is the same as specifying paste0(letters, LETTERS)
.
# NOT RUN {
cntl <- comparedf.control(
tol.num = "pct", # calculate percent differences
tol.vars = c("case", # ignore case
"._", # set all underscores to dots.
"e") # remove all letter e's
)
cntl <- comparedf.control(tol.char = list(
"none", # the default
x1 = "case", # be case-insensitive for the variable "x1"
x2 = function(x, y) tol.NA(x, y, x != y | y == "NA") # a custom-defined tolerance
))
# }
Run the code above in your browser using DataLab