option("collapse_unused_arg_action")
regulates how generic functions (such as the Fast Statistical Functions) in the package react when an unknown argument is passed to a method. The default action is "warning"
, which issues a warning. Other options are "error"
, "message"
or "none"
, where the last of these silently swallows such arguments.
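As a minimal sketch of the option described above: the snippet sets the action to "error" before attaching the package and then passes an unknown argument to a Fast Statistical Function. The argument name bla is made up for illustration, and setting the option before library(collapse) is an assumption made to be safe.

```r
# Assumption: set the option before attaching collapse
options(collapse_unused_arg_action = "error")
library(collapse)

x <- c(1, 2, NA, 4)

# 'bla' is a hypothetical argument that no fmean method knows about.
# With the action set to "error", the call should fail instead of warn;
# tryCatch captures the condition message so the script keeps running.
res <- tryCatch(fmean(x, bla = 1), error = function(e) conditionMessage(e))
print(res)
```

Switching the value to "message" or "none" follows the same pattern.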
option("collapse_mask")
can be used to create additional functions in the collapse namespace when loading the package, which will mask some existing base R and dplyr functions. In particular, collapse provides a large number of functions that start with 'f', e.g. fsubset
, ftransform
, fdroplevels
etc. Specifying options(collapse_mask = c("fsubset", "ftransform", "fdroplevels"))
before loading the package will make additional functions subset
, transform
, and droplevels
available to the user, and mask the corresponding base R functions when the package is attached. In general, all functions starting with 'f' can be passed to the option. There are also a couple of keywords that you can specify to add groups of functions:
"manip"
adds data manipulation functions: fsubset, ftransform, ftransform<-, ftransformv, fcompute, fcomputev, fselect, fselect<-, fgroup_by, fgroup_vars, fungroup, fsummarise, fmutate, frename
"helper"
adds the functions: fdroplevels
, finteraction
, funique
, fnlevels
, fnrow
and fncol
.
"fast-fun"
adds the functions contained in the macro: .FAST_FUN
.
"fast-stat-fun"
adds the functions contained in the macro: .FAST_STAT_FUN
.
"fast-trfm-fun"
adds the functions contained in: setdiff(.FAST_FUN, .FAST_STAT_FUN)
.
"all"
turns on all of the above.
Note that none of these options will impact internal collapse code, but they may change the way your programs run. "manip"
is probably the safest option to start with.
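A short sketch of the masking mechanism described above, using the "manip" keyword. It assumes a fresh R session (the option must be set before the package is loaded) and uses the built-in mtcars data purely for illustration.

```r
# Must run before collapse is loaded, e.g. at the top of a script
options(collapse_mask = "manip")
library(collapse)

# subset() now resolves to collapse's fsubset(), so columns can be
# selected by listing them after the row condition
small <- subset(mtcars, mpg > 25, mpg, cyl)
print(small)

# Individual functions can be requested instead of a keyword:
# options(collapse_mask = c("fsubset", "ftransform", "fdroplevels"))
```

The base R versions remain reachable with explicit namespacing, e.g. base::subset().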
Specifying "fast-fun"
, "fast-stat-fun"
, "fast-trfm-fun"
or "all"
is ambitious, as these options replace basic R functions like sum
and max
, introducing collapse's na.rm = TRUE
default and different behavior for matrices and data frames. These options also change some internal macros so that base R functions like sum
or max
called inside fsummarise
, fmutate
or collap
will also receive vectorized execution. In other words, if you put options(collapse_mask = "all")
before loading the package, and you have a collapse-compatible line of dplyr code like wlddev |> group_by(region, income) |> summarise(across(PCGDP:POP, sum))
, this will now receive fully optimized execution. Note however that because of collapse
's na.rm = TRUE
default, the result will be different unless you add na.rm = FALSE
. In general, this option is a convenience for writing visually more appealing code or for translating existing dplyr code to collapse. Use with care! For production code I generally recommend not using it.
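The following sketch puts the example from the text into a runnable script, together with the na.rm = FALSE variant. It assumes a fresh session in which dplyr is not attached, so that group_by, summarise and sum all resolve to the collapse versions; passing na.rm through across() relies on its ... argument.

```r
# Must be set before collapse is loaded
options(collapse_mask = "all")
library(collapse)

# sum now means fsum, with collapse's na.rm = TRUE default
res_default <- wlddev |>
  group_by(region, income) |>
  summarise(across(PCGDP:POP, sum))

# Explicitly request base R's missing-value handling
res_narm_false <- wlddev |>
  group_by(region, income) |>
  summarise(across(PCGDP:POP, sum, na.rm = FALSE))

print(head(res_default))
print(head(res_narm_false))
```

Comparing the two results shows where missing values would otherwise propagate into the group sums.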
option("collapse_F_to_FALSE")
, if set to TRUE
, replaces the lead operator F
in the package with a value FALSE
when loading the package, which solves issues arising from the use of F
as a shortcut for FALSE
in R code when collapse is attached. Note that F
is just a value in the base package namespace, and it should NOT be used in production code, precisely because users can overwrite it by assignment. An alternative to invoking this option is simply to assign F <- FALSE
in your global environment.
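Both routes described above can be sketched as follows. This assumes a collapse version that exports the F operator and reads the option at load time.

```r
# Route 1: ask collapse to replace its F operator with FALSE on load
options(collapse_F_to_FALSE = TRUE)
library(collapse)

# Route 2 (alternative): leave the option unset and define F yourself,
# so the global environment is found before the package on the search path
# F <- FALSE

# Either way, spelling out FALSE avoids the ambiguity entirely
x <- c(3, 1, 2)
print(sort(x, decreasing = FALSE))
```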
option("collapse_DT_alloccol")
sets how many empty columns collapse data manipulation functions like ftransform
allocate when taking a shallow copy of data.tables. The default is 100L
. Note that the data.table default is getOption("datatable.alloccol") = 1024L
. I chose a lower default because each collapse data manipulation function takes a shallow copy when you manipulate data.tables, and the cost increases with the number of overallocated columns. With 100 columns, the cost is 2-5 microseconds per copy.
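A minimal sketch of changing the allocation, assuming data.table is installed. The value 200L and the derived column wt_kg are arbitrary illustrative choices.

```r
# Assumption: set before loading collapse; 200L is an arbitrary value
options(collapse_DT_alloccol = 200L)
library(collapse)
library(data.table)

DT <- as.data.table(mtcars)

# ftransform takes a shallow copy of DT and adds a column to the copy
# (wt is recorded in units of 1000 lbs, so wt * 453.6 gives kilograms)
DT2 <- ftransform(DT, wt_kg = wt * 453.6)

# truelength() reports how many column slots are allocated for DT2
print(truelength(DT2))
```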