Functions to create a cache that accelerates many operations
hashcache(x, nunique=NULL, ...)
sortcache(x, has.na = NULL)
sortordercache(x, has.na = NULL, stable = NULL)
ordercache(x, has.na = NULL, stable = NULL, optimize = "time")
x with a cache that contains the result of the expensive operations, possibly together with small derived information (such as nunique.integer64) and previously cached results.
x: an atomic vector (note that currently only integer64 is supported)
nunique: giving the correct number of unique elements can help reduce the size of the hashmap
has.na: boolean scalar defining whether the input vector might contain NAs. If we know we don't have NAs, this may speed things up. Note that you risk a crash if there are unexpected NAs with has.na=FALSE
stable: boolean scalar defining whether stable sorting is needed. Allowing non-stable sorting may speed things up.
optimize: by default ramsort optimizes for 'time', which requires more RAM; set to 'memory' to minimize RAM requirements and sacrifice speed
...: passed to hashmap
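The argument descriptions above can be illustrated with a short sketch. It assumes the bit64 package is installed and uses the same kind of test vector as the Examples; the seed and the variable names are illustrative only. Because the vector does contain NAs, has.na = TRUE is the safe setting here.

```r
library(bit64)

set.seed(1)  # illustrative seed, only to make the sketch reproducible
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))

# x contains NAs, so has.na = TRUE is the safe choice;
# stable = TRUE requests a stable sort
sortordercache(x, has.na = TRUE, stable = TRUE)
remcache(x)  # drop the cache before building a different one

# optimize = "memory" asks for lower RAM use at the cost of speed
ordercache(x, has.na = TRUE, stable = TRUE, optimize = "memory")

# A cache should now be attached to x
has_cache <- !is.null(cache(x))
```

Setting has.na = FALSE is only appropriate when the absence of NAs is known for certain, since the documentation warns of a possible crash otherwise.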
Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com>
The results of the relatively expensive operations hashmap, ramsort, ramsortorder and ramorder can be stored in a cache in order to avoid multiple executions. Except in very specific situations, the recommended method is hashsortorder only.
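As a minimal sketch of the idea described above (assuming the bit64 package is installed), the following code attaches a sort/order cache, checks that it is present with cache(), and removes it again with remcache(). The seed and the helper variable names are illustrative. The sketch only checks that the cached and uncached paths return the same values; it does not measure speed.

```r
library(bit64)

set.seed(42)  # illustrative seed, only to make the sketch reproducible
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))

# Result computed before any sort/order cache has been requested
u_plain <- unique(x)

# Attach sorting and ordering results to x; the call is used for its
# side effect, as in the Examples
sortordercache(x)
has_cache <- !is.null(cache(x))

# Methods such as unique() may now reuse the cached results
u_cached <- unique(x)

# The cache should change only the speed, not the values returned
same_values <- identical(sort(as.character(u_plain)),
                         sort(as.character(u_cached)))

# Remove the cache when it is no longer needed
remcache(x)
cache_removed <- is.null(cache(x))
```

Whether caching pays off depends on how often the cached results are reused; for a single operation on a short vector the overhead of building the cache may outweigh the gain.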
cache for caching functions and nunique.integer64 for methods benefiting from small caches
library(bit64)
# 32 values drawn from NA and 1..9, stored as integer64
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))
sortordercache(x)