Create a file cache object
Cache pruning occurs when set_file() or set_content() is called, or it can be invoked manually by calling prune().
The disk cache will throttle the pruning so that it does not happen on every call to set_file() or set_content(), because the filesystem operations for checking the status of files can be slow. Instead, it will prune once in every 20 calls to set_file() or set_content(), or if at least 5 seconds have elapsed since the last prune occurred, whichever is first. These parameters are currently not customizable, but may be in the future.
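The throttling rule can be sketched in a few lines of base R. This is an illustration of the rule as described here, not the package's internal code: make_prune_throttle() and its arguments are hypothetical names, and the clock is passed in explicitly so the behavior is easy to follow.

```r
# Hypothetical helper: returns a function that reports whether a prune is
# due on this call, given the current time in seconds.
make_prune_throttle <- function(every_n = 20, min_interval = 5,
                                created = as.numeric(Sys.time())) {
  calls_since_prune <- 0
  last_prune_time <- created

  function(now = as.numeric(Sys.time())) {
    calls_since_prune <<- calls_since_prune + 1
    # Prune on the 20th call, or once 5 seconds have passed, whichever is first.
    due <- calls_since_prune >= every_n ||
      (now - last_prune_time) >= min_interval
    if (due) {
      calls_since_prune <<- 0
      last_prune_time <<- now
    }
    due
  }
}

should_prune <- make_prune_throttle(created = 100)
# Simulate 25 set_*() calls that all happen at time 100.
decisions <- vapply(1:25, function(i) should_prune(now = 100), logical(1))
which(decisions)            # only the 20th call triggers a prune
should_prune(now = 106)     # more than 5 seconds later: prune again
```

Counting calls keeps the slow filesystem checks rare under heavy write load, while the time limit bounds how long a lightly used cache can go without pruning.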
When a pruning occurs, if there are any objects that are older than max_age, they will be removed.
The max_size and max_n parameters are applied to the cache as a whole, in contrast to max_age, which is applied to each object individually.
If the number of objects in the cache exceeds max_n, then objects will be removed from the cache according to the eviction policy, which is set with the evict parameter. Objects will be removed so that the number of items is max_n.
If the size of the objects in the cache exceeds max_size, then objects will be removed from the cache so that the total size remains under max_size. Note that the size is calculated using the size of the files, not the amount of disk space used by the files. These two values can differ because files are stored in blocks on disk. For example, if the block size is 4096 bytes, then a file that is one byte in size will take 4096 bytes on disk.
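The difference between file size and disk usage can be worked through with a simplified model in base R. The helper disk_usage() is a hypothetical name, and the model assumes every file occupies a whole number of fixed-size blocks, which real filesystems may not follow exactly.

```r
# Hypothetical helper: bytes occupied on disk when each file is stored in
# whole blocks of `block_size` bytes.
disk_usage <- function(file_sizes, block_size = 4096) {
  ceiling(file_sizes / block_size) * block_size
}

file_sizes <- c(1, 4096, 5000)   # sizes in bytes, as file.size() would report

sum(file_sizes)                  # 9097: the total that max_size is compared with
sum(disk_usage(file_sizes))      # 16384: blocks occupied under this model
```

With many small objects, the space used on disk can therefore be noticeably larger than max_size.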
Another time that objects can be removed from the cache is when get_file() or get_content() is called. If the target object is older than max_age, it will be removed and the cache will report it as a missing value.
If max_n or max_size is used, then objects will be removed from the cache according to an eviction policy. The available eviction policies are:
"lru"
Least Recently Used. The least recently used objects will be removed. This uses the filesystem's mtime property. When "lru" is used, each time get_file() or get_content() is called, it will update the file's mtime.
"fifo"
First-in-first-out. The oldest objects will be removed.
Both of these policies use files' mtime. Note that some filesystems (notably FAT) have poor mtime resolution. (atime is not used because support for atime is worse than mtime.)
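Because both policies rank objects by mtime, the selection step can be sketched with a small data frame in base R. This is an illustrative sketch only: keys_to_evict() and the column names are hypothetical, and the package's own pruning code may differ in detail.

```r
# Hypothetical helper: given cached files with columns `key`, `size` (bytes),
# and `mtime` (seconds), return the keys that would be evicted.
keys_to_evict <- function(files, max_n = Inf, max_size = Inf) {
  # Newest first, so the objects with the oldest mtime are dropped first.
  files <- files[order(files$mtime, decreasing = TRUE), ]
  keep <- seq_len(nrow(files)) <= max_n &
    cumsum(files$size) <= max_size
  files$key[!keep]
}

files <- data.frame(
  key   = c("a", "b", "c", "d"),
  size  = c(10, 20, 30, 40),
  mtime = c(100, 400, 200, 300),
  stringsAsFactors = FALSE
)

keys_to_evict(files, max_n = 3)       # "a"
keys_to_evict(files, max_size = 70)   # "c" "a"
```

The two policies differ only in how mtime gets updated: under "lru" a read refreshes the file's mtime and moves it to the front of this ordering, while under "fifo" the mtime reflects only when the object was stored.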
The directory for a FileCache can be shared among multiple R processes. To do this, each R process should have a FileCache object that uses the same directory. Each FileCache will do pruning independently of the others, so if they have different pruning parameters, then one FileCache may remove cached objects before another FileCache would do so.
Even though it is possible for multiple processes to share a FileCache directory, this should not be done on networked file systems, because the slow performance of networked file systems can cause problems. If you need a high-performance shared cache, you can use one built on a database like Redis, SQLite, MySQL, or similar.
When multiple processes share a cache directory, there are some potential race conditions. For example, if your code calls exists(key) to check if an object is in the cache, and then calls get_file(key), the object may be removed from the cache in between those two calls, and get_file(key) will throw an error. Instead of calling the two functions, it is better to simply call get_file(key), and use tryCatch() to handle the error that is thrown if the object is not in the cache. This effectively tests for existence and gets the object in one operation.
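A minimal sketch of the single-call pattern follows. It assumes that FileCache is the class exported by the sass package, and fetch_to_file() is a hypothetical helper name. The method reference for get_file() describes a FALSE return value when the object is not found, so the sketch treats both an error and a FALSE result as a miss.

```r
library(sass)  # assumption: FileCache is exported by the sass package

cache <- FileCache$new()
cache$set_content("abc123", "hello")

# Hypothetical helper: fetch in one step, without a separate exists() call.
# Returns TRUE on a hit, FALSE on a miss or on any error.
fetch_to_file <- function(cache, key, outfile) {
  tryCatch(
    isTRUE(cache$get_file(key, outfile)),
    error = function(e) FALSE
  )
}

out <- tempfile()
fetch_to_file(cache, "abc123", out)   # hit: the content is copied to `out`
fetch_to_file(cache, "nokey", out)    # miss: handled without checking exists()
cache$destroy()
```

Because there is only one call into the cache, another process cannot remove the object between the check and the fetch.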
It is also possible for one process to prune objects at the same time that another process is trying to prune objects. If this happens, you may see a warning from file.remove() failing to remove a file that has already been deleted.
new()
Create a FileCache object.
FileCache$new(
  dir = NULL,
  max_size = 40 * 1024^2,
  max_age = Inf,
  max_n = Inf,
  evict = c("lru", "fifo"),
  destroy_on_finalize = FALSE,
  logfile = NULL
)
dir
Directory to store files for the cache. If NULL (the default), it will create and use a temporary directory.
max_size
Maximum size of the cache, in bytes. If the cache exceeds this size, cached objects will be removed according to the value of evict. Use Inf for no size limit.
max_age
Maximum age of files in cache before they are evicted, in seconds. Use Inf for no age limit.
max_n
Maximum number of objects in the cache. If the number of objects exceeds this value, then cached objects will be removed according to the value of evict. Use Inf for no limit on the number of items.
evict
The eviction policy to use to decide which objects are removed when a cache pruning occurs. Currently, "lru" and "fifo" are supported.
destroy_on_finalize
If TRUE, then when the FileCache object is garbage collected, the cache directory and all objects inside of it will be deleted from disk. If FALSE (the default), it will do nothing when finalized.
logfile
An optional filename or connection object to which logging information will be written. To log to the console, use stdout().
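As a usage sketch, the constructor arguments can be combined as follows. This assumes that FileCache is the class exported by the sass package. The directory and the limits are arbitrary values chosen for illustration.

```r
library(sass)  # assumption: FileCache is exported by the sass package

cache_dir <- tempfile("filecache-")  # hypothetical location for this example
dir.create(cache_dir)

cache <- FileCache$new(
  dir = cache_dir,
  max_size = 10 * 1024^2,       # 10 MiB across all objects
  max_age = 60 * 60,            # one hour per object
  max_n = 100,                  # at most 100 objects
  evict = "lru",
  destroy_on_finalize = TRUE    # delete cache_dir when the object is garbage collected
)

cache$set_content("greeting", "hello")
cache$get_content("greeting")
cache$size()
```

A second R process could share the same cache by constructing its own FileCache with the same dir, ideally with the same pruning parameters.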
get_file()
Get the content associated with key, and save it in a file named outfile.
FileCache$get_file(key, outfile, overwrite = TRUE)
key
Key. Must be lowercase numbers and letters.
outfile
Name of output file. If NULL
, return the content as
overwrite
If the output file already exists, should it be overwritten?
TRUE if the object is found in the cache and copying succeeds, FALSE otherwise.
get_content()
Get the content associated with key, and return it as either a string or a raw vector.
FileCache$get_content(key, mode = c("text", "raw"))
key
Key. Must be lowercase numbers and letters.
mode
If "text", return the content as a UTF-8-encoded text string (a one-element character vector). If "raw", return the content as a raw vector.
A character or raw vector if the object is found in the cache, NULL otherwise.
set_file()
Sets content associated with key, from a file named infile.
FileCache$set_file(key, infile)
key
Key. Must be lowercase numbers and letters.
infile
Name of input file.
TRUE if copying the file into the cache succeeds, FALSE otherwise.
set_content()
Sets content associated with key, from a single-element vector.
FileCache$set_content(key, content)
key
Key. Must be lowercase numbers and letters.
content
A character or raw vector. If it is a character vector, it will be written with UTF-8 encoding, with elements collapsed with \n (consistent across platforms).
TRUE if setting the content in the cache succeeds, FALSE otherwise.
exists()
Check if content associated with key exists in the cache.
FileCache$exists(key)
key
Key. Must be lowercase numbers and letters.
TRUE if the object is in the cache, FALSE otherwise.
keys()
Get all keys
FileCache$keys()
A character vector of all keys currently in the cache.
remove()
Remove an object from the cache.
FileCache$remove(key)
key
Key. Must be lowercase numbers and letters.
TRUE if the object was found and successfully removed, FALSE otherwise.
reset()
Clear all objects from the cache.
FileCache$reset()
prune()
Prune the cache, using the parameters specified by max_size, max_age, max_n, and evict.
FileCache$prune()
size()
Return the number of items currently in the cache.
FileCache$size()
destroy()
Clears all objects in the cache, and removes the cache directory from disk.
FileCache$destroy()
is_destroyed()
Reports whether the cache has been destroyed.
FileCache$is_destroyed(throw = FALSE)
throw
Should this function throw an error if the cache has been destroyed?
finalize()
A finalizer for the cache.
FileCache$finalize()
clone()
The objects of this class are cloneable with this method.
FileCache$clone(deep = FALSE)
deep
Whether to make a deep clone.
A file cache object is a key-file store that saves the values as files in a directory on disk. Objects are stored and retrieved using the get_file(), get_content(), set_file(), and set_content() methods. Objects are automatically pruned from the cache according to the parameters max_size, max_age, max_n, and evict.