This help page lists the currently known limitations of package ff, as well as differences between ff and ram methods.
Remember that omitting the filename parameter of ff() will result in a temporary file in fftempdir with a 'delete' finalizer, while supplying ff(filename=) will result in a permanent file with a 'close' finalizer.
Do not call setwd(getOption("fftempdir"))!
Make sure you really understand the implications of the automatic unlinking of getOption("fftempdir") in .onUnload, of the finalizer choice, and of the finalizing behaviour at the end of R sessions as defaulted in getOption("fffinonexit").
Otherwise you might experience 'unexpected' losses of files and data.
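The difference between temporary and permanent files can be inspected directly. The following is a minimal sketch, not an authoritative recipe: demo_dir and demo_file are hypothetical names chosen for illustration, and the finalizer values noted in the comments are those described in the text above rather than verified output.

```r
library(ff)

# No filename given: ff creates a temporary file in getOption("fftempdir").
tmp <- ff(vmode = "integer", length = 10)
filename(tmp)
finalizer(tmp)    # per the text above, expected to be the 'delete' finalizer

# Explicit filename: the file is treated as permanent.
# demo_dir and demo_file are hypothetical names used only for this sketch.
demo_dir <- file.path(tempdir(), "ff_permanent_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_file <- file.path(demo_dir, "permanent.ff")
perm <- ff(vmode = "integer", length = 10, filename = demo_file)
finalizer(perm)   # per the text above, expected to be the 'close' finalizer
perm_exists <- file.exists(demo_file)

# Clean up explicitly instead of relying on finalizers.
delete(tmp)
delete(perm)
rm(tmp, perm)
```

Deleting the files explicitly, as in the last lines, avoids depending on the finalizer behaviour at the end of the session.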
ff objects can have length zero and are limited to .Machine$integer.max elements. We have not yet ported the R code to support 64bit double indices (in essence 52 bits integer), although the C++ back-end has been prepared for this.
Furthermore, filesize limitations of the OS apply, see ff.
In contrast to standard R expressions, ff expressions violate the functional programming logic and are called for their side effects.
This is also true for the ram compatibility functions swap.default and add.default.
If you modify a copy of an ff object, changes of data ([<-) and of physical attributes will be shared, but changes in virtual and class attributes will not.
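The hybrid copying semantics can be illustrated with a small sketch. It assumes nothing beyond ff(), [<- and dim<- as described above; the variable names are illustrative only.

```r
library(ff)

x <- ff(1:12)
y <- x                 # y refers to the same file as x

# Data changes are shared between the copy and the original.
y[1] <- 99L
shared_value <- x[1]   # the change made through y is visible through x

# Virtual attributes such as dim are not shared.
dim(y) <- c(3, 4)
x_dim <- dim(x)        # x is still a plain vector
y_dim <- dim(y)
```

Because of this, code that receives an ff object and writes to it modifies the caller's data, which is the side effect the previous section warns about.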
If it is not too big, you can move an ff object completely into R's RAM through as.ram.
However, you should watch out for three limitations:
Ram objects don't have hybrid copying semantics; changes to a copy of a ram object will never change the original ram object.
Assigning values to a ram object can easily upgrade it to a higher storage.mode. This will create a conflict with the vmode of the ram object, which goes undetected until you try to write back to disk through as.ff.
Writing back to disk with as.ff under the same filename requires that the original ff object has been deleted (or at least closed if you specify parameter overwrite=TRUE).
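The storage.mode upgrade is easy to trigger by accident. The sketch below is illustrative and only shows how the conflict arises; it does not attempt the write-back through as.ff.

```r
library(ff)

x <- ff(1:5)           # integer data, so vmode(x) is "integer"
r <- as.ram(x)         # in-RAM copy of the ff object

# Assigning a non-integer value makes R coerce r to double.
r[2] <- 2.5
ram_mode <- storage.mode(r)
ff_vmode <- vmode(x)   # the original ff object is unchanged
```

Checking storage.mode() against the intended vmode before calling as.ff is one way to catch the mismatch early.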
Parameter bydim is only available in ff access methods, see [.ff.
Parameter add is only available in ff access methods, see [.ff.
If index expressions contain duplicated positions, the ff and ram methods for swap and add will behave differently, see swap.
You should consider the behaviour of [[.ff and [[<-.ff as undefined and not use them in programming.
Currently they are shortcuts to get.ff and set.ff, which unlike [.ff and [<-.ff do not support factor and POSIXct, nor dimorder or virtual windows vw.
In contrast to the standard methods, [[.ff and [[<-.ff only accept positive integer index positions.
The definition of [[.ff and [[<-.ff may be changed in the future.
R objects always have standard dimorder seq_along(dim).
In case of non-standard dimorder (see dimorderStandard), the vector sequence of array elements in R differs from that in the ff file.
To access array elements in file order, you can use getset.ff, readwrite.ff, or copy the ff object and set dim(ff)<-NULL to get a vector view into the ff object (using [ then dispatches the vector method [.ff).
To access the array elements in R standard dimorder you simply use [, which dispatches to [.ff_array. Note that in this case as.hi will unpack the complete index, see next section.
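The two access paths can be compared on a small array. This is a sketch under the assumptions stated in the text (a copy with dim set to NULL gives a vector view in file order); the element order of each result is deliberately not asserted here.

```r
library(ff)

# A 3 x 4 array stored with non-standard dimorder.
x <- ff(1:12, dim = c(3, 4), dimorder = c(2, 1))
is_standard <- dimorderStandard(dimorder(x))

# Array access in R standard dimorder, dispatched to [.ff_array.
a <- x[]

# Vector view in file order: copy the object and drop its dim.
y <- x
dim(y) <- NULL
v <- y[]
```

Since dim is a virtual attribute, dropping it on y leaves the dim of x untouched.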
Some index expressions do not consume RAM due to the hi representation.
For example, 1:n will consume almost no RAM however large n is.
However, some index expressions are expanded and require maxindex(i) * .rambytes["integer"] bytes, either because the sorted sequence of index positions cannot be rle-packed efficiently or because hiparse cannot yet parse such an expression and falls back to evaluating/expanding the index expression.
If the index positions are not sorted, the index will be expanded and a second vector is needed to store the information for re-ordering, thus the index requires 2 * maxindex(i) * .rambytes["integer"] bytes.
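The effect of sorting on index size can be observed with object.size(). This is a rough sketch: the index vectors are evaluated before being passed to as.hi, so it shows the rle-packing of an integer index rather than hiparse, and the exact byte counts depend on the R and ff versions.

```r
library(ff)

set.seed(1)
n <- 100000L

# Ascending contiguous positions pack into a very small hi object.
packed <- as.hi(1:n)

# The same positions in random order need re-ordering information.
shuffled <- as.hi(sample(n))

packed_bytes   <- as.numeric(object.size(packed))
shuffled_bytes <- as.numeric(object.size(shuffled))
```

Comparing packed_bytes with shuffled_bytes gives a feel for how much RAM an unsorted index of a given length will cost.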
Some assignment expressions do not consume RAM for recycling. For example, x[1:n] <- 1:k will not consume RAM however large n is compared to k, when x has standard dimorder.
However, if length(value)>1, assignment expressions with non-ascending index positions trigger recycling of the value R-side to the full index length.
This will happen if dimorder does not match parameter bydim or if the index is not sorted in ascending order.
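Both assignment cases give the same result as ordinary R recycling; they differ only in where the recycling happens. A minimal sketch, with illustrative variable names:

```r
library(ff)

x <- ff(vmode = "integer", length = 12)

# Ascending index: the short value is recycled without expanding it in RAM.
x[1:12] <- 1:3
ascending <- x[]

# Descending index: the value is recycled R-side to the full index length.
x[12:1] <- 1:3
descending <- x[]
```

For large n, preferring ascending index positions keeps the RAM cost of such assignments independent of the index length.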
Note that ff files cannot be transferred between systems with different byte order.