ff (version 4.0.2)

ffdfindexget: Reading and writing ffdf data.frame using ff subscripts


Function ffdfindexget extracts rows from an ffdf data.frame according to positive integer subscripts stored in an ff vector.

Function ffdfindexset performs the inverse operation: assigning to rows of an ffdf data.frame according to positive integer subscripts stored in an ff vector. These functions allow more control than the method dispatch of [ and [<- when an ff integer subscript is used.


ffdfindexget(x, index, indexorder = NULL, autoindexorder = 3, FF_RETURN = NULL
  , BATCHSIZE = NULL, BATCHBYTES = getOption("ffmaxbytes"), VERBOSE = FALSE)
ffdfindexset(x, index, value, indexorder = NULL, autoindexorder = 3
  , BATCHSIZE = NULL, BATCHBYTES = getOption("ffmaxbytes"), VERBOSE = FALSE)



x
An ffdf data.frame containing the elements

index
An ff integer vector with integer subscripts in the range from 1 to length(x).

value
An ffdf data.frame like x with the rows to be assigned

indexorder
Optionally the return value of ffindexorder, see details

autoindexorder
The minimum number of columns (which need chunked index ordering) for which we switch from on-the-fly ordering to a stored ffindexorder

FF_RETURN
Optionally an ffdf data.frame of the same type as x in which the returned values shall be stored, see details.

BATCHSIZE
Optional limit for the batch size (see details)

BATCHBYTES
Limit for the number of bytes per batch

VERBOSE
Logical scalar that turns on verbose output


Function ffdfindexget returns an ffdf data.frame with the rows selected by the ff index vector. Function ffdfindexset returns x with the rows selected by index replaced by the rows of value.


Accessing rows of an ffdf data.frame identified by integer positions in an ff vector is a non-trivial task, because it could easily lead to random access to disk files. We avoid random access by loading batches of the subscript values into RAM, ordering them ascending, and only then accessing the ff values on disk. Such ordering is done on the fly for up to autoindexorder-1 columns that need ordering. For autoindexorder or more columns we do the batched ordering up front with ffindexorder and then reuse it in each call to ffindexget or ffindexset.
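The batching idea described above can be illustrated in plain R, without ff. The following is only a minimal sketch under stated assumptions, not the package's actual implementation: `batched_get` is a hypothetical helper, `values` stands in for one on-disk column, and `index` stands in for the ff subscript vector. Each batch of subscripts is ordered ascending before the data are touched, and the ordering is then undone when the results are written back.

```r
# Hypothetical helper (not part of ff): batched, sorted subscript access.
batched_get <- function(values, index, batchsize) {
  out <- vector(mode = typeof(values), length = length(index))
  if (length(index) == 0L) return(out)
  starts <- seq(1L, length(index), by = batchsize)
  for (from in starts) {
    to  <- min(from + batchsize - 1L, length(index))
    pos <- from:to
    idx <- index[pos]               # load one batch of subscripts into RAM
    o   <- order(idx)               # ordering permutation for this batch
    sorted_vals <- values[idx[o]]   # read positions in ascending order
    out[pos[o]] <- sorted_vals      # restore the order requested by index
  }
  out
}

set.seed(1)
values <- (1:100) * 10L                        # stand-in for a column on disk
index  <- sample(100L, 25L, replace = TRUE)    # stand-in for an ff index
res <- batched_get(values, index, batchsize = 8L)
```

The result is the same as plain `values[index]`; only the order in which positions are read differs. Computing the ordering once and reusing it for every column is what a stored index order would save when many columns need the same subscripts.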

See Also

Extract.ff, ffindexget, ffindexorder


Run this code
library(ff)
message("ff integer subscripts with ffdf return/assign values")
x <- ff(factor(letters))
y <- ff(1:26)
d <- ffdf(x,y)
i <- ff(2:9)
di <- d[i,]
d[i,] <- di
message("ff integer subscripts: more control with ffindexget/ffindexset")
di <- ffdfindexget(d, i, FF_RETURN=di)
d <- ffdfindexset(d, i, di)
rm(x, y, d, i, di)

