ff (version 4.5.0)

ffindexorder: Sorting: chunked ordering of integer subscript positions

Description

Function ffindexorder calculates, chunk by chunk, the order positions that sort the subscript positions within each chunk in ascending order.
Function ffindexordersize calculates the chunk size for ffindexorder.

Usage

ffindexordersize(length, vmode, BATCHBYTES = getOption("ffmaxbytes"))
ffindexorder(index, BATCHSIZE, FF_RETURN = NULL, VERBOSE = FALSE)

Value

Function ffindexorder returns an ff integer vector with an attribute BATCHSIZE (the chunksize finally used, not the one given with argument BATCHSIZE).


Function ffindexordersize returns a balanced batchsize as returned from bbatch.
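
As a rough illustration of what "balanced" means here, the following base R sketch picks the smallest number of batches that respects a size limit and then spreads the elements evenly over them. The helper balanced_batch is hypothetical and is not the implementation used by bbatch or ffindexordersize; the actual functions may round differently and return additional components.

```r
# Hypothetical helper for illustration only; not part of ff or bit.
balanced_batch <- function(n, max_batch) {
  stopifnot(n >= 1, max_batch >= 1)
  nb <- ceiling(n / max_batch)  # fewest batches that respect the limit
  b <- ceiling(n / nb)          # spread the elements evenly over those batches
  list(b = as.integer(b), nb = as.integer(nb))
}

# 40 elements with a limit of 25 per batch: two batches of 20
# instead of one batch of 25 and one of 15.
balanced_batch(40, 25)
```

This explains why the batch size finally used can be smaller than the limit that was requested.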

Arguments

index

An ff integer vector with integer subscripts.

BATCHSIZE

Limit for the chunksize (see details)

BATCHBYTES

Limit for the number of bytes per batch

FF_RETURN

Optionally an ff integer vector in which the chunkwise order positions are stored.

VERBOSE

Logical scalar; set to TRUE for verbose output.

length

Number of elements in the index

vmode

The vmode of the ff vector to which the index shall be applied with ffindexget or ffindexset

Author

Jens Oehlschlägel

Details

Accessing integer positions in an ff vector is a non-trivial task, because it can easily lead to random access to a disk file. We avoid random access by loading batches of the subscript values into RAM, ordering them ascending, and only then accessing the ff values on disk. Such an ordering can be done on-the-fly by ffindexget, or it can be created upfront with ffindexorder, stored and re-used, similar to storing and using hybrid index information with as.hi.
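
The chunkwise idea can be sketched in plain R without ff. The helper chunked_get below is hypothetical and works on ordinary in-memory vectors; it only illustrates the access pattern described above (order one batch of subscripts, read in ascending position order, put the values back in the requested order) and is not how ffindexget is implemented.

```r
# Hypothetical helper for illustration only; not part of ff.
chunked_get <- function(x, index, batchsize) {
  n <- length(index)
  out <- vector(mode = typeof(x), length = n)
  if (n == 0L) return(out)
  for (from in seq(1L, n, by = batchsize)) {
    to <- min(from + batchsize - 1L, n)
    sub <- index[from:to]      # one batch of subscripts held in RAM
    o <- order(sub)            # positions that sort this batch ascending
    vals <- x[sub[o]]          # read the data in ascending position order
    out[from - 1L + o] <- vals # restore the order that was requested
  }
  out
}

x <- c(10, 20, 30, 40, 50)
index <- c(5L, 1L, 4L, 2L, 3L, 1L)
identical(chunked_get(x, index, batchsize = 2), x[index])
```

Within each batch the reads move forward through x only, which is the property that matters when x lives in a disk file. The vector o computed per batch corresponds to the information that ffindexorder stores for later re-use.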

See Also

ffindexget, as.hi, bbatch

Examples

     library(ff)

     x <- ff(sample(40))
     message("fforder requires sorting")
     i <- fforder(x)
     message("applying this order i is done by ffindexget")
     x[i]
     message("applying this order i requires random access, therefore ffindexget does chunkwise sorting")
     ffindexget(x, i)
     message("if we want to apply the order i multiple times, we can do the chunkwise sorting once and store it")
     # a small BATCHBYTES keeps the chunks small for this 40-element example
     s <- ffindexordersize(length(i), vmode(i), BATCHBYTES = 100)
     o <- ffindexorder(i, s$b)
     message("this is how the stored chunkwise sorting is used")
     ffindexget(x, i, o)
     # clean up the ff objects
     rm(x, i, s, o)
     gc()