Note: there is a newer version (0.3-5) of this package.

High-performance I/O tools for R

Anyone dealing with large data knows that the stock tools in R are slow at loading (non-binary) data into R. This package started as an attempt to provide high-performance parsing tools that minimize copying and avoid the use of strings when possible (see mstrsplit, for example).
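As a rough illustration of the parsing tools mentioned above, the following minimal sketch parses a few pipe-separated records held as raw bytes. The sample data and variable names are made up for illustration, and the example assumes iotools is installed.

```r
library(iotools)

# Three pipe-separated records as raw bytes, the form in which
# data typically arrives from a file or connection (made-up sample data)
raw_input <- charToRaw("1|apple|2.5\n2|banana|3.75\n3|cherry|1.25\n")

# Parse straight into a character matrix, one row per line
m <- mstrsplit(raw_input, sep = "|")

# Parse into a data frame with explicitly typed columns
df <- dstrsplit(raw_input,
                col_types = c("integer", "character", "numeric"),
                sep = "|")
```

Passing a raw vector rather than a character vector is what lets the parser work on the bytes directly instead of first creating one R string per line.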

To allow processing of arbitrarily large files, we have added a way to process input chunk-wise, making it possible to compute on streaming input as well as very large files (see chunk.reader and chunk.apply).
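A minimal sketch of chunk-wise processing with chunk.apply, assuming iotools is installed. The temporary file, its contents, and the chosen chunk size are illustrative assumptions rather than recommended settings.

```r
library(iotools)

# Write a pipe-separated file to stand in for a large input (made-up data)
n <- 200000L
path <- tempfile(fileext = ".txt")
writeLines(sprintf("%d|%d", seq_len(n), seq_len(n) %% 7L), path)

# Parse each chunk into a numeric matrix and reduce it to a partial sum
# of the second column; CH.MERGE = c collects the partial results
partial_sums <- chunk.apply(
  path,
  function(chunk) {
    m <- mstrsplit(chunk, sep = "|", type = "numeric")
    sum(m[, 2])
  },
  CH.MERGE = c,
  CH.MAX.SIZE = 262144  # 256 KB per chunk, far below the default
)

# Combine the per-chunk results into the final answer
total <- sum(partial_sums)

unlink(path)
```

Because each chunk is reduced to a single number before the next one is read, peak memory use depends on the chunk size rather than on the size of the file.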

The next natural step was to add support for Hadoop streaming. The major goal was to make it possible to compute using Hadoop MapReduce by writing code that feels natural, much like using lapply on data chunks, without needing to know anything about Hadoop. See the wiki page for the idea and the hmr function for the documentation.

Install

install.packages('iotools')

Monthly Downloads

1,724

Version

0.3-1

License

GPL-2 | GPL-3

Maintainer

Last Published

March 9th, 2020

Functions in iotools (0.3-1)

chunk
Functions for very fast chunk-wise processing

fdrbind
Fast row-binding of lists and data frames

chunk.apply
Process input by applying a function to each chunk

ctapply
Fast tapply() replacement functions

idstrsplit
Create an iterator for splitting binary or character input into a data frame

write.csv.raw
Fast data output to disk

as.output
Character output

chunk.map
Map a function over a file by chunks

dstrsplit
Split binary or character input into a data frame

readAsRaw
Read binary data in as raw

.default.formatter
Default formatter, corresponding to the as.output functions

imstrsplit
Create an iterator for splitting binary or character input into a matrix

dstrfw
Split fixed-width input into a data frame

mstrsplit
Split binary or character input into a matrix

output.file
Write an R object to a file as a character string

which.min.key
Determine the next key in bytewise order

line.merge
Merge multiple sources

input.file
Load a file on the disk

read.csv.raw
Fast data frame input
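
To give a feel for one of the helpers listed above, here is a small sketch of ctapply, assuming iotools is installed. The values and keys are made up. Unlike tapply(), ctapply expects equal index values to be contiguous, so the input is sorted by key first.

```r
library(iotools)

# Made-up values with a grouping key per element
values <- c(4, 1, 3, 2, 5)
keys   <- c("b", "a", "b", "a", "b")

# ctapply relies on equal keys being adjacent, so order by key first
ord <- order(keys)
sums <- ctapply(values[ord], keys[ord], sum)

# The base R equivalent, for comparison
base_sums <- tapply(values, keys, sum)
```

Both calls produce one sum per key; the contiguity assumption is what allows ctapply to avoid the grouping overhead of tapply().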