
hdf5r is an R interface to the HDF5 library. It is implemented using R6 classes based on the HDF5 C API. The package supports all data types specified by HDF5 (including references) and provides many convenience functions as well as an extensive selection of the native HDF5 C API functions. hdf5r is available on GitHub and has been released on CRAN for all major platforms (Windows, OS X, Linux). It is also tested using several hundred assertions.

HDF5 is an excellent library and data model for storing huge amounts of data in a binary file format. Because it supports most major platforms and programming languages, it can be used to exchange data files in a language-independent format. Unlike R's built-in save() and load() functions, it also supports reading and writing only parts of a binary data file, and can therefore be used to process data that does not fit into memory.
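As a minimal sketch of the partial access mentioned above, assuming hdf5r is already installed (see Install), the following writes a matrix to a temporary file and then reads and overwrites only part of it. The file path and the dataset name "m" are hypothetical and chosen only for this illustration.

```r
library(hdf5r)

# Hypothetical temporary file and dataset name, used only for this sketch
path <- tempfile(fileext = ".h5")
f <- H5File$new(path, mode = "w")

# Write a 100 x 10 integer matrix as a dataset
f[["m"]] <- matrix(1:1000, nrow = 100, ncol = 10)
ds <- f[["m"]]

# Read only rows 1 to 5 of columns 2 and 3; the rest stays on disk
block <- ds[1:5, 2:3]

# Overwrite a single column in place, then read it back
ds[, 1] <- rep(0L, 100)
first_col <- ds[, 1]

# Dimensions of the dataset as stored in the file
dims <- ds$dims

f$close_all()
```

Only the selected block is transferred between file and memory, which is what makes this approach usable for datasets larger than RAM.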

Install

hdf5r is available for all major platforms, namely Linux, OS X and Windows. The package is compatible with HDF5 version 1.8.13 or higher (including version 1.10.0).

Requirements

For OS X and Linux the HDF5 library needs to be installed via one of the (shell) commands specified below:

System | Command
OS X (using Homebrew) | brew install hdf5
Debian-based systems (including Ubuntu) | sudo apt-get install libhdf5-dev
Systems supporting yum and RPMs | sudo yum install hdf5-devel

HDF5 1.8.14 has been pre-compiled for Windows and is available at https://github.com/mannau/h5-libwin, so no manual installation is required on that platform.

Basic Install

The latest release version of hdf5r can be installed from any CRAN Mirror using the R command

install.packages("hdf5r")

For the latest development version from GitHub you can use

devtools::install_github("hhoeflin/hdf5r")

Getting Started

How to Get Help

The package provides most of the regular HDF5-API in addition to a number of convenience functions. As such, the number of available methods is quite large. As the package uses R6 classes, all applicable methods for a class are contained in that class. The easiest way to get an overview of the available methods is to call the methods method.

native_int_type <- h5types$H5T_NATIVE_INT
native_int_type$methods()

Of course, there is also the regular R help that you can call for each class. These help pages tend to be long, as they also document all methods of that class.

help("H5T-class")

Last, I very much recommend reading the included vignette. You can view all vignettes included in the package with

vignette(package="hdf5r")

and call up the introduction vignette with

vignette("hdf5r", package="hdf5r")

Simple Code Example

If you don't have time to read the vignette, which contains more code examples, here is a very brief example that creates a file, writes some data and reads it back again.

library(hdf5r)

test_file <- tempfile(fileext=".h5")
file.h5 <- H5File$new(test_file, mode="w")

data(cars)
file.h5$create_group("test")
file.h5[["test/cars"]] <- cars
cars_ds <- file.h5[["test/cars"]]
h5attr(cars_ds, "rownames") <- rownames(cars)

## Close the file at the end.
## The 'close' method closes only the file-id, but leaves objects inside the file open.
## This may prevent re-opening of the file. 'close_all' closes the file and all objects in it.
file.h5$close_all()
## now re-open it
file.h5 <- H5File$new(test_file, mode="r+")

## let's look at the content
file.h5$ls(recursive=TRUE)

cars_ds <- file.h5[["test/cars"]]
## note that for now tables in HDF5 are 1-dimensional, not 2-dimensional
mycars <- cars_ds[]
h5attr_names(cars_ds)
h5attr(cars_ds, "rownames")

file.h5$close_all()
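Besides the ls method used above, the package's function index also lists names (for groups and files), list.datasets and is_hdf5. The following is a small sketch of how they might be combined to inspect a file; the file is rebuilt here with the same layout as in the example above so that the snippet runs on its own, and the variable names are illustrative only.

```r
library(hdf5r)

# Hypothetical temporary file with the same layout as the example above
path <- tempfile(fileext = ".h5")
f <- H5File$new(path, mode = "w")
f$create_group("test")
f[["test/cars"]] <- cars

# Names of the items directly under the root of the file
top_level <- names(f)

# Datasets found in the file
datasets <- list.datasets(f)

f$close_all()

# Check on disk whether the path points to an HDF5 file
looks_like_hdf5 <- is_hdf5(path)
```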

64-bit Integers

Please note that 64-bit signed integers are handled through the bit64 package. For technical reasons, a function that is not bit64-aware can misrepresent 64-bit values from bit64 as doubles of a completely different value. Make sure that the functions you use are bit64-aware, or cast the values to regular numeric values first (be aware that this may result in a loss of precision). To see the issue, compare print(as.integer64(1)) with cat(as.integer64(1), "\n"). Other possible sources of problems include matrix(as.integer64(1)) and min(as.integer64(1), as.integer64(2)), among others. By default, hdf5r tries to return regular R objects (integer or double) wherever this is possible without loss of precision. If you need 64-bit integers, proceed with care and keep these issues in mind.
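The print versus cat difference can be reproduced with bit64 alone. The sketch below assumes only that the bit64 package is installed; the variable names are illustrative.

```r
library(bit64)

x <- as.integer64(1)

# print() dispatches to the integer64 method and shows the intended value
print(x)

# cat() is not bit64-aware: it sees the underlying double storage,
# whose bit pattern encodes a tiny number rather than 1
cat(x, "\n")

# Stripping the class exposes the same raw double that cat() printed
raw_double <- unclass(x)

# Casting to a regular numeric gives the expected value; this is exact here,
# but can lose precision for magnitudes above 2^53
as_num <- as.numeric(x)
```

Converting with as.numeric before handing values to code that is not bit64-aware avoids the misreading, at the cost of precision for very large values.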

License

The hdf5r package is licensed under the Apache License, Version 2.0. HDF5 itself does not ship with the hdf5r package on Linux or Mac, but on Windows the downloadable binary compiled on CRAN includes the HDF5 binary. The HDF5 copyright notice can be found below.

hdf5r package

Copyright 2016 Novartis Institutes for BioMedical Research Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

HDF5

The licensing terms of HDF5 can as of this writing be found in the inst/HDF5_COPYRIGHTS file or online at

https://support.hdfgroup.org/ftp/HDF5/releases/COPYING

Monthly Downloads

17,021

Version

1.3.3

License

Apache License 2.0 | file LICENSE

Last Published

August 18th, 2020

Functions in hdf5r (1.3.3)

H5Group_access

Retrieve object from a group or file
H5P-class

Class for HDF5 property lists.
H5P_DATASET_CREATE-class

Class for HDF5 property list for dataset creation
H5P_DATASET_ACCESS-class

Class for HDF5 property list for dataset access
H5S_H5D_subset_assign

Selecting and assigning subsets of HDF5-Spaces and HDF5-Datasets
H5T-class

Class for HDF5 datatypes.
H5T_INTEGER-class

Class for HDF5 integer-datatypes.
create_empty

Create an empty R-object according to a given HDF5 datatype
H5T_FLOAT-class

Class for HDF5 floating point datatypes.
do_reshuffle

Reshuffle the input as needed - see args_regularity_evaluation
guess_chunks

Guess the dimension of a chunk
get_id

Get the id of an H5RefClass
H5P_DATASET_XFER-class

Class for HDF5 property list for dataset transfer
H5S-class

Class for representing HDF5 spaces
H5File-class

Class for interacting with HDF5 files.
H5P_DEFAULT-class

Class for default values for HDF5 property lists.
H5S_ALL-class

Class for HDF5 default space
H5P_FILE_ACCESS-class

Class for HDF5 property list for file access
H5P_FILE_CREATE-class

Class for HDF5 property list for file creation
H5A-class

Class for representing HDF5 attributes
H5D-class

Class for representing HDF5 datasets
H5RefClass-class

Base class that tracks the ids and allows for closing an id
H5R_functions

Various functions for H5R objects
H5T_ARRAY-class

Class for HDF5 array datatypes.
H5T_COMPLEX-class

Class for HDF5 complex datatypes
H5GTD_factory

Wrap an HDF5-id in the appropriate class
H5Group-class

Class for representing HDF5 groups
H5T_factory

H5T_VLEN-class

Class for HDF5 variable-length datatypes.
H5P_CLASS-class

Class for HDF5 property list classes (not HDF5 property lists)
H5P_ATTRIBUTE_CREATE-class

Class for HDF5 property list for attribute creation
H5P_LINK_ACCESS-class

Class for HDF5 property list for link access
H5T_extractID

Extract HDF5-ids and return as a vector
H5_close_any

Closes any HDF5 id using the appropriate library function
array_counter

Cycle through n-dimensional array indices
H5P_OBJECT_COPY-class

Class for HDF5 property list for object copying
H5P_LINK_CREATE-class

Class for HDF5 property list for link creation
H5P_OBJECT_CREATE-class

Class for HDF5 property list for object creation
h5-wrapper

Wrapper functions to provide an h5 compatible interface.
h5attributes

Interface for HDF5 attributes
match.call.withDef

Match arguments in a call to function and add default values
array_reorder

Reorder an array
equal_id_check

Compare the ids of objects
h5const

All constants used in HDF5
expand_point_grid

Expand list of points for each dimension into a matrix of all combinations
names.H5Group

Get the names of the items in the group or at the / root of the file
H5R_OBJECT-class

Class for HDF5 Object-references.
H5R_DATASET_REGION-class

Class for HDF5 dataset-region references.
standalone_H5D_get_type

Get the id of a type of the dataset
regularity_eval_to_selection

Turn regularity evaluation into a selection for a space object
H5T_LOGICAL-class

Class for HDF5 logical datatypes. This is an enum with the 3 values FALSE, TRUE and NA mapped on values 0, 1 and 2. Is transparently mapped onto a logical variable
RToH5

Low-level conversion functions from R to HDF5 and vice versa
H5T_STRING-class

Class for HDF5 string datatypes.
is_hdf5

Check if a file is an HDF5 file
h5garbage_collect

Trigger the HDF5 garbage collection
list-groups-datasets

List Groups and Datasets in object
$.types_env

Retrieving a copy of a type
apply_selection

Apply a selection to a space
H5T_COMPOUND-class

Class for HDF5 compound datatypes.
H5T_ENUM-class

Class for HDF5 enumeration datatypes.
clean_ls_df

Cleaning result of internal R_H5ls function
as_hex

Convert a double or integer to hex
check_arg_for_hyperslab_func

Check argument for known functions that encode a hyperslab
convertRoundTrip

Round-trip of converting data to HDF5 and back to R
guess_space

Guess the dataspace of an object
factor_ext_functions

Various functions for factor_ext objects
flatten_df

Flatten a nested data.frame
guess_nelem

Guess the HDF5 datatype of an R object
print.data.frame_ext

Print a data frame with extended factor objects
print_attributes

Print attributes
hdf5r-package

hdf5r: A package to provide an interface to hdf5 from R
hyperslab_to_points

Single hyperslab dimension to explicit vector
H5P_factory

Create an H5P out of an id
H5R-class

Class for HDF5 Reference datatypes.
args_regularity_evaluation

Evaluate if the arguments are regular for hyperslab use
factor_ext

Create an extended factor
h5types

These are all types that are used in HDF5
extract_dim

Set the correct dimension attribute for an object
are_args_scalar

Can arguments be interpreted as a scalar?
h5version

Return the version of the HDF5-API
print_class_id

Print the class and ID
text_to_dtype

Convert a text description to a datatype
print_listing

Print listing
standalone_H5S_select_multiple_hyperslab

Select multiple hyperslabs in a space
H5File.open

Open an HDF5 file