h2o.splitFrame

Split an existing H2O data set according to user-specified ratios. The number of
subsets is always one more than the number of ratios given. Note that the split is
not exact: to stay efficient on big data, H2O uses a probabilistic splitting method
rather than an exact one. For example, when specifying a split of 0.75/0.25, H2O will
produce a train/test split whose sizes have an expected value of 0.75/0.25 rather than
being exactly 0.75/0.25. On small data sets, the sizes of the resulting splits will
deviate from the expected value more than on big data, where they will be very close
to exact.
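
The effect of a probabilistic split can be illustrated in base R without an H2O cluster. The sketch below is only an approximation of the idea, assuming each row is assigned independently from a uniform random draw; it is not H2O's actual implementation, and `probabilistic_split` is a hypothetical helper introduced here for illustration.

```r
# Illustrative only: per-row random assignment, NOT H2O's internal algorithm.
# probabilistic_split() is a hypothetical helper, not part of the h2o package.
probabilistic_split <- function(n_rows, ratios, seed = NULL) {
  stopifnot(all(ratios > 0), sum(ratios) < 1)
  if (!is.null(seed)) set.seed(seed)
  # Interval boundaries on [0, 1]; the last split takes the remainder.
  breaks <- c(0, cumsum(ratios), 1)
  # One uniform draw per row, bucketed into a split index 1..(length(ratios) + 1).
  u <- runif(n_rows)
  findInterval(u, breaks, rightmost.closed = TRUE)
}

small <- probabilistic_split(20, ratios = 0.75, seed = 42)
large <- probabilistic_split(1e6, ratios = 0.75, seed = 42)

# Observed share of rows in the first split; the expected value is 0.75 in both cases,
# but the small sample can land noticeably further from it.
mean(small == 1)
mean(large == 1)
```

With 20 rows the observed share can easily be off by several percentage points, while with a million rows it typically agrees with 0.75 to about three decimal places.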

R interface for 'H2O', the scalable open source machine learning
platform that offers parallelized implementations of many supervised and
unsupervised machine learning algorithms such as Generalized Linear
Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests,
Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes,
Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection,
Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Tomas Fryda

R Interface for the 'H2O' Scalable Machine Learning Platform

Erin LeDell

Navdeep Gill

Spencer Aiello

Anqi Fu

Arno Candel

Cliff Click

Tom Kraljevic

Tomas Nykodym

Patrick Aboyoun

Michal Kurka

Michal Malohlava

Sebastien Poirier

Wendy Wong

Ludi Rehak

Eric Eckstrand

Brandon Hill

Sebastian Vidrio

Surekha Jadhawani

Amy Wang

Raymond Peck

Jan Gorecki

Matt Dowle

Yuan Tang

Lauren DiPerna

Veronika Maurerova

Yuliia Syzon

Adam Valenta

Marek Novotny

H2O.ai 

h2o.splitFrame function

Arguments

<dl><dt>data</dt>
<dd>An H2OFrame object, to be split.</dd>
<dt>ratios</dt>
<dd>A numeric value or array indicating the ratio of total rows
contained in each split. The values must sum to less than 1, e.g. c(0.8) for an 80/20 split.</dd>
<dt>destination_frames</dt>
<dd>An array of frame IDs whose length equals the number of values
specified in the ratios array, plus one.</dd>
<dt>seed</dt>
<dd>Random seed.</dd></dl>

Split an H2O Data Set — h2o.splitFrame


Examples
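
A minimal sketch of a three-way split, assuming the h2o package is installed and a local H2O cluster can be started with `h2o.init()`. The frame IDs passed to `destination_frames` and the seed value are arbitrary choices made for illustration.

```r
library(h2o)
h2o.init()  # starts or connects to a local H2O cluster

# Convert a built-in R data set (150 rows) to an H2OFrame.
iris_hf <- as.h2o(iris)

# Two ratios yield three subsets: roughly 60% / 20% / the remaining 20%.
splits <- h2o.splitFrame(
  data = iris_hf,
  ratios = c(0.6, 0.2),
  # Illustrative frame IDs: one per ratio, plus one for the remainder.
  destination_frames = c("iris_train", "iris_valid", "iris_test"),
  seed = 1234
)

# The result is a list of H2OFrames, one per subset.
train <- splits[[1]]
valid <- splits[[2]]
test  <- splits[[3]]

# Row counts are close to, but not necessarily exactly, 90 / 30 / 30.
sapply(splits, h2o.nrow)
```

Because the split is probabilistic, fixing `seed` is what makes the partition reproducible between runs on the same data.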