Usage
h2o.createFrame(rows = 10000, cols = 10, randomize = TRUE, value = 0,
  real_range = 100, categorical_fraction = 0.2, factors = 100,
  integer_fraction = 0.2, integer_range = 100, binary_fraction = 0.1,
  binary_ones_fraction = 0.02, time_fraction = 0, string_fraction = 0,
  missing_fraction = 0.01, response_factors = 2, has_response = FALSE,
  seed, seed_for_column_types)
Arguments
rows
The number of rows of data to generate.
cols
The number of columns of data to generate. Excludes the response column if has_response = TRUE.
randomize
A logical value indicating whether data values should be randomly generated. This must be TRUE if either categorical_fraction or integer_fraction is non-zero.
value
If randomize = FALSE, then all real-valued entries will be set to this value.
real_range
The range of randomly generated real values.
categorical_fraction
The fraction of total columns that are categorical.
factors
The number of (unique) factor levels in each categorical column.
integer_fraction
The fraction of total columns that are integer-valued.
integer_range
The range of randomly generated integer values.
binary_fraction
The fraction of total columns that are binary-valued.
binary_ones_fraction
The fraction of values in a binary column that are set to 1.
time_fraction
The fraction of randomly created date/time columns.
string_fraction
The fraction of randomly created string columns.
missing_fraction
The fraction of total entries in the data frame that are set to NA.
response_factors
If has_response = TRUE, then this is the number of factor levels in the response column.
has_response
A logical value indicating whether an additional response column should be prepended to the final H2O data frame. If set to TRUE, the total number of columns will be cols + 1.
seed
A seed used to generate random values when randomize = TRUE.
seed_for_column_types
A seed used to generate random column types when randomize = TRUE.
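Examples
The following is a minimal sketch of how the arguments above might be combined, not a definitive recipe. It assumes the h2o package is installed and that h2o.init() can start or connect to a local H2O cluster. The specific values (1000 rows, 5 factor levels, seed 1234) are illustrative choices only.

```r
library(h2o)

# Assumption: a local H2O cluster can be started or is already running.
h2o.init()

# Request 10 predictor columns plus a 2-level response column.
# The categorical, integer, and binary fractions sum to 0.5 here.
fr <- h2o.createFrame(
  rows = 1000, cols = 10, randomize = TRUE,
  categorical_fraction = 0.2, factors = 5,
  integer_fraction = 0.2, integer_range = 100,
  binary_fraction = 0.1, binary_ones_fraction = 0.02,
  missing_fraction = 0.01,
  has_response = TRUE, response_factors = 2,
  seed = 1234  # fixed seed so the random values are repeatable
)

# With has_response = TRUE, the frame should have cols + 1 columns.
dim(fr)

# Per-column summary, useful for checking the generated column types.
h2o.describe(fr)
```

Setting seed makes the generated values repeatable across calls with identical arguments. Supplying seed_for_column_types as well should fix the assignment of types to columns.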