- rows
The number of rows of data to generate.
- cols
The number of columns of data to generate. Excludes the response column if has_response = TRUE.
- randomize
A logical value indicating whether data values should be randomly generated. This must be TRUE if either categorical_fraction or integer_fraction is non-zero.
- value
If randomize = FALSE, then all real-valued entries will be set to this value.
- real_range
The range of randomly generated real values.
- categorical_fraction
The fraction of total columns that are categorical.
- factors
The number of (unique) factor levels in each categorical column.
- integer_fraction
The fraction of total columns that are integer-valued.
- integer_range
The range of randomly generated integer values.
- binary_fraction
The fraction of total columns that are binary-valued.
- binary_ones_fraction
The fraction of values in a binary column that are set to 1.
- time_fraction
The fraction of total columns that are randomly generated date/time columns.
- string_fraction
The fraction of total columns that are randomly generated string columns.
- missing_fraction
The fraction of total entries in the data frame that are set to NA.
- response_factors
If has_response = TRUE, then this is the number of factor levels in the response column.
- has_response
A logical value indicating whether an additional response column should be prepended to the final H2O data frame. If set to TRUE, the total number of columns will be cols+1.
- seed
A seed used to generate random values when randomize = TRUE.
- seed_for_column_types
A seed used to generate random column types when randomize = TRUE.
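
The arguments above can be combined as in the following minimal sketch. It assumes they belong to h2o.createFrame in the h2o R package (the function name is not stated in this section) and that a local H2O cluster can be started with h2o.init(). The specific values are illustrative only, not recommended defaults.

```r
library(h2o)

# Assumption: a local H2O cluster is available or can be launched.
h2o.init()

# Assumption: the arguments documented above are those of h2o.createFrame().
frame <- h2o.createFrame(
  rows = 1000,                 # number of rows to generate
  cols = 10,                   # predictor columns; response column not counted
  randomize = TRUE,            # required because the fractions below are non-zero
  real_range = 100,            # range of the real-valued columns
  categorical_fraction = 0.2,  # 20% of columns are categorical
  factors = 5,                 # levels per categorical column
  integer_fraction = 0.2,      # 20% of columns are integer-valued
  integer_range = 50,          # range of the integer columns
  binary_fraction = 0.1,       # 10% of columns are binary
  binary_ones_fraction = 0.3,  # share of 1s within each binary column
  time_fraction = 0,           # no date/time columns
  string_fraction = 0,         # no string columns
  missing_fraction = 0.05,     # 5% of entries set to NA
  has_response = TRUE,         # prepend a response column
  response_factors = 2,        # response column has two levels
  seed = 1234,                 # seed for the generated values
  seed_for_column_types = 42   # seed for the assignment of column types
)

# With has_response = TRUE, the description above implies cols + 1 columns.
dim(frame)
```

Setting both seeds is intended to make the generated values and the column-type layout repeatable between runs. The remaining share of columns (here 50%) would be real-valued.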