datarobot (version 2.18.6)

CreateUserPartition: Create a class partition object for use in the SetTarget function representing a user-defined partition.

Description

Creates a list object used by the SetTarget function to specify either Training/Validation/Holdout (validationType = "TVH") or cross-validation (validationType = "CV") partitions of the modeling dataset based on the values included in a column from the dataset. In either case, the name of this data column must be specified (as userPartitionCol).

Usage

CreateUserPartition(
  validationType,
  userPartitionCol,
  cvHoldoutLevel = NULL,
  trainingLevel = NULL,
  holdoutLevel = NULL,
  validationLevel = NULL
)

Value

An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a user-specified partition of the modeling dataset.

Arguments

validationType

character. String specifying the type of partition generated, either "TVH" or "CV".

userPartitionCol

character. String naming the data column from the modeling dataset containing the subset designations.

cvHoldoutLevel

character. Data value from userPartitionCol that identifies the holdout subset under the "CV" option.

trainingLevel

character. Data value from userPartitionCol that identifies the training subset under the "TVH" option.

holdoutLevel

character. Data value from userPartitionCol that identifies the holdout subset under both "TVH" and "CV" options. To specify that the project should not use a holdout you can omit this parameter or pass NA directly.

validationLevel

character. Data value from userPartitionCol that identifies the validation subset under the "TVH" option.

Details

For the "TVH" option of validationType, no cross-validation is used. Users must specify the trainingLevel and validationLevel; a holdoutLevel is recommended but not required. If no holdoutLevel is used, the column must contain exactly 2 unique values. If a holdoutLevel is used, the column must contain exactly 3 unique values.
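
The "TVH" rules above can be sketched as follows. This is an illustration only: the data frame modelingData, the column name tvhFlag, and the level values "T", "V", and "H" are hypothetical. The check on the number of unique values is done in base R, and the call to CreateUserPartition runs only if the datarobot package happens to be installed.

```r
# Hypothetical modeling data with a user-defined partition column.
modelingData <- data.frame(
  x = c(1.2, 3.4, 5.6, 7.8, 9.0, 2.1),
  y = c(0, 1, 0, 1, 1, 0),
  tvhFlag = c("T", "T", "T", "V", "V", "H"),
  stringsAsFactors = FALSE
)

# With a holdout level, the column should contain exactly 3 unique values
# (it would be exactly 2 if no holdout level were used).
partitionLevels <- unique(modelingData$tvhFlag)
stopifnot(length(partitionLevels) == 3)

# Build the partition object only when the datarobot package is available.
if (requireNamespace("datarobot", quietly = TRUE)) {
  tvhPartition <- datarobot::CreateUserPartition(
    validationType = "TVH",
    userPartitionCol = "tvhFlag",
    trainingLevel = "T",
    validationLevel = "V",
    holdoutLevel = "H"
  )
  print(class(tvhPartition))
}
```

Checking the unique values before building the partition is a design choice of this sketch, not something the function is documented to do; it simply surfaces a malformed column before a project is started.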

For the "CV" option, each value in the column will be used to separate rows into cross-validation folds. Use of a holdoutLevel is optional; if not specified, then no holdout is used.
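
A comparable sketch for the "CV" option is shown below. The data frame foldData, the column name foldId, and the level "hold" are hypothetical; the holdout level is passed through cvHoldoutLevel, as described in the Arguments section. As before, the datarobot call is guarded so the base R part runs on its own.

```r
# Hypothetical data in which each row is assigned to a fold by a column value.
foldData <- data.frame(
  x = seq_len(8),
  foldId = c("f1", "f2", "f3", "hold", "f1", "f2", "f3", "hold"),
  stringsAsFactors = FALSE
)

# Every value other than the holdout level defines one cross-validation fold.
cvLevels <- setdiff(unique(foldData$foldId), "hold")

# Build the partition object only when the datarobot package is available.
if (requireNamespace("datarobot", quietly = TRUE)) {
  cvPartition <- datarobot::CreateUserPartition(
    validationType = "CV",
    userPartitionCol = "foldId",
    cvHoldoutLevel = "hold"
  )
  print(class(cvPartition))
}
```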

This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateStratifiedPartition.

See Also

CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition.

Examples

CreateUserPartition(validationType = "CV", userPartitionCol = "TVHflag", cvHoldoutLevel = NA)