This function is a convenient shorthand to start a project and set the target.
See SetupProject and SetTarget.
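As a rough sketch of what this shorthand stands in for, the two-step form might look like the following. It assumes an authenticated session already exists (for example via ConnectToDataRobot) and uses the built-in iris data purely for illustration. The block is wrapped in if (FALSE), as in the example on this page, because it needs a live DataRobot connection.

```r
library(datarobot)

if (FALSE) {
  # Step 1: upload the data and create the project (no target yet).
  project <- SetupProject(dataSource = iris, projectName = "iris")

  # Step 2: set the target on the project created above.
  SetTarget(project,
            target = "Species",
            targetType = TargetType$Multiclass)
}
```

The two-step form can be useful when something has to happen between project creation and target selection, such as inspecting the valid metrics with GetValidMetrics.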
StartProject(
  dataSource,
  projectName = NULL,
  target,
  metric = NULL,
  weights = NULL,
  partition = NULL,
  mode = NULL,
  seed = NULL,
  targetType = NULL,
  positiveClass = NULL,
  blueprintThreshold = NULL,
  responseCap = NULL,
  featurelistId = NULL,
  smartDownsampled = NULL,
  majorityDownsamplingRate = NULL,
  accuracyOptimizedBlueprints = NULL,
  offset = NULL,
  exposure = NULL,
  eventsCount = NULL,
  monotonicIncreasingFeaturelistId = NULL,
  monotonicDecreasingFeaturelistId = NULL,
  onlyIncludeMonotonicBlueprints = FALSE,
  workerCount = NULL,
  wait = FALSE,
  checkInterval = 20,
  timeout = NULL,
  username = NULL,
  password = NULL,
  verbosity = 1,
  maxWait = 600
)
object. Either (a) the name of a CSV file, (b) a dataframe or (c) a URL to a publicly available file; in each case, this parameter identifies the source of the data from which all project models will be built. See Details.
character. Optional. String specifying a project name.
character. String giving the name of the response variable to be predicted by all project models.
character. Optional. String specifying the model fitting metric to be optimized; a list of valid options for this parameter, which depends on both project and target, may be obtained with the function GetValidMetrics.
character. Optional. String specifying the name of the column from the modeling dataset to be used as weights in model fitting.
partition. Optional. S3 object of class 'partition' whose elements specify a valid partitioning scheme. See help for functions CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition and CreateDatetimePartitionSpecification.
character. Optional. Specifies the autopilot mode used to start the modeling project. See AutopilotMode for valid options; AutopilotMode$Quick is the default.
integer. Optional. Seed for the random number generator used in creating random partitions for model fitting.
character. Optional. Specifies the target type to use for the project. Valid options are "Binary", "Multiclass", "Regression". Set to "Multiclass" to enable multiclass modeling. Otherwise, it can help to disambiguate, for example by telling DataRobot how to handle a numeric target with a few unique values that could be used for either multiclass or regression. See TargetType for an easier way to keep track of the options.
character. Optional. Target variable value corresponding to a positive response in binary classification problems.
integer. Optional. The maximum time (in hours) that any modeling blueprint is allowed to run before being excluded from subsequent autopilot stages.
numeric. Optional. Floating point value, between 0.5 and 1.0, specifying a capping limit for the response variable. The default value NULL corresponds to an uncapped response, equivalent to responseCap = 1.0.
numeric. Specifies which feature list to use. If NULL (default), a default featurelist is used.
logical. Optional. Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects.
numeric. Optional. Floating point value, between 0.0 and 100.0. The percentage of the majority rows that should be kept. Specify only if using smart downsampling. May not cause the majority class to become smaller than the minority class.
logical. Optional. When enabled, accuracy optimized blueprints will run in autopilot for the project. These are longer-running model blueprints that provide increased accuracy over normal blueprints that run during autopilot.
character. Optional. Vector of the names of the columns containing the offset of each row.
character. Optional. The name of a column containing the exposure of each row.
character. Optional. The name of a column specifying the events count.
character. Optional. The id of the featurelist that defines the set of features with a monotonically increasing relationship to the target. If NULL (default), no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired. The featurelist itself can also be passed as this parameter.
character. Optional. The id of the featurelist that defines the set of features with a monotonically decreasing relationship to the target. If NULL (default), no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired. The featurelist itself can also be passed as this parameter.
logical. Optional. When TRUE, only blueprints that support enforcing monotonic constraints will be available in the project or selected for the autopilot.
integer. The number of workers to run (default 2). Use "max" to set to the maximum number of workers available.
logical. If TRUE, invokes WaitForAutopilot to block execution until the autopilot is complete.
numeric. Optional. Maximum wait (in seconds) between checks that Autopilot is finished. Defaults to 20.
numeric. Optional. Time (in seconds) after which to give up (default is no timeout). An error is raised if Autopilot has not finished before the timeout is reached.
character. The username to use for authentication to the database.
character. The password to use for authentication to the database.
numeric. Optional. 0 is silent, 1 or more displays information about progress. Default is 1.
integer. Specifies how many seconds to wait for the server to finish analyzing the target and begin the modeling process. If the process takes longer than this parameter specifies, execution will stop (but the server will continue to process the request).
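Several of the arguments above are typically combined in one call. The following is a minimal sketch, not a recommended configuration: the file name churn.csv, the target column churned, and the positive class "yes" are hypothetical, and an authenticated DataRobot session is assumed. It is wrapped in if (FALSE) because it needs a live connection.

```r
library(datarobot)

if (FALSE) {
  # Stratified 5-fold cross-validation with a 20% holdout.
  partition <- CreateStratifiedPartition(validationType = "CV",
                                         holdoutPct = 20,
                                         reps = 5)

  # "churn.csv", "churned" and "yes" are hypothetical placeholders.
  StartProject(dataSource = "churn.csv",
               projectName = "churn",
               target = "churned",
               targetType = TargetType$Binary,
               positiveClass = "yes",
               partition = partition,
               smartDownsampled = TRUE,        # drop excess majority-class rows
               majorityDownsamplingRate = 50,  # keep 50% of majority rows
               workerCount = "max")
}
```

Note that majorityDownsamplingRate is only supplied here because smartDownsampled is TRUE, as the argument description requires.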
if (FALSE) {
  # Requires an authenticated DataRobot session.
  library(datarobot)
  StartProject(iris,
               projectName = "iris",
               target = "Species",
               targetType = TargetType$Multiclass)
}
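A second sketch illustrates the blocking behaviour controlled by wait, checkInterval and timeout. It assumes an authenticated session and assumes that the value returned by StartProject can be passed to ListModels; the project name is a placeholder. As above, it is wrapped in if (FALSE) because it needs a live connection.

```r
library(datarobot)

if (FALSE) {
  # Block until Autopilot finishes, checking at most every 30 seconds
  # and giving up with an error after one hour.
  project <- StartProject(iris,
                          projectName = "iris-blocking",
                          target = "Species",
                          targetType = TargetType$Multiclass,
                          mode = AutopilotMode$Quick,
                          wait = TRUE,
                          checkInterval = 30,
                          timeout = 3600,
                          verbosity = 0)  # suppress progress messages

  # Assumption: the returned object identifies the finished project.
  models <- ListModels(project)
}
```

With wait = FALSE (the default), the call returns as soon as modeling has started, and WaitForAutopilot can be invoked separately later.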