An Estimator wraps the run configuration that specifies how an R script is executed.
Submitting an Estimator with submit_experiment() executes your training script on the
specified compute target and returns a ScriptRun object.
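As a rough sketch of that workflow, the function below builds an estimator, submits it, and returns the resulting run. It assumes the azuremlsdk package is installed and that a workspace config file is available to load_workspace_from_config(). The cluster name "cpu-cluster", the experiment name "estimator-demo", and the script name "train.R" are hypothetical placeholders.

```r
# Sketch only: calling this requires the azuremlsdk package and an Azure ML
# workspace. "cpu-cluster", "estimator-demo", and "train.R" are hypothetical.
submit_training_run <- function(source_directory = ".",
                                cluster_name = "cpu-cluster",
                                experiment_name = "estimator-demo") {
  ws <- azuremlsdk::load_workspace_from_config()
  compute_target <- azuremlsdk::get_compute(ws, cluster_name = cluster_name)

  # Describe what to run and where to run it
  est <- azuremlsdk::estimator(
    source_directory = source_directory,
    entry_script = "train.R",
    compute_target = compute_target
  )

  exp <- azuremlsdk::experiment(ws, experiment_name)

  # submit_experiment() starts the run and returns a ScriptRun object
  run <- azuremlsdk::submit_experiment(exp, est)
  azuremlsdk::wait_for_run_completion(run, show_output = TRUE)
  run
}
```

The function is only defined here; nothing is submitted until submit_training_run() is called.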
To define the environment to use for training, you can either provide the
environment-related parameters (e.g. cran_packages, custom_docker_image) directly
to estimator(), or provide an Environment object to the environment parameter.
For more information on the predefined Docker images that are used for training
when custom_docker_image is not specified, see the documentation.
estimator(
  source_directory,
  compute_target = NULL,
  vm_size = NULL,
  vm_priority = NULL,
  entry_script = NULL,
  script_params = NULL,
  cran_packages = NULL,
  github_packages = NULL,
  custom_url_packages = NULL,
  custom_docker_image = NULL,
  image_registry_details = NULL,
  use_gpu = FALSE,
  environment_variables = NULL,
  shm_size = NULL,
  max_run_duration_seconds = NULL,
  environment = NULL,
  inputs = NULL
)
source_directory: A string giving the path to the local directory that contains the experiment configuration and code files needed for the training job.
compute_target: The AmlCompute object for the compute target where training will happen.
vm_size: A string giving the VM size of the compute target that will be created for the training job. The available VM sizes are listed in the Azure documentation. Provide this parameter if you want AmlCompute to be created as the compute target at run time, instead of providing an existing cluster to the compute_target parameter. If vm_size is specified, a single-node cluster is automatically created for your run and is deleted automatically once the run completes.
vm_priority: A string, either 'dedicated' or 'lowpriority', that specifies the VM priority of the compute target that will be created for the training job. Defaults to 'dedicated'. This takes effect only when the vm_size parameter is specified.
entry_script: A string representing the relative path to the file used to start training.
script_params: A named list of the command-line arguments to pass to the training script specified in entry_script.
cran_packages: A list of cran_package objects to be installed.
github_packages: A list of github_package objects to be installed.
custom_url_packages: A character vector of packages to be installed from a local directory or custom URL.
custom_docker_image: A string giving the name of the Docker image from which the image to use for training will be built. If not set, a predefined image will be used as the base image. To use an image from a private Docker repository, you must also specify the image_registry_details parameter.
image_registry_details: A ContainerRegistry object with the details of the Docker image registry for the custom Docker image.
use_gpu: Indicates whether the environment in which the experiment runs should support GPUs. If TRUE, a predefined GPU-based Docker image will be used in the environment. If FALSE, a predefined CPU-based image will be used. Predefined Docker images (CPU or GPU) are used only if the custom_docker_image parameter is not set.
environment_variables: A named list of environment variable names and values. These environment variables are set on the process in which the user script is executed.
shm_size: A string giving the size of the Docker container's shared memory block. For more information, see the Docker run reference. If not set, a default value of '2g' is used.
max_run_duration_seconds: An integer giving the maximum allowed time for the run, in seconds. Azure ML will attempt to automatically cancel the run if it takes longer than this value.
environment: The Environment object that configures the R environment where the experiment is executed. This parameter is mutually exclusive with the other environment-related parameters custom_docker_image, image_registry_details, use_gpu, environment_variables, shm_size, cran_packages, github_packages, and custom_url_packages; if set, it takes precedence over those parameters.
inputs: A list of DataReference objects or DatasetConsumptionConfig objects to use as input.
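To make the script_params argument more concrete, here is a minimal base-R sketch of how an entry script might read such arguments. It assumes that each entry of the named list reaches the script as a flag followed by its value (for example --epochs 10), which is an assumption and not something this page states. The helper parse_flag_args() and the flag names are hypothetical and not part of the SDK.

```r
# Hypothetical helper: turn c("--flag", "value", ...) into a named list.
# Assumes the arguments arrive strictly as flag/value pairs.
parse_flag_args <- function(args) {
  if (length(args) == 0) {
    return(list())
  }
  if (length(args) %% 2 != 0) {
    stop("Expected flag/value pairs, got an odd number of arguments.")
  }
  flags <- args[seq(1, length(args), by = 2)]
  values <- args[seq(2, length(args), by = 2)]
  if (!all(startsWith(flags, "--"))) {
    stop("Every flag is expected to start with '--'.")
  }
  # Strip the leading dashes so values can be accessed as opts$name
  stats::setNames(as.list(values), sub("^--", "", flags))
}

# Inside an entry script one would typically read the real arguments:
#   opts <- parse_flag_args(commandArgs(trailingOnly = TRUE))
# Here the arguments are simulated so the sketch runs on its own.
opts <- parse_flag_args(c("--data_folder", "data", "--epochs", "10"))

# Command-line values arrive as strings, so convert numeric ones explicitly
epochs <- as.integer(opts$epochs)
```

Under the same assumption, the matching estimator() call would pass something like script_params = list("--data_folder" = "data", "--epochs" = 10).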
The Estimator object.
r_env <- r_environment(name = "r-env",
                       cran_packages = list(cran_package("dplyr"),
                                            cran_package("ggplot2")))
est <- estimator(source_directory = ".",
                 entry_script = "train.R",
                 compute_target = compute_target,
                 environment = r_env)
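As a complement to the example above, the following sketch passes the environment-related parameters directly instead of an Environment object, and requests a run-time single-node cluster through vm_size rather than an existing compute_target. The VM size "STANDARD_D2_V2", the package choices, the environment variable, and the one-hour limit are illustrative assumptions. Calling the function requires the azuremlsdk package.

```r
# Sketch only: values below are illustrative, not recommendations.
build_direct_estimator <- function(source_directory = ".") {
  azuremlsdk::estimator(
    source_directory = source_directory,
    entry_script = "train.R",               # hypothetical script name
    vm_size = "STANDARD_D2_V2",             # assumed VM size; cluster is created at run time
    vm_priority = "lowpriority",            # only takes effect because vm_size is set
    cran_packages = list(azuremlsdk::cran_package("dplyr")),
    github_packages = list(azuremlsdk::github_package("tidyverse/ggplot2")),
    environment_variables = list(MY_SETTING = "value"),  # hypothetical variable
    shm_size = "2g",
    max_run_duration_seconds = 3600L
  )
}
```

Because the environment parameter is mutually exclusive with these arguments, this style and the Environment-object style should not be mixed in one call.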
See also: r_environment(), container_registry(), submit_experiment(), dataset_consumption_config(), cran_package()