Update an AKS web service with the provided properties. You can update the web service to use a new model, a new entry script, or new dependencies that can be specified in an inference configuration.
Values left as NULL will remain unchanged in the web service.
update_aks_webservice(
  webservice,
  autoscale_enabled = NULL,
  autoscale_min_replicas = NULL,
  autoscale_max_replicas = NULL,
  autoscale_refresh_seconds = NULL,
  autoscale_target_utilization = NULL,
  auth_enabled = NULL,
  cpu_cores = NULL,
  memory_gb = NULL,
  enable_app_insights = NULL,
  scoring_timeout_ms = NULL,
  replica_max_concurrent_requests = NULL,
  max_request_wait_time = NULL,
  num_replicas = NULL,
  tags = NULL,
  properties = NULL,
  description = NULL,
  models = NULL,
  inference_config = NULL,
  gpu_cores = NULL,
  period_seconds = NULL,
  initial_delay_seconds = NULL,
  timeout_seconds = NULL,
  success_threshold = NULL,
  failure_threshold = NULL,
  namespace = NULL,
  token_auth_enabled = NULL
)
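As an illustration of a partial update, the sketch below swaps in a different registered model and entry script and leaves every other argument at NULL, so those settings stay as they are. It assumes the azuremlsdk package is installed and that a workspace config.json is available. The helper name redeploy_model is hypothetical (it is not part of azuremlsdk), and the service, model, and environment names and file paths in the commented call are placeholders.

```r
# Hypothetical helper (not part of azuremlsdk): point an existing AKS web
# service at a different registered model and entry script.
redeploy_model <- function(ws, service_name, model_name, entry_script,
                           source_directory, environment_name) {
  service <- azuremlsdk::get_webservice(ws, name = service_name)
  model <- azuremlsdk::get_model(ws, name = model_name)
  env <- azuremlsdk::get_environment(ws, name = environment_name)

  config <- azuremlsdk::inference_config(
    entry_script = entry_script,
    source_directory = source_directory,
    environment = env
  )

  # Only models and inference_config are passed; all other arguments stay
  # NULL, so the corresponding service settings are left unchanged.
  azuremlsdk::update_aks_webservice(
    service,
    models = list(model),
    inference_config = config
  )

  # Block until the update has finished rolling out.
  azuremlsdk::wait_for_deployment(service, show_output = TRUE)
  invisible(service)
}

# Example call; every name and path below is a placeholder.
# ws <- azuremlsdk::load_workspace_from_config()
# redeploy_model(ws, "my-aks-service", "my-model", "score.R", ".", "my-env")
```

Using the azuremlsdk:: prefix inside the function means the package is only needed when the helper is actually called.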
webservice: The AksWebservice object.
autoscale_enabled: If TRUE, enable autoscaling for the web service.
autoscale_min_replicas: An int of the minimum number of containers to use when autoscaling the web service.
autoscale_max_replicas: An int of the maximum number of containers to use when autoscaling the web service.
autoscale_refresh_seconds: An int of how often in seconds the autoscaler should attempt to scale the web service.
autoscale_target_utilization: An int of the target utilization (as a percentage, out of 100) the autoscaler should attempt to maintain for the web service.
auth_enabled: If TRUE, enable key-based authentication for the web service. Defaults to TRUE.
cpu_cores: The number of CPU cores to allocate for the web service. Can be a decimal. Defaults to 0.1.
memory_gb: The amount of memory (in GB) to allocate for the web service. Can be a decimal. Defaults to 0.5.
enable_app_insights: If TRUE, enable AppInsights for the web service. Defaults to FALSE.
scoring_timeout_ms: An int of the timeout (in milliseconds) to enforce for scoring calls to the web service.
replica_max_concurrent_requests: An int of the maximum number of concurrent requests per node to allow for the web service.
max_request_wait_time: An int of the maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error.
num_replicas: An int of the number of containers to allocate for the web service. If this parameter is not set, the autoscaler is enabled by default.
tags: A named list of key-value tags for the web service, e.g. list("key" = "value"). Will replace existing tags.
properties: A named list of key-value properties to add for the web service, e.g. list("key" = "value").
description: A string of the description to give the web service.
models: A list of Model objects to package into the updated service.
inference_config: An InferenceConfig object.
gpu_cores: An int of the number of GPU cores to allocate for the web service.
period_seconds: An int of how often in seconds to perform the liveness probe. Minimum value is 1.
initial_delay_seconds: An int of the number of seconds after the container has started before liveness probes are initiated.
timeout_seconds: An int of the number of seconds after which the liveness probe times out. Minimum value is 1.
success_threshold: An int of the minimum consecutive successes for the liveness probe to be considered successful after having failed. Minimum value is 1.
failure_threshold: An int of the number of times Kubernetes will try the liveness probe when a Pod starts and the probe fails, before giving up. Minimum value is 1.
namespace: A string of the Kubernetes namespace in which to deploy the web service: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens.
token_auth_enabled: If TRUE, enable token-based authentication for the web service. If enabled, users can access the web service by fetching an access token using their Azure Active Directory credentials. token_auth_enabled and auth_enabled cannot both be set to TRUE.
Return value: None
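Several of the arguments above carry documented constraints: the namespace format, the minimum value of 1 for four of the liveness probe settings, and the rule that key-based and token-based authentication cannot both be enabled. The following base R sketch checks those constraints before an update is attempted. The function check_aks_update_args is a hypothetical helper, not part of azuremlsdk, and it assumes that an empty namespace string is invalid, which the text above does not state explicitly.

```r
# Hypothetical pre-flight checks (not part of azuremlsdk). Each argument
# mirrors the update_aks_webservice() parameter of the same name; NULL
# means "not being updated" and is skipped.
check_aks_update_args <- function(namespace = NULL,
                                  auth_enabled = NULL,
                                  token_auth_enabled = NULL,
                                  period_seconds = NULL,
                                  timeout_seconds = NULL,
                                  success_threshold = NULL,
                                  failure_threshold = NULL) {
  problems <- character(0)

  # Namespace: 1 to 63 characters from a-z, 0-9 and '-', with no hyphen
  # at either end (the lower bound of 1 is an assumption).
  if (!is.null(namespace) &&
      !grepl("^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$", namespace, perl = TRUE)) {
    problems <- c(problems, "namespace is not a valid Kubernetes namespace")
  }

  # Key-based and token-based authentication are mutually exclusive.
  if (isTRUE(auth_enabled) && isTRUE(token_auth_enabled)) {
    problems <- c(problems,
                  "auth_enabled and token_auth_enabled cannot both be TRUE")
  }

  # Liveness probe settings documented with a minimum value of 1.
  probe <- list(period_seconds = period_seconds,
                timeout_seconds = timeout_seconds,
                success_threshold = success_threshold,
                failure_threshold = failure_threshold)
  for (name in names(probe)) {
    value <- probe[[name]]
    if (!is.null(value) && value < 1) {
      problems <- c(problems, paste(name, "must be at least 1"))
    }
  }

  problems
}
```

The helper returns a character vector of messages, which is empty when no problem is found, so a caller could stop early with something like if (length(problems) > 0) stop(problems) before calling update_aks_webservice().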