Creates a DataSource
from a database hosted on an Amazon Redshift cluster. A DataSource
references data that can be used to perform either create_ml_model
, create_evaluation
, or create_batch_prediction
operations.
See https://www.paws-r-sdk.com/docs/machinelearning_create_data_source_from_redshift/ for full documentation.
machinelearning_create_data_source_from_redshift(
DataSourceId,
DataSourceName = NULL,
DataSpec,
RoleARN,
ComputeStatistics = NULL
)
[required] A user-supplied ID that uniquely identifies the DataSource
.
A user-supplied name or description of the DataSource
.
[required] The data specification of an Amazon Redshift DataSource
:
DatabaseInformation -
DatabaseName
- The name of the Amazon Redshift database.
ClusterIdentifier
- The unique ID for the Amazon Redshift
cluster.
DatabaseCredentials - The AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.
SelectSqlQuery - The query that is used to retrieve the observation
data for the Datasource
.
S3StagingLocation - The Amazon Simple Storage Service (Amazon S3)
location for staging Amazon Redshift data. The data retrieved from
Amazon Redshift using the SelectSqlQuery
query is stored in this
location.
DataSchemaUri - The Amazon S3 location of the DataSchema
.
DataSchema - A JSON string representing the schema. This is not
required if DataSchemaUri
is specified.
DataRearrangement - A JSON string that represents the splitting and
rearrangement requirements for the DataSource
.
Sample -
"{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
[required] A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:
A security group to allow Amazon ML to execute the SelectSqlQuery
query on an Amazon Redshift cluster
An Amazon S3 bucket policy to grant Amazon ML read/write permissions
on the S3StagingLocation
The compute statistics for a DataSource
. The statistics are generated
from the observation data referenced by a DataSource
. Amazon ML uses
the statistics internally during MLModel
training. This parameter must
be set to true
if the DataSource
needs to be used for MLModel
training.