- x
A spark_connection, ml_pipeline, or a tbl_spark.
- formula
Used when x is a tbl_spark. R formula as a character string or a formula. This is used to transform the input dataframe before fitting; see ft_r_formula for details.
- fit_intercept
Boolean; should the model be fit with an intercept term?
- reg_param
Regularization parameter (aka lambda).
- max_iter
The maximum number of iterations to use.
- standardization
Whether to standardize the training features before fitting the model.
- weight_col
The name of the column to use as weights for the model fit.
- tol
Param for the convergence tolerance for iterative algorithms.
- threshold
Threshold in binary classification prediction, in range [0, 1].
- aggregation_depth
(Spark 2.1.0+) Suggested depth for treeAggregate (>= 2).
- features_col
Features column name, as a length-one character vector. The column should be a single vector column of numeric values. Usually this column is output by ft_r_formula.
- label_col
Label column name. The column should be a numeric column. Usually this column is output by ft_r_formula.
- prediction_col
Prediction column name.
- raw_prediction_col
Raw prediction (a.k.a. confidence) column name.
- uid
A character string used to uniquely identify the ML estimator.
- ...
Optional arguments; see Details.
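
As an illustration of how these arguments fit together, here is a minimal sketch of a call. It assumes the list documents sparklyr's ml_linear_svc (the function name is not stated in this section, so treat it as an assumption), that a local Spark installation is available, and that the built-in iris data is an acceptable stand-in for real input. The specific argument values are arbitrary examples, not recommendations.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# The classifier is binary, so keep only two of the three iris species.
# copy_to() replaces dots in column names with underscores
# (Petal.Length becomes Petal_Length).
iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE) %>%
  filter(Species != "setosa")

# x is a tbl_spark, so a formula is supplied and the features and label
# columns are built internally (see ft_r_formula).
model <- ml_linear_svc(
  iris_tbl,
  formula = Species ~ Petal_Length + Petal_Width,
  fit_intercept = TRUE,
  reg_param = 0.01,        # regularization strength (lambda)
  max_iter = 50,           # cap on optimizer iterations
  standardization = TRUE,  # standardize features before fitting
  tol = 1e-6               # convergence tolerance
)

# Score the same table; the output gains prediction columns.
predictions <- ml_predict(model, iris_tbl)

spark_disconnect(sc)
```

Passing a tbl_spark together with a formula fits the model immediately. Passing a spark_connection or an ml_pipeline as x instead yields an unfitted estimator or an extended pipeline, in which case features_col and label_col identify the input columns.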