Perform k-means clustering on a Spark DataFrame.
Usage:

ml_kmeans(x, centers, iter.max = 100, features = dplyr::tbl_vars(x),
  compute.cost = TRUE, tolerance = 1e-04, ml.options = ml_options(), ...)
Arguments:

x: An object coercible to a Spark DataFrame (typically, a tbl_spark).

centers: The number of cluster centers to compute.

iter.max: The maximum number of iterations to use.

features: The names of the features (terms) to use for the model fit.

compute.cost: Whether to compute the cost for the k-means model using Spark's computeCost.

tolerance: The convergence tolerance for iterative algorithms.

ml.options: Optional arguments used to affect the model generated. See ml_options for more details.

...: Optional arguments; currently unused.
Value:

An ml_model object of class kmeans, with overloaded print, fitted, and predict functions.
References:

Bahmani et al., Scalable K-Means++, VLDB 2012.

For information on how Spark k-means clustering is implemented, please see http://spark.apache.org/docs/latest/mllib-clustering.html#k-means.
See also:

Other Spark ML routines: ml_als_factorization, ml_decision_tree, ml_generalized_linear_regression, ml_gradient_boosted_trees, ml_lda, ml_linear_regression, ml_logistic_regression, ml_multilayer_perceptron, ml_naive_bayes, ml_one_vs_rest, ml_pca, ml_random_forest, ml_survival_regression.
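Examples:

A minimal usage sketch, assuming sparklyr and a local Spark installation are available (the connection master, the table name "iris", and the choice of three centers are illustrative, not prescribed by this help page):

```r
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance (assumes Spark is installed locally)
sc <- spark_connect(master = "local")

# Copy the iris data set into Spark; sparklyr replaces dots in column
# names with underscores (Petal.Width becomes Petal_Width)
iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE)

# Cluster on two numeric features with k = 3
model <- iris_tbl %>%
  select(Petal_Width, Petal_Length) %>%
  ml_kmeans(centers = 3)

# Overloaded generics returned by the ml_model object
print(model)               # cluster centers and model summary
predicted <- predict(model) # cluster assignment for each row

spark_disconnect(sc)
```

Since iris has three species, inspecting the cluster assignments against the Species column is a common way to sanity-check the fit.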