ml_prefixspan

ml_freq_seq_patterns

A <code>spark_connection</code>, <code>ml_pipeline</code>, or a <code>tbl_spark</code>.

The name of the sequence column in dataset (default
"sequence"). Rows with nulls in this column are ignored.

seq_col

The minimum support (the fraction of sequences in the dataset that
must contain a pattern) required for a pattern to be considered a frequent
sequential pattern.

min_support

The maximum length of a frequent sequential
pattern. Any frequent pattern exceeding this length will not be included in
the results.

max_pattern_length

The maximum number of items allowed in a
prefix-projected database before local iterative processing of the
projected database begins. This parameter should be tuned with respect to
the size of your executors.

max_local_proj_db_size

A character string used to uniquely identify the ML estimator.

Optional arguments; currently unused.

model

PrefixSpan algorithm for mining frequent sequential patterns.
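
The following is a minimal sketch of how the arguments described above might be used together, not a definitive recipe. It assumes a local Spark installation recent enough to include the DataFrame-based PrefixSpan (Spark 2.4 or later), that 'dplyr' and 'tibble' are installed, and that the fitted object exposes a `frequent_sequential_patterns` function that takes the input table. The data, the column name `seq`, and the parameter values are illustrative only.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark (2.4 or later) is installed and reachable.
sc <- spark_connect(master = "local")

# Illustrative data: each row is one sequence, and each sequence is an
# ordered list of itemsets. The column name "seq" is arbitrary.
items_df <- tibble::tibble(
  seq = list(
    list(list(1, 2), list(3)),
    list(list(1), list(3, 2), list(1, 2)),
    list(list(1, 2), list(5)),
    list(list(6))
  )
)
items_sdf <- copy_to(sc, items_df, overwrite = TRUE)

# Configure PrefixSpan. All values here are examples, not recommendations.
prefix_span_model <- ml_prefixspan(
  sc,
  seq_col = "seq",                   # column holding the sequences
  min_support = 0.5,                 # pattern must occur in at least half of the rows
  max_pattern_length = 5,            # longer patterns are not reported
  max_local_proj_db_size = 32000000  # tune to executor size
)

# Assumption: the model object provides this function for running the
# search on a tbl_spark; the result is collected into a local data frame.
frequent_patterns <- prefix_span_model$frequent_sequential_patterns(items_sdf) %>%
  collect()

spark_disconnect(sc)
```

Raising `min_support` or lowering `max_pattern_length` narrows the set of patterns returned, which is usually the first thing to adjust when the result is too large to collect.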

R interface to Apache Spark, a fast and general
engine for big data processing (see <http://spark.apache.org>). This
package supports connecting to local and remote Apache Spark clusters,
provides a 'dplyr' compatible back-end, and provides an interface to
Spark's built-in machine learning algorithms.

Yitao Li

sparklyr

ml_prefixspan: Frequent Pattern Mining -- PrefixSpan

Description

Usage

Arguments