PrefixSpan algorithm for mining frequent sequential patterns.
ml_prefixspan(
  x,
  seq_col = "sequence",
  min_support = 0.1,
  max_pattern_length = 10,
  max_local_proj_db_size = 3.2e+07,
  uid = random_string("prefixspan_"),
  ...
)

ml_freq_seq_patterns(model)
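The call above can be sketched end to end as follows. This is a minimal sketch, not a definitive recipe: it assumes a local Spark installation (2.4 or later), a sparklyr version that can copy nested list columns with `copy_to()`, and that the returned model object exposes a `frequent_sequential_patterns()` method taking a `tbl_spark`. The column name `seq` and the toy data are illustrative only.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally and reachable with master = "local".
sc <- spark_connect(master = "local")

# Illustrative toy data: each row is a sequence, each sequence is an
# ordered list of itemsets.
items_df <- tibble::tibble(
  seq = list(
    list(list(1, 2), list(3)),
    list(list(1), list(3, 2), list(1, 2)),
    list(list(1, 2), list(5)),
    list(list(6))
  )
)
items_sdf <- copy_to(sc, items_df, overwrite = TRUE)

# Configure PrefixSpan; seq_col must match the column name used above.
prefix_span_model <- ml_prefixspan(
  sc,
  seq_col = "seq",
  min_support = 0.5,
  max_pattern_length = 5
)

# Assumption: the model object provides this method for running the
# mining step on a Spark DataFrame.
frequent_patterns <- prefix_span_model$frequent_sequential_patterns(items_sdf) %>%
  collect()

spark_disconnect(sc)
```

Collecting the result brings the patterns into a local data frame, which is only advisable when the number of frequent patterns is small.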
x: A spark_connection, ml_pipeline, or a tbl_spark.
seq_col: The name of the sequence column in the dataset (default "sequence"). Rows with nulls in this column are ignored.
min_support: The minimum support, as a fraction of the total number of sequences, required for a pattern to be considered a frequent sequential pattern (default 0.1).
max_pattern_length: The maximum length of a frequent sequential pattern (default 10). Any frequent pattern exceeding this length is not included in the results.
max_local_proj_db_size: The maximum number of items allowed in a prefix-projected database before local iterative processing of the projected database begins (default 3.2e+07). Tune this parameter with respect to the size of your executors.
uid: A character string used to uniquely identify the ML estimator.
...: Optional arguments; currently unused.
model: A Prefix Span model.
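To make the meaning of minimum support concrete, the following base R sketch computes the relative support of a few candidate patterns over a small in-memory sequence database. It does not use Spark and is not the PrefixSpan algorithm itself. The helpers `contains_pattern()` and `support()` are hypothetical names introduced here, and treating a pattern as frequent when its support is at least `min_support` is an assumption about the threshold's boundary behaviour.

```r
# Toy sequence database: each sequence is an ordered list of itemsets.
sequences <- list(
  list(c(1, 2), 3),
  list(1, c(3, 2), c(1, 2)),
  list(c(1, 2), 5),
  list(6)
)

# TRUE if `pattern` occurs in `sequence`: the pattern's itemsets must appear
# in order, each contained in a distinct itemset of the sequence.
# Greedy earliest matching is sufficient for this containment check.
contains_pattern <- function(sequence, pattern) {
  pos <- 1
  for (itemset in sequence) {
    if (pos <= length(pattern) && all(pattern[[pos]] %in% itemset)) {
      pos <- pos + 1
    }
  }
  pos > length(pattern)
}

# Relative support: fraction of sequences that contain the pattern.
support <- function(pattern) {
  mean(vapply(sequences, contains_pattern, logical(1), pattern = pattern))
}

min_support <- 0.5

# Candidate patterns: <{1}>, <{1, 2}>, <{1}, {3}>, <{5}>
candidates <- list(list(1), list(c(1, 2)), list(1, 3), list(5))
supports <- vapply(candidates, support, numeric(1))

# Assumption: a pattern counts as frequent when support >= min_support.
is_frequent <- supports >= min_support

print(data.frame(support = supports, frequent = is_frequent))
```

With four sequences and `min_support = 0.5`, a pattern must occur in at least two sequences to be kept, so the pattern containing only item 5 (one occurrence) is dropped.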