freqItems: Finding frequent items for columns, possibly with false positives
Description
Finds frequent items for columns, possibly with false positives, using the
frequent element count algorithm described in
https://dl.acm.org/doi/10.1145/762471.762473, proposed by Karp, Schenker,
and Papadimitriou.
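The counter-based idea behind this kind of algorithm can be sketched in base R. This is only an illustration under stated assumptions, not Spark's actual implementation: the helper name freq_items_sketch is hypothetical, it works on a local vector rather than a SparkDataFrame, and it simply drops NA values. It keeps at most floor(1 / support) counters, so every item whose share of the input is at least support is returned, while some less frequent items may also appear (the false positives mentioned above).

```r
# Hypothetical helper, not part of SparkR: a single-pass, counter-based
# frequent-items sketch over a local vector.
freq_items_sketch <- function(values, support = 0.01) {
  stopifnot(support > 0, support <= 1)
  values <- values[!is.na(values)]    # assumption: NA values are ignored
  capacity <- floor(1 / support)      # maximum number of counters kept at once
  counts <- integer(0)                # named integer vector: item -> counter
  for (v in as.character(values)) {
    if (v %in% names(counts)) {
      counts[v] <- counts[v] + 1L     # item already tracked: increment it
    } else if (length(counts) < capacity) {
      counts[v] <- 1L                 # free slot: start tracking the item
    } else {
      counts <- counts - 1L           # no free slot: decrement every counter
      counts <- counts[counts > 0L]   # and drop the ones that reached zero
    }
  }
  names(counts)                       # candidates; may include false positives
}

values <- c("a", "b", "a", "c", "a", "a", "d", "a", "b", "a")
candidates <- freq_items_sketch(values, support = 0.5)
```

Because only a bounded number of counters is kept, memory use depends on support rather than on the number of distinct items, which is why the result is a candidate set rather than an exact answer.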
Usage
# S4 method for SparkDataFrame,character
freqItems(x, cols, support = 0.01)
Arguments
x
A SparkDataFrame.
cols
A vector of column names in which to search for frequent items.
support
(Optional) The minimum frequency, as a fraction of the rows, for an item to
be considered frequent. Should be greater than 1e-4. Default support = 0.01.
Value
A local R data.frame with the frequent items in each column.
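Examples
A minimal usage sketch, assuming SparkR is installed and a local Spark session can be started. The data and the column names title and gender are made up for illustration. The block is marked as not run because it needs a working Spark installation.

```r
## Not run: 
library(SparkR)
sparkR.session()

# Illustrative local data; the column names are assumptions for this example
local_df <- data.frame(
  title = c("Dr", "Mr", "Mr", "Ms", "Mr"),
  gender = c("F", "M", "M", "F", "M"),
  stringsAsFactors = FALSE
)
df <- createDataFrame(local_df)

# Look for items that make up at least 40% of each column
fi <- freqItems(df, c("title", "gender"), support = 0.4)
str(fi)  # fi is a local R data.frame, not a SparkDataFrame

sparkR.session.stop()
## End(Not run)
```

Since the result may contain false positives, items that matter should be confirmed with an exact count on the SparkDataFrame.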