arulesNBMiner - Mining NB-Frequent Itemsets and NB-Precise Rules - R package
This R package extends the arules package with NBMiner, an implementation of the model-based algorithm for mining NB-frequent itemsets presented in "Michael Hahsler. A model-based frequency constraint for mining associations from transaction data. Data Mining and Knowledge Discovery, 13(2):137-166, September 2006." In addition, an extension for mining NB-precise rules is implemented.
Installation
Stable CRAN version: install from within R with
install.packages("arulesNBMiner")
Current development version: download the package from AppVeyor or install it from GitHub (requires the devtools package).
devtools::install_github("mhahsler/arulesNBMiner")
Usage
Estimate NBD model parameters
library(arulesNBMiner)
data("Agrawal")
param <- NBMinerParameters(Agrawal.db, pi = 0.99, theta = 0.5, maxlen = 5,
  minlen = 1, trim = 0, verb = TRUE, plot = TRUE)
using Expectation Maximization for missing zero class
iteration = 1 , zero class = 2 , k = 1.08506 , m = 278.7137
total items = 716
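To get a feel for the fitted model, the reported values can be plugged into base R's negative binomial functions. This is a minimal sketch, not part of the package: it assumes that k is the dispersion (`size`) parameter and m the mean (`mu`) of a negative binomial distribution over per-item occurrence counts, and that "total items" is the estimated number of items including those never observed. The threshold of 1000 occurrences is an arbitrary illustration.

```r
# Values copied from the EM output above.
k <- 1.08506    # assumed: NB dispersion ("size") parameter
m <- 278.7137   # assumed: NB mean occurrence count per item
n <- 716        # assumed: estimated number of items, including unseen ones

# Probability that an item occurs exactly 0, 1, ..., 5 times under this model.
counts <- 0:5
p <- dnbinom(counts, size = k, mu = m)
print(data.frame(count = counts, probability = p))

# Expected number of items that never occur (the "zero class") under the model.
expected_zero_class <- n * dnbinom(0, size = k, mu = m)
print(expected_zero_class)

# Probability that an item occurs at least 1000 times (hypothetical threshold).
p_tail <- pnbinom(999, size = k, mu = m, lower.tail = FALSE)
print(p_tail)
```

If the assumptions hold, the expected zero-class size should be of the same order as the value printed by the EM step.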
Mine NB-frequent itemsets
itemsets_NB <- NBMiner(Agrawal.db, parameter = param,
  control = list(verb = TRUE, debug = FALSE))
parameter specification:
   pi theta   n       k           a minlen maxlen rules
 0.99   0.5 716 1.08506 0.001515447      1      5 FALSE
algorithmic control:
 verbose debug
    TRUE FALSE
Depth-first NB-frequent itemset miner by Michael Hahsler
Database with 20000 transactions and 1000 unique items
3507 NB-frequent itemsets found.
inspect(head(itemsets_NB))
  items                                     precision
1 {item494,item525,item572,item765,item775} 1.0000000
2 {item398,item490,item848}                 1.0000000
3 {item292,item793,item816}                 1.0000000
4 {item229,item780}                         0.9964852
5 {item111,item149,item715}                 1.0000000
6 {item91,item171,item902}                  1.0000000
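The result is an arules itemsets object, so the usual arules helpers should apply to it. The following sketch reruns the mining step quietly and then drops single-item sets and orders the rest by the `precision` column shown above. It assumes that `size()`, `sort()` and `quality()` from arules behave on this object as they do on other itemsets; the variable names `multi` and `by_precision` are made up for illustration.

```r
library(arulesNBMiner)  # attaches arules as a dependency

data("Agrawal")

# Same settings as above, but without console output or the diagnostic plot.
param <- NBMinerParameters(Agrawal.db, pi = 0.99, theta = 0.5, maxlen = 5,
  minlen = 1, trim = 0, verb = FALSE, plot = FALSE)
itemsets_NB <- NBMiner(Agrawal.db, parameter = param,
  control = list(verb = FALSE, debug = FALSE))

# Keep only itemsets with at least two items.
multi <- itemsets_NB[size(itemsets_NB) > 1]

# Order by the precision quality measure, highest first.
by_precision <- sort(multi, by = "precision")
inspect(head(by_precision, n = 3))

# Distribution of precision values across the remaining itemsets.
summary(quality(multi)$precision)
```

Filtering on size first is only one possible choice; whether single-item sets are of interest depends on the application.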
References
- Michael Hahsler. A model-based frequency constraint for mining associations from transaction data. Data Mining and Knowledge Discovery, 13(2):137-166, September 2006.
- Michael Hahsler, Sudheer Chelluboina, Kurt Hornik, and Christian Buchta. The arules R-package ecosystem: Analyzing interesting patterns from large transaction datasets. Journal of Machine Learning Research, 12:1977-1981, 2011.