arules --- Mining Association Rules and Frequent Itemsets with R

The arules package for R provides the infrastructure for representing, manipulating and analyzing transaction data and patterns using frequent itemsets and association rules. It also provides a wide range of interest measures and mining algorithms, including interfaces to, and the code of, Borgelt's efficient C implementations of the association mining algorithms Apriori and Eclat.

arules core packages:

  • arules: arules base package with data structures, mining algorithms (APRIORI and ECLAT), interest measures.
  • arulesViz: Visualization of association rules.
  • arulesCBA: Classification algorithms based on association rules (includes CBA).
  • arulesSequences: Mining frequent sequences (cSPADE).

Other related packages:

Additional mining algorithms

  • arulesNBMiner: Mining NB-frequent itemsets and NB-precise rules.
  • opusminer: OPUS Miner algorithm for filtered top-k association discovery.
  • RKEEL: Interface to KEEL's association rule mining algorithm.
  • RSarules: Mining algorithm which randomly samples association rules with one pre-chosen item as the consequent from a transaction dataset.

In-database analytics

  • ibmdbR: IBM in-database analytics for R can calculate association rules from a database table.
  • rfml: Mine frequent itemsets or association rules using a MarkLogic server.

Interface

  • rattle: Provides a graphical user interface for association rule mining.
  • pmml: Generates PMML (predictive model markup language) for association rules.

Classification

  • arc: Alternative CBA implementation.
  • inTrees: Interpret Tree Ensembles provides functions for: extracting, measuring and pruning rules; selecting a compact rule set; summarizing rules into a learner.
  • rCBA: Alternative CBA implementation.
  • qCBA: Quantitative Classification by Association Rules.
  • sblr: Scalable Bayesian rule lists algorithm for classification.

Outlier Detection

Recommendation/Prediction

  • recommenderlab: Supports creating predictions using association rules.

Installation

Stable CRAN version: install from within R with

install.packages("arules")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

library("devtools")
install_github("mhahsler/arules")

Usage

Load package and mine some association rules.

library("arules")
data("Adult")

rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
        0.9    0.1    1 none FALSE            TRUE     0.5      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 24421 

apriori - find association rules with the apriori algorithm
version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[115 item(s), 48842 transaction(s)] done [0.03s].
sorting and recoding items ... [9 item(s)] done [0.00s].
creating transaction tree ... done [0.03s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [52 rule(s)] done [0.00s].
creating S4 object  ... done [0.01s].
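
To make the supp and conf thresholds concrete, the following base-R sketch computes both measures by hand for a single rule. The baskets, the rel_support() helper and the rule {bread} => {milk} are hypothetical illustrations and not part of arules. The last line derives the absolute minimum support count shown in the output above from the relative threshold and the number of transactions.

```r
# Toy transactions (hypothetical; not taken from the Adult data set).
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter")
)

# Hypothetical helper: fraction of transactions containing all of `items`.
rel_support <- function(items, transactions) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

lhs <- "bread"
rhs <- "milk"

# Support of the rule: share of transactions containing lhs and rhs together.
supp_rule <- rel_support(c(lhs, rhs), baskets)

# Confidence of the rule: support of the rule divided by support of the lhs.
conf_rule <- supp_rule / rel_support(lhs, baskets)

# A rule is only reported if it meets both thresholds.
keep <- supp_rule >= 0.5 && conf_rule >= 0.9

# Relative support 0.5 on the 48842 Adult transactions as an absolute count.
min_count <- 0.5 * 48842
```

With these toy baskets the rule meets the support threshold but not the confidence threshold, so a run with supp = 0.5 and conf = 0.9 would not report it.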

Show basic statistics.

summary(rules)
set of 52 rules

rule length distribution (lhs + rhs):sizes
 1  2  3  4 
 2 13 24 13 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.000   2.000   3.000   2.923   3.250   4.000 

summary of quality measures:
    support         confidence          lift            count      
 Min.   :0.5084   Min.   :0.9031   Min.   :0.9844   Min.   :24832  
 1st Qu.:0.5415   1st Qu.:0.9155   1st Qu.:0.9937   1st Qu.:26447  
 Median :0.5974   Median :0.9229   Median :0.9997   Median :29178  
 Mean   :0.6436   Mean   :0.9308   Mean   :1.0036   Mean   :31433  
 3rd Qu.:0.7426   3rd Qu.:0.9494   3rd Qu.:1.0057   3rd Qu.:36269  
 Max.   :0.9533   Max.   :0.9583   Max.   :1.0586   Max.   :46560  

mining info:
  data ntransactions support confidence
 Adult         48842     0.5        0.9

Inspect rules with the highest lift.

inspect(head(rules, by = "lift"))
    lhs                               rhs                              support confidence     lift
[1] {sex=Male,                                                                                    
     native-country=United-States} => {race=White}                   0.5415421  0.9051090 1.058554
[2] {sex=Male,                                                                                    
     capital-loss=None,                                                                           
     native-country=United-States} => {race=White}                   0.5113632  0.9032585 1.056390
[3] {race=White}                   => {native-country=United-States} 0.7881127  0.9217231 1.027076
[4] {race=White,                                                                                  
     capital-loss=None}            => {native-country=United-States} 0.7490480  0.9205626 1.025783
[5] {race=White,                                                                                  
     sex=Male}                     => {native-country=United-States} 0.5415421  0.9204803 1.025691
[6] {race=White,                                                                                  
     capital-gain=None}            => {native-country=United-States} 0.7194628  0.9202807 1.025469
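
The quality columns are related by the standard definitions confidence = support(rule) / support(lhs) and lift = confidence / support(rhs). As a rough check, the base-R sketch below recovers the implied item supports from the printed values of rules [1] and [3]; the variable names are illustrative, and the results only agree up to the rounding of the printed output.

```r
n <- 48842  # number of transactions in Adult, from the mining output

# Rule [3]: {race=White} => {native-country=United-States}
supp3 <- 0.7881127
conf3 <- 0.9217231
lift3 <- 1.027076

# Rule [1]: {sex=Male, native-country=United-States} => {race=White}
conf1 <- 0.9051090
lift1 <- 1.058554

# Implied support of the lhs of rule [3], i.e. {race=White}.
supp_white_from_rule3 <- supp3 / conf3

# Implied support of the rhs of rule [1], which is also {race=White}.
supp_white_from_rule1 <- conf1 / lift1

# Implied support of the rhs of rule [3], {native-country=United-States}.
supp_us <- conf3 / lift3

# Absolute count of transactions matching rule [3].
count3 <- supp3 * n

print(c(supp_white_from_rule3, supp_white_from_rule1, supp_us, count3))
```

Both routes give nearly the same support for {race=White}, which is what the definitions predict. A lift close to 1, as seen for all six rules, means the lhs changes the probability of the rhs only slightly.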

Support

Please report bugs on GitHub. Questions should be posted on Stack Overflow and tagged with arules.

Version: 1.6-6
License: GPL-3
Maintainer: Michael Hahsler
Last Published: May 15th, 2020
Monthly Downloads: 29,270

Functions in arules (1.6-6)

  • coverage: Calculate coverage for rules
  • duplicated: Find Duplicated Elements
  • DATAFRAME: Data.frame Representation for arules Objects
  • apriori: Mining Associations with Apriori
  • associations-class: Class associations - A Set of Associations
  • discretize: Convert a Continuous Variable into a Categorical Variable
  • dissimilarity: Dissimilarity Computation
  • Epub: Epub Data Set
  • crossTable: Cross-tabulate joint occurrences across pairs of items
  • [-methods: Methods for "[": Extraction or Subsetting in Package 'arules'
  • SunBai: The SunBai Data Set
  • itemCoding: Item Coding --- Conversion between Item Labels and Column IDs
  • addComplement: Add Complement-items to Transactions
  • affinity: Computing Affinity Between Items
  • eclat: Mining Associations with Eclat
  • itemFrequency: Getting Frequency/Support for Single Items
  • hits: Computing Transaction Weights With HITS
  • hierarchy: Support for Item Hierarchies
  • image: Visual Inspection of Binary Incidence Matrices
  • is.closed: Find Closed Itemsets
  • interestMeasure: Calculate Additional Interest Measures
  • abbreviate: Abbreviate function for item labels in transactions, itemMatrix and associations
  • length: Getting the Number of Elements
  • inspect: Display Associations and Transactions in Readable Form
  • itemsets-class: Class itemsets --- A Set of Itemsets
  • sample: Random Samples and Permutations
  • rules-class: Class rules --- A Set of Rules
  • is.maximal: Find Maximal Itemsets
  • support: Support Counting for Itemsets
  • subset: Subsetting Itemsets, Rules and Transactions
  • is.redundant: Find Redundant Rules
  • is.superset: Find Super and Subsets
  • is.significant: Find Significant Rules
  • itemFrequencyPlot: Creating a Item Frequencies/Support Bar Plot
  • predict: Model Predictions
  • read.PMML: Read and Write PMML
  • itemMatrix-class: Class itemMatrix --- Sparse Binary Incidence Matrix to Represent Sets of Items
  • proximity-classes: Classes dist, ar_cross_dissimilarity and ar_similarity --- Proximity Matrices
  • supportingTransactions: Supporting Transactions
  • read.transactions: Read Transaction Data
  • match: Value Matching
  • ruleInduction: Rule Induction from Itemsets
  • size: Number of Items
  • setOperations: Set Operations
  • itemSetOperations: Itemwise Set Operations
  • sort: Sort Associations
  • merge: Adding Items to Data
  • tidLists-class: Class tidLists --- Transaction ID Lists for Items/Itemsets
  • random.transactions: Simulate a Random Transaction Data Set
  • transactions-class: Class transactions --- Binary Incidence Matrix for Transactions
  • unique: Remove Duplicated Elements from a Collection
  • write: Write Transactions or Associations to a File
  • weclat: Mining Associations from Weighted Transaction Data with Eclat (WARM)
  • Mushroom: Mushroom Data Set
  • LIST: List Representation for Objects Based on Class itemMatrix
  • APappearance-class: Class APappearance --- Specifying the appearance Argument of Apriori to Implement Rule Templates
  • Groceries: Groceries Data Set
  • Income: Income Data Set
  • AScontrol-classes: Classes AScontrol, APcontrol, ECcontrol --- Specifying the control Argument of apriori() and eclat()
  • Adult: Adult Data Set
  • ASparameter-classes: Classes ASparameter, APparameter, ECparameter --- Specifying the parameter Argument of apriori() and eclat()
  • combine: Combining Objects
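
As a minimal sketch of how a few of the indexed functions fit together, the example below builds a small transactions object from a list, looks at single-item frequencies with itemFrequency(), and mines frequent itemsets with eclat(). The baskets are hypothetical toy data, and the support threshold is chosen only for illustration. It assumes the arules package is installed.

```r
library("arules")

# Hypothetical toy baskets; each list element is one transaction.
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter")
)

# Coerce the list to the transactions class used by the mining functions.
trans <- as(baskets, "transactions")

# Relative frequency (support) of each single item.
freq <- itemFrequency(trans)
print(freq)

# Mine frequent itemsets with Eclat; verbose = FALSE suppresses the progress log.
itemsets <- eclat(trans,
                  parameter = list(supp = 0.5),
                  control = list(verbose = FALSE))

# Show the itemsets, most frequent first.
inspect(sort(itemsets, by = "support"))
```

The same transactions object could be passed to apriori() with target = "rules" to obtain rules instead of itemsets.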