arulesCBA (version 1.2.7)

prepareTransactions: Prepare Data for Associative Classification

Description

Converts a data.frame into transactions suitable for classification based on association rules.

Usage

prepareTransactions(
  formula,
  data,
  disc.method = "mdlp",
  logical2factor = TRUE,
  match = NULL
)

Value

An object of class arules::transactions with an attribute called "disc_info" that contains information on the discretization used for each column.
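
A minimal sketch of how the returned object and its "disc_info" attribute might be examined, assuming arulesCBA is installed. The exact layout of the attribute is not described here, so the sketch only prints its structure with str() rather than relying on particular fields.

```r
library("arulesCBA")

data("iris")
iris_trans <- prepareTransactions(Species ~ ., iris)

# The result is a transactions object with one transaction per row of iris
class(iris_trans)
length(iris_trans)

# Retrieve the discretization information stored as an attribute
disc_info <- attr(iris_trans, "disc_info")
str(disc_info)
```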

Arguments

formula

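a formula with the class variable on the left-hand side and the predictors on the right-hand side (e.g., Species ~ .).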

data

a data.frame with the data.

disc.method

Discretization method used to discretize continuous variables if data is a data.frame (default: "mdlp"). See discretizeDF.supervised() for more supervised discretization methods.

logical2factor

logical; if data is a data.frame, should logical columns be recoded as factor with TRUE/FALSE to generate positive and negative items?

match

typically NULL. Only used internally if data is already a set of transactions.
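
As a rough illustration of the disc.method argument, the sketch below prepares the same data with the default "mdlp" method and with "caim". That "caim" is an accepted method name is an assumption taken from the discretizeDF.supervised() documentation; check that help page for the methods available in your installed version.

```r
library("arulesCBA")

data("iris")

# Default: MDLP-based supervised discretization
trans_mdlp <- prepareTransactions(Species ~ ., iris)

# Alternative method (assumed to be supported by discretizeDF.supervised())
trans_caim <- prepareTransactions(Species ~ ., iris, disc.method = "caim")

# Compare the items (value ranges) each method produces
itemLabels(trans_mdlp)
itemLabels(trans_caim)
```

The number of transactions is the same in both cases; only the ranges that define the items for the continuous columns may differ.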

Author

Michael Hahsler

Details

To convert a data.frame into items in a transaction dataset for classification, the following steps are performed:

  1. All continuous features are discretized using class-based discretization (default is MDLP) and each range is represented as an item.

  2. Factors are converted into items, one item for each level.

  3. Each logical is converted into an item.

  4. If the class variable is a logical, then a negative class item is added.

Steps 1-3 are skipped if data is already an arules::transactions object.
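
The steps above can be sketched on a small data.frame that mixes a continuous column, a logical column, and a factor class variable. The column names petal_len and big_sepal and the threshold 5.8 are made up for illustration and are not part of the package.

```r
library("arulesCBA")

data("iris")

# Hypothetical data.frame: one continuous feature, one logical feature,
# and the factor class variable
df <- data.frame(
  petal_len = iris$Petal.Length,       # continuous: discretized into ranges
  big_sepal = iris$Sepal.Length > 5.8, # logical: recoded when logical2factor = TRUE
  Species   = iris$Species             # factor: one item per level
)

trans <- prepareTransactions(Species ~ ., df)

# itemInfo() lists each item with the variable and level it was derived from
itemInfo(trans)
inspect(head(trans, 3))
```

Calling the function again with logical2factor = FALSE and comparing the itemInfo() output is one way to see how the treatment of the logical column changes.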

See Also

arules::transactions, transactions2DF().

Other preparation: CBA_ruleset(), discretizeDF.supervised(), mineCARs(), transactions2DF()

Examples

# Load the package (arules is attached as a dependency)
library("arulesCBA")

# Perform discretization and convert to transactions
data("iris")
iris_trans <- prepareTransactions(Species ~ ., iris)

inspect(head(iris_trans))
itemInfo(iris_trans)

# A negative class item is added for regular transaction data. Here we get the
# items "canned beer=TRUE" and "canned beer=FALSE".
# Note: backticks are needed in formulas with item labels that contain
# a space or special character.
data("Groceries")
g2 <- prepareTransactions(`canned beer` ~ ., Groceries)

inspect(head(g2))
ii <- itemInfo(g2)
ii[ii[["variables"]] == "canned beer", ]