DAAG (version 1.25.6)

spam7: Spam E-mail Data

Description

The data consist of 4601 email items, of which 1813 were identified as spam. This is a subset of the full dataset, containing only six of the 57 explanatory variables in the complete dataset, plus the outcome variable.
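The counts quoted above can be checked directly against the data frame. The following is a minimal sketch, assuming the DAAG package is installed; it only inspects the data and does not change it.

```r
library(DAAG)   # provides the spam7 data frame

# Number of email items (rows) and variables (columns)
dim(spam7)

# Counts of non-spam (n) and spam (y) items
spam.counts <- table(spam7$yesno)
spam.counts

# The same counts expressed as proportions of all items
prop.table(spam.counts)
```

The row count and the `y` count should agree with the 4601 items and 1813 spam items given in the description.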

Usage

spam7

Format

Columns included are:

crl.tot

Total length of uninterrupted sequences of capitals

dollar

Occurrences of `$', as percent of total number of characters

bang

Occurrences of `!', as percent of total number of characters

money

Occurrences of `money', as percent of total number of words

n000

Occurrences of the string `000', as percent of total number of words

make

Occurrences of `make', as percent of total number of words

yesno

Outcome variable, a factor with levels n (not spam) and y (spam)

Examples

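library(DAAG)    # provides the spam7 data frame
library(rpart)   # recursive partitioning (classification trees)
# Fit a classification tree predicting spam status from the six predictors
spam.rpart <- rpart(formula = yesno ~ crl.tot + dollar + bang +
   money + n000 + make, data=spam7)
plot(spam.rpart)   # draw the tree skeleton
text(spam.rpart)   # label the splits and leaves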
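
Beyond plotting the tree, it may be useful to see how well it classifies the data. The following is a rough sketch, not part of the original example: it refits the same model (with `method = "class"` stated explicitly, which is an assumption about the intended fit, although rpart chooses this by default for a factor response), tabulates observed against predicted classes, and prints the complexity table. The names `pred` and `conf` are illustrative only.

```r
library(DAAG)    # provides the spam7 data frame
library(rpart)   # recursive partitioning

# Refit the classification tree so this block is self-contained
spam.rpart <- rpart(yesno ~ crl.tot + dollar + bang + money + n000 + make,
                    data = spam7, method = "class")

# Predicted class for each email, using the same data the tree was fitted to
pred <- predict(spam.rpart, newdata = spam7, type = "class")

# Confusion matrix: rows are observed classes, columns are predicted classes
conf <- table(observed = spam7$yesno, predicted = pred)
conf

# Resubstitution accuracy; optimistic, because the data were used for fitting
sum(diag(conf)) / sum(conf)

# Complexity table, including the cross-validated error (xerror) per tree size
printcp(spam.rpart)
```

Because the accuracy above is computed on the training data, the cross-validated error reported by `printcp()` is likely the more honest guide to performance on new email.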
