Emails_train

Emails_test

The datasets contain email subject lines used to classify
whether a message is spam (unsolicited commercial content) or not.
Many subject lines include subject matter inappropriate for classroom use.
Given the volume of subject lines containing such material
(especially for <code>spam == TRUE</code>), user discretion is advised.
The training dataset (<code>Emails_train</code>) is a random sample of 80% of the emails data.
The testing dataset (<code>Emails_test</code>) is a random sample of 20% of the emails data.
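
The 80%/20% split described above can be illustrated with a minimal base R sketch. The package does not state how the split was actually generated, so this is only one plausible approach: the toy `emails` data frame, its `type` labels, the seed, and the assumption that the two samples are disjoint are all hypothetical.

```r
# Toy stand-in for the full emails data; rows and `type` labels are invented.
emails <- data.frame(
  ids = 1:10,
  subjectline = paste("example subject", 1:10),
  type = rep(c("spam", "not spam"), times = 5),
  stringsAsFactors = FALSE
)

set.seed(2024)  # arbitrary seed so the split is reproducible

# Draw 80% of the row indices at random for the training set.
n_train <- floor(0.8 * nrow(emails))
train_idx <- sample(seq_len(nrow(emails)), size = n_train)

emails_train <- emails[train_idx, ]   # random 80% sample
emails_test  <- emails[-train_idx, ]  # remaining 20% (assumed disjoint)

nrow(emails_train)
nrow(emails_test)
```

Using negative indexing for the test set guarantees that no row appears in both samples, which is the usual requirement when the test set is meant to evaluate a classifier trained on the other.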

datasets

A complement to all editions of *Modern Data
Science with R*
(ISBN: 978-0367191498, publisher URL:
<https://www.routledge.com/Modern-Data-Science-with-R/Baumer-Kaplan-Horton/p/book/9780367191498>).
This package contains data and code to complete exercises and
reproduce examples from the text. It also facilitates connections
to the SQL database server used in the book. All editions of the book are
supported by this package.

Ben Baumer

mdsr

Complement to 'Modern Data Science with R'

Benjamin S. Baumer

Nicholas Horton

Daniel Kaplan


Format

Email Train — Emails_train

<code>Emails_train</code>: a data frame with 5,526 rows and 3 variables:
<dl>
<dt>ids</dt>
<dd>an integer vector</dd>
<dt>subjectline</dt>
<dd>a character vector</dd>
<dt>type</dt>
<dd>a character vector</dd>
</dl>

<code>Emails_test</code>: a data frame with 1,382 rows and 3 variables.
