translateR (version 1.0)

enron: Small subset of Enron email corpus

Description

This data set was constructed from a very small subset of the Enron email corpus (Klimt & Yang, 2004), a large set of email messages made public during the legal investigation of the Enron corporation. The full corpus contained 619,446 emails from 158 users. This data set contains only ten emails; for each one it records the body, the subject line, and the date.

Usage

data(enron)

Format

A data frame with 10 observations on the following 3 variables.
email
A character vector of the email's body.
date
The email's timestamp as a 'Date' type.
subject
A character vector containing the email's subject line.
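The structure described above can be checked directly once the data are loaded. The following is a minimal sketch, assuming the translateR package is installed locally; the `requireNamespace()` guard and the chronological ordering are illustrative additions, not part of the package's own examples.

```r
# Assumption: translateR is installed; the block is skipped otherwise.
if (requireNamespace("translateR", quietly = TRUE)) {
  # Load the example data set from the package
  data(enron, package = "translateR")

  # Show dimensions and column types; the Format section
  # describes 10 observations on 3 variables
  str(enron)

  # List subject lines in chronological order
  # (relies on 'date' being sortable, as documented)
  print(enron[order(enron$date), c("date", "subject")])
}
```

Nothing here calls a translation service, so no API credentials are needed to run it.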

Source

Klimt, Bryan, and Yiming Yang. "The Enron corpus: A new dataset for email classification research." In Machine Learning: ECML 2004, pp. 217-226. Springer Berlin Heidelberg, 2004.

Examples

## Not run: 
# # Load example data. Three columns, the text
# # content ('email') and two metadata
# # fields (date and subject)
# data(enron)
# 
# # Google, translate column in dataset
# google.dataset.out <- translate(dataset = enron,
#                                 content.field = 'email',
#                                 google.api.key = my.api.key,
#                                 source.lang = 'en',
#                                 target.lang = 'de')
# 
# # Google, translate vector
# google.vector.out <- translate(content.vec = enron$email,
#                                google.api.key = my.api.key,
#                                source.lang = 'en',
#                                target.lang = 'de')
# 
# # Microsoft, translate column in dataset
# microsoft.dataset.out <- translate(dataset = enron,
#                                    content.field = 'email',
#                                    microsoft.client.id = my.client.id,
#                                    microsoft.client.secret =
#                                              my.client.secret,
#                                    source.lang = 'en',
#                                    target.lang = 'de')
# 
# # Microsoft, translate vector
# microsoft.vector.out <- translate(content.vec = enron$email,
#                                   microsoft.client.id = my.client.id,
#                                   microsoft.client.secret =
#                                             my.client.secret,
#                                   source.lang = 'en',
#                                   target.lang = 'de')
# ## End(Not run)
