Spam data set from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/spambase).
Data set collected at Hewlett-Packard Labs to classify emails as spam or non-spam.
57 variables describe each email: 54 give the frequency of certain words and characters, and 3 summarize the lengths of runs of consecutive capital letters.
The positive class is set to "spam".
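
As a rough illustration of the layout described above, the following Python sketch parses text in the format of the raw UCI file. It assumes comma-separated rows with the 57 variables first and the class in the last column, coded 1 for spam and 0 for non-spam. The function name parse_spambase and the label string "nonspam" are hypothetical and chosen only for this example.

```python
import csv
import io

# Assumption: each row holds 57 numeric variables followed by one class column.
N_FEATURES = 57


def parse_spambase(text):
    """Parse spambase-style CSV text into (features, labels).

    Hypothetical helper for illustration; not part of any library.
    """
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != N_FEATURES + 1:
            raise ValueError(
                f"expected {N_FEATURES + 1} columns, got {len(row)}"
            )
        features.append([float(value) for value in row[:N_FEATURES]])
        # Assumption: the raw file codes the class as 1 (spam) or 0 (non-spam).
        labels.append("spam" if int(row[-1]) == 1 else "nonspam")
    return features, labels


def positive_rate(labels, positive="spam"):
    """Return the share of rows that belong to the positive class."""
    return sum(label == positive for label in labels) / len(labels)
```

With a local copy of the raw data file, the text could be read with open(...).read() and passed to parse_spambase; positive_rate then reports the share of emails in the positive class "spam".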