
toaster (version 0.5.5)

createWordcloud: Create Word Cloud Visualization.

Description

Wrapper around the wordcloud function that optionally saves the graphic to a file in one of the supported formats.

Usage

createWordcloud(words, freq, title = "Wordcloud", scale = c(8, 0.2), minFreq = 10, maxWords = 40, filename, format = c("png", "bmp", "jpeg", "tiff", "pdf"), width = 480, height = 480, units = "px", palette = brewer.pal(8, "Dark2"), titleFactor = 1)

Arguments

words
the words
freq
their frequencies
title
plot title
scale
a vector indicating the range of the size of the words (default c(8, 0.2))
minFreq
words with frequency below minFreq will not be displayed
maxWords
Maximum number of words to be plotted (least frequent terms dropped).
filename
name of the file in which to save the graphic
format
format of graphics device to save wordcloud image
width
the width of the output graphics device
height
the height of the output graphics device
units
the units in which height and width are given. Can be px (pixels, the default), in (inches), cm or mm.
palette
color words from least to most frequent
titleFactor
numeric title character expansion factor; multiplied by par("cex") yields the final title character size. NULL and NA are equivalent to a factor of 1.
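
The combined effect of minFreq and maxWords, as described above, can be sketched in base R. This is only an illustration of the documented behavior, not the code that wordcloud or toaster actually runs; the input vectors are made up for the example.

```r
# Hypothetical input vectors, for illustration only.
words <- c("theft", "burglary", "assault", "fraud", "vandalism")
freq  <- c(120, 45, 8, 30, 15)

minFreq  <- 10
maxWords <- 3

# Drop words whose frequency falls below minFreq.
keep       <- freq >= minFreq
words_kept <- words[keep]
freq_kept  <- freq[keep]

# Keep at most maxWords of the most frequent remaining words
# (the least frequent terms are dropped first).
top <- head(order(freq_kept, decreasing = TRUE), maxWords)

selected <- data.frame(word = words_kept[top], freq = freq_kept[top],
                       stringsAsFactors = FALSE)
print(selected)
```

Here "assault" is removed by the minFreq filter and "vandalism" by the maxWords cap.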

Value

nothing

Details

Uses base graphics and the wordcloud package to create a word cloud (tag cloud) visual representation of text data. The function takes 2 vectors of equal length: one contains the words and the other their frequencies.
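
As a minimal sketch of how the two input vectors might be prepared without a database, the following counts tokens in a small made-up character vector using base R. The sample text, the title, and the output file name are assumptions for illustration; the call to createWordcloud is guarded so it only runs in an interactive session where toaster is installed.

```r
# Hypothetical raw text; any character vector of documents would do.
docs <- c("vehicle theft reported", "theft from vehicle", "burglary of vehicle")

# Split on whitespace and count each distinct token.
tokens <- unlist(strsplit(tolower(docs), "\\s+"))
counts <- table(tokens)

# Two parallel vectors of equal length: the words and their frequencies.
words <- names(counts)
freq  <- as.vector(counts)

# Guarded call: skipped unless interactive and toaster is available.
if (interactive() && requireNamespace("toaster", quietly = TRUE)) {
  toaster::createWordcloud(words, freq, title = "Sample terms",
                           minFreq = 1, maxWords = 10,
                           filename = "sample_wordcloud.png")
}
```

Note that minFreq is lowered to 1 here, because with the default of 10 none of these low-count tokens would be displayed.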

The resulting graphic is saved to a file in one of the available graphics formats (png, bmp, jpeg, tiff, or pdf).
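
One way a wrapper could map the format argument to a base graphics device is sketched below. pickDevice is a hypothetical helper written for this illustration and is not taken from toaster's source; it only returns the device function from grDevices and does not open it.

```r
# Hypothetical helper, not toaster's code: map a format name to the
# matching graphics device function from grDevices.
pickDevice <- function(format = c("png", "bmp", "jpeg", "tiff", "pdf")) {
  format <- match.arg(format)  # validates the value; defaults to "png"
  switch(format,
         png  = grDevices::png,
         bmp  = grDevices::bmp,
         jpeg = grDevices::jpeg,
         tiff = grDevices::tiff,
         pdf  = grDevices::pdf)
}

dev_fun <- pickDevice("jpeg")
```

The bitmap devices (png, bmp, jpeg, tiff) accept a units argument, whereas grDevices::pdf always takes width and height in inches, so a wrapper that supports both kinds of device has to treat pdf separately.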

Word cloud visuals apply to any concept that satisfies the following conditions:
* each data point (artifact) can be expressed as a distinct, self-explanatory word or compact piece of text, and
* each artifact is assigned a scalar, non-negative metric.
Given these two conditions, we can use word clouds to visualize the top, bottom, or all artifacts in a single word cloud visual.
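
The two conditions can be checked before plotting. The helper below, checkWordcloudInput, is a hypothetical name introduced for this sketch and is not part of toaster.

```r
# Hypothetical helper, not part of toaster: verifies the two conditions
# on a pair of candidate input vectors.
checkWordcloudInput <- function(words, freq) {
  stopifnot(
    length(words) == length(freq),  # one metric value per artifact
    anyDuplicated(words) == 0,      # each artifact has a distinct label
    is.numeric(freq),               # the metric is numeric
    !anyNA(freq),                   # with no missing values
    all(freq >= 0)                  # and non-negative
  )
  invisible(TRUE)
}

# Passes: distinct labels, non-negative metric.
checkWordcloudInput(c("theft", "fraud"), c(3, 0))
```

Calling the helper with duplicated labels or a negative metric stops with an error, which surfaces bad input before any graphics device is opened.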

See Also

wordcloud

Examples

if(interactive()){
# packages used below: toaster (computeTfIdf, nGram, createWordcloud),
# RODBC (odbcDriverConnect) and RColorBrewer (brewer.pal)
library(toaster)
library(RODBC)
library(RColorBrewer)

# initialize connection to Dallas database in Aster 
conn = odbcDriverConnect(connection="driver={Aster ODBC Driver};
                         server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>")

stopwords = c("a", "an", "the", "with")

# 2-gram tf-idf on offense table
daypart_tfidf_2gram = computeTfIdf(conn, "public.dallaspoliceall", 
                                   docId="extract('hour' from offensestarttime)::int/6",  
                                   textColumns=c('offensedescription','offensenarrative'),
                                   parser=nGram(2, delimiter='[  \\t\\b\\f\\r:\"]+'),
                                   stopwords=stopwords)

toRace <- function(ch) {
  switch(as.character(ch),
         "M" = "Male",
         "F" = "Female",
         "0" = "Night",
         "1" = "Morning",
         "2" = "Day",
         "3" = "Evening",
         "C" = "C",
         "Unknown")
}
                                  
createDallasWordcloud <- function(tf_df, metric, slice, n, maxWords=25, size=750) {
  words=with(tf_df$rs, tf_df$rs[docid==slice,])
  
  ## palette 
  pal = rev(brewer.pal(8, "Set1"))[c(-3,-1)]
  
  ## the 'wordclouds' output directory is assumed to exist already
  createWordcloud(words$term, words[, metric], maxWords=maxWords, scale=c(4, 0.5), palette=pal, 
                  title=paste("Top", metric, "Offense", paste0(n, "-grams"), "for", toRace(slice)),
                  filename=paste0('wordclouds/',metric,'_offense_',n,'gram_',toRace(slice),'.png'), 
                  width=size, height=size)
}

createDallasWordcloud(daypart_tfidf_2gram, 'tf_idf', 0, n=2, maxWords=200, size=1300)

}
