Creates a plot of the counts/proportions of given wordgroups (wordlist) in the subcorpus. The counts/proportions can be calculated on document or word level, with an 'and' or 'or' link, and can additionally be normalised by a subcorpus, which can be specified by id.
plotFreq(
  object,
  id = names(object$text),
  type = c("docs", "words"),
  wordlist,
  link = c("and", "or"),
  wnames,
  ignore.case = FALSE,
  rel = FALSE,
  mark = TRUE,
  unit = "month",
  curves = c("exact", "smooth", "both"),
  smooth = 0.05,
  both.lwd,
  both.lty,
  main,
  xlab,
  ylab,
  ylim,
  col,
  legend = "topright",
  natozero = TRUE,
  file,
  ...
)
A plot.
Invisible: A data frame with columns date and wnames, and additionally columns wnames_rel for rel = TRUE, with the counts (and proportions) of the given wordgroups.
object: textmeta object with strictly tokenized text component (character vectors), like a result of cleanTexts.
id: character vector (default: object$meta$id) of IDs that specify the subcorpus.
type: character (default: "docs"). Should counts/proportions of documents ("docs") or of words ("words") be plotted?
wordlist: list of character vectors. Every list element is an 'or' link; every character string in a vector is linked by the argument link. If wordlist is only a character vector, it will be coerced to a list of the same length as the vector (see as.list), so that the argument link has no effect. Each character vector as a list element represents one curve in the resulting plot.
link: character (default: "and"). Should the (inner) character vectors of each list element be linked by an "and" or an "or"?
wnames: character vector of the same length as wordlist; labels for every group of linked words.
ignore.case: logical (default: FALSE) option from grepl.
rel: logical (default: FALSE). Should counts (FALSE) or proportions (TRUE) be plotted?
mark: logical (default: TRUE). Should years be marked by vertical lines?
unit: character (default: "month"). To which unit should dates be floored? Other possible units are "bimonth", "quarter", "season", "halfyear", "year"; for more units see round_date.
curves: character (default: "exact"). Should the "exact" curve, the "smooth" curve, or "both" be plotted?
smooth: numeric (default: 0.05) smoothing parameter which is handed over to lowess as f.
both.lwd: graphical parameter for smoothed values if curves = "both".
both.lty: graphical parameter for smoothed values if curves = "both".
main: character graphical parameter.
xlab: character graphical parameter.
ylab: character graphical parameter.
ylim: (default if rel = TRUE: c(0, 1)) graphical parameter.
col: graphical parameter, could be a vector. If curves = "both", the function plots the exact curve first and then the smoothed curve for every wordgroup; this matters for the order of col.
legend: character (default: "topright") value(s) to specify the legend coordinates. If "none", no legend is plotted.
natozero: logical (default: TRUE). Should NAs be coerced to zeros? Only has an effect if rel = TRUE.
file: character file path if a pdf should be created.
...: additional graphical parameters.
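
The interplay of wordlist and link at document level can be illustrated with a small base R sketch. This is not the package's implementation: the toy documents and the helper names doc_matches and count_docs are hypothetical, and fixed string matching is used as a simplification of the grepl-based matching mentioned above.

```r
# Toy tokenized documents (hypothetical data, not taken from the package)
docs <- list(
  d1 = c("obama", "meets", "bush"),
  d2 = c("obama", "speaks"),
  d3 = c("senate", "votes")
)

# One list element per curve; words inside an element are combined by `link`
wordlist <- list(c("obama", "bush"), "senate")

# Hypothetical helper: does one document match a word group under `link`?
doc_matches <- function(tokens, words, link = c("and", "or")) {
  link <- match.arg(link)
  # One logical per word: does the word occur in any token of the document?
  hits <- vapply(words, function(w) any(grepl(w, tokens, fixed = TRUE)), logical(1))
  if (link == "and") all(hits) else any(hits)
}

# Hypothetical helper: number of matching documents per word group
count_docs <- function(docs, wordlist, link) {
  vapply(wordlist, function(words) {
    sum(vapply(docs, doc_matches, logical(1), words = words, link = link))
  }, integer(1))
}

counts_and <- count_docs(docs, wordlist, "and")
counts_or  <- count_docs(docs, wordlist, "or")
```

With "and", only d1 counts for the first group because it contains both words; with "or", d2 counts as well. A single-word group such as "senate" gives the same count under either link.
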
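if (FALSE) {
  # Load the example corpus and tokenize it with cleanTexts
  data(politics)
  poliClean <- cleanTexts(politics)
  # A plain character vector is coerced to a list: one curve per word
  plotFreq(poliClean, wordlist = c("obama", "bush"))
}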
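
What curves = "both" and the smooth parameter amount to can be sketched in base R on synthetic monthly counts. The data below are simulated and the plotting calls are an assumption about the general idea (an exact curve overlaid with a lowess curve using f = smooth, plus vertical year markers as with mark = TRUE), not the function's actual code.

```r
# Simulated monthly counts (synthetic data for illustration only)
set.seed(1)
dates  <- seq(as.Date("2010-01-01"), by = "month", length.out = 60)
counts <- rpois(60, lambda = 20 + 10 * sin(seq_along(dates) / 6))

# `f` plays the role of the `smooth` argument: the share of points
# that influences the smoothed value at each position
f <- 0.05
smoothed <- lowess(as.numeric(dates), counts, f = f)

# Exact curve first, then the smoothed curve on top
plot(dates, counts, type = "l", col = "grey40", xlab = "date", ylab = "counts")
lines(dates, smoothed$y, col = "red", lwd = 2)

# Vertical lines at the start of each year
abline(v = as.Date(paste0(2010:2015, "-01-01")), lty = 3)
```

Larger values of f average over more neighbouring months and give a flatter curve. With f = 0.05 and 60 points, only a few neighbours enter each local fit, so the smoothed curve stays close to the exact one.
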