koRpus (version 0.13-8)

clozeDelete: Transform text into cloze test format

Description

If you feed a tagged text object to this function, its text will be transformed into the format used for cloze deletion tests. By default, every fifth word is replaced by a line; use every to change that frequency. You can also set an offset value to specify where the replacements begin.

Usage

clozeDelete(obj, ...)

# S4 method for kRp.text
clozeDelete(obj, every = 5, offset = 0, replace.by = "_", fixed = 10)

Arguments

obj

An object of class kRp.text.

...

Additional arguments to the method (as described in this document).

every

Integer numeric, setting the frequency of words to be manipulated. By default, every fifth word is transformed.

offset

Either an integer numeric, setting the number of words by which to offset the transformations, or the special keyword "all". With "all", the method iterates through all possible offset values and does not return an object, but prints the results (including the list of changed words).

replace.by

Character, will be used as the replacement for the removed words.

fixed

Integer numeric, defines the length of the replacement (replace.by will be repeated this many times). If set to 0, the replacement will be as long as the replaced word.

Value

An object of class kRp.text with the added feature diff.

Details

The option offset="all" will not return one single object, but print the results after iterating through all possible offset values.
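To make the interplay of every, offset, replace.by and fixed more concrete, here is a minimal base R sketch. It is not the koRpus implementation: cloze_sketch is a hypothetical helper that works on a plain character vector of words, ignores punctuation and tagging, and assumes that with offset = 0 the words at positions every, 2 * every, and so on are replaced. The loop at the end only imitates what offset="all" is described as doing, under the further assumption that the possible offsets run from 0 to every - 1.

```r
# Hypothetical helper, NOT part of koRpus: blank out every n-th word
# of a plain character vector.
cloze_sketch <- function(words, every = 5, offset = 0,
                         replace.by = "_", fixed = 10) {
  idx <- seq_along(words)
  # Assumption: counting starts after the first `offset` words
  hit <- idx > offset & (idx - offset) %% every == 0
  removed <- words[hit]
  # fixed = 0: the gap is as long as the word it replaces
  gap_len <- if (fixed > 0) rep(fixed, sum(hit)) else nchar(removed)
  words[hit] <- strrep(replace.by, gap_len)
  list(text = paste(words, collapse = " "), removed = removed)
}

words <- c("The", "quick", "brown", "fox", "jumps",
           "over", "the", "lazy", "dog", "again")

# Gaps matching the length of the removed words
res <- cloze_sketch(words, every = 5, fixed = 0)
res$text
res$removed

# Rough analogue of offset = "all": print one result per offset
for (off in 0:4) {
  cat("offset", off, ":", cloze_sketch(words, offset = off)$text, "\n")
}
```

The real method additionally records the changes in the object's diff feature, which this sketch only approximates by returning the removed words.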

Examples

# NOT RUN {
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  tokenized.obj <- clozeDelete(tokenized.obj)
  pasteText(tokenized.obj)

  # diff stats are now part of the object
  hasFeature(tokenized.obj)
  diffText(tokenized.obj)
} else {}
# }
