qdap (version 2.2.1)

check_text: Check Text For Potential Problems

Description

Uncleaned text may cause errors, warnings, and incorrect results in subsequent analysis. check_text checks text for potential problems and suggests possible fixes. Detected anomalies include: factors, missing ending punctuation, empty cells, double punctuation, no space after a comma, no alphabetic characters, non-ASCII characters, missing values, and potentially misspelled words.

Usage

check_text(text.var, file = NULL)

Arguments

text.var
The text variable.
file
A connection, or a character string naming the file to print to. If NULL, the report prints to the console. Note that this is assigned as an attribute and passed to print.

Value

Returns a list with the following potential text fault reports:
  • non_character- Text that is non-character.
  • missing_ending_punctuation- Text with no endmark at the end of the string.
  • empty- Text that contains an empty element (i.e., "").
  • double_punctuation- Text that contains two qdap punctuation marks in the same string.
  • non_space_after_comma- Text that contains commas with no space after them.
  • no_alpha- Text that contains string elements with no alphabetic characters.
  • non_ascii- Text that contains non-ASCII characters.
  • missing_value- Text that contains missing values (i.e., NA).
  • containing_escaped- Text that contains escaped characters (see ?Quotes).
  • containing_digits- Text that contains digits.
  • indicating_incomplete- Text that contains endmarks that are indicative of incomplete/trailing sentences (e.g., ...).
  • potentially_misspelled- Text that contains potentially misspelled words.

See Also

check_spelling_interactive

Examples

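library(qdap)

# A vector that deliberately includes problem cases: missing endmarks,
# an empty string, NA, a comma with no following space, digits, and more.
x <- c("i like", "i want. thet them .", "I am ! that|", "", NA,
    "they,were there", ".", "   ", "?", "3;", "I like goud eggs!",
    "i 4like...", "\\tgreat",  "She said \"yes\"")
check_text(x)
# Same report, printed with include.text = FALSE
print(check_text(x), include.text = FALSE)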

y <- c("A valid sentence.", "yet another!")
check_text(y)
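
To make a few of the fault categories listed under Value more concrete, here is a rough base-R sketch of how some of them could be approximated with regular expressions. It is not qdap's implementation: the function name sketch_checks, the regular expressions, and the choice of ., ?, ! and | as endmarks are assumptions made for illustration, and the results may differ from what check_text reports.

```r
# Rough base-R approximations of a few checks described under "Value".
# NOT qdap's implementation: the function name and every pattern below
# are assumptions made for illustration only.
sketch_checks <- function(text) {
  stopifnot(is.character(text))
  present <- !is.na(text)  # guard so NA elements never match a pattern
  list(
    # Elements that are NA
    missing_value         = which(!present),
    # Elements that are the empty string ""
    empty                 = which(present & text == ""),
    # A comma immediately followed by a non-whitespace character
    non_space_after_comma = which(present & grepl(",[^[:space:]]", text)),
    # Elements with no alphabetic character at all
    no_alpha              = which(present & !grepl("[[:alpha:]]", text)),
    # Elements containing at least one digit
    containing_digits     = which(present & grepl("[[:digit:]]", text)),
    # Elements that do not end in an assumed endmark (. ? ! |),
    # ignoring trailing whitespace
    missing_ending_punctuation =
      which(present & !grepl("[.?!|][[:space:]]*$", text))
  )
}

x2 <- c("i like", "", NA, "they,were there", "3;", "I like eggs!")
str(sketch_checks(x2))
```

Each element of the returned list holds the positions in the input vector that trip the corresponding approximate check, so a single string can appear under several names.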
