
SemNetCleaner (version 1.1.5)

correct.changes: Correct Changes from textcleaner

Description

Corrects changes made by textcleaner. Some changes may have been made by accident, some by the automated cleaning, and others may simply need to be removed. This function corrects any such changes in a cleaned textcleaner object.

Usage

correct.changes(textcleaner.obj, dictionary = NULL, incorrect)

Arguments

textcleaner.obj

A textcleaner object

dictionary

Character vector. Dictionary to be used for more efficient text cleaning. Can be a vector from a corpus or any text for comparison. Defaults to NULL, which uses general.dictionary

incorrect

Character vector. A vector of incorrect response(s) to change. See the object spellcheck$auto in textcleaner output

Value

This function returns a list containing the following textcleaner objects, which have been corrected with the user-provided changes:

binary

A matrix of responses where each row represents a participant and each column represents a unique response. A response that a participant has provided is a '1' and a response that a participant has not provided is a '0'
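The layout of such a binary matrix can be illustrated with a minimal base R sketch. The toy data and the object names (responses, unique_resp) are made up for illustration; this is not how textcleaner builds the matrix internally, only a picture of the resulting structure.

```r
# Toy data (hypothetical): each list element holds one participant's responses
responses <- list(
  p1 = c("cat", "dog"),
  p2 = c("dog", "rat", "cow"),
  p3 = c("cat")
)

# Columns are the unique responses across all participants
unique_resp <- sort(unique(unlist(responses)))

# 1 if the participant provided the response, 0 otherwise
binary <- t(sapply(responses, function(r) as.integer(unique_resp %in% r)))
colnames(binary) <- unique_resp

binary
```

Row sums of a matrix like this give the number of unique responses each participant provided.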

responses

A list containing two objects:

  • clean.resp A response matrix that has been spell-checked and de-pluralized with duplicates removed. This can be used as a final dataset for analyses (e.g., fluency of responses)

  • orig.resp The original response matrix, with white space before and after each response removed and all upper-case letters converted to lower case
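The kind of light normalization described for orig.resp can be sketched in base R as follows. The toy matrix and the names raw_resp and orig_like are hypothetical, and the snippet only approximates the described behavior rather than reproducing the package's own code.

```r
# Toy raw responses with stray white space and mixed case (hypothetical data)
raw_resp <- matrix(
  c(" Cat", "DOG ", "  rat  ", "Cow"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("p1", "p2"), NULL)
)

# Trim leading/trailing white space and lower-case every cell;
# assigning into orig_like[] keeps the matrix shape and row names
orig_like <- raw_resp
orig_like[] <- tolower(trimws(raw_resp))

orig_like
```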

spellcheck

A list containing three objects:

  • full All responses regardless of spell-checking changes

  • auto Only the incorrect responses that were changed during spell-check

removed

A list containing two objects:

  • rows Identifies removed participants by their row (or column) location in the original data file

  • ids Identifies removed participants by their ID (see argument data in textcleaner)

partChanges

A list where each participant is a list index containing each response that has been changed. Participants are identified by their ID (see argument data in textcleaner). This can be used to replicate the cleaning process and to keep track of changes more generally. Participants with NA did not have any changes from their original data, and participants with missing data are removed (see removed$ids)

Details

This function is used to correct mistakes that occur during the textcleaner cleaning process. There are times when you are so deep into the text cleaning process that, after accidentally hitting '1' instead of '2', it does not make sense to stop and start over. Instead, when mistakes are made, a record can be kept and this function allows those mistakes to be amended.

Incorrect responses should be used as input. A menu will prompt the user for their decision on how to manage the incorrectly cleaned response. There are three potential options:

  • 1: TYPE MY OWN Allows user to type their own response. If multiple responses, then commas should separate each response. Quotations are not necessary.

  • 2: GOOGLE IT "Googles" the response in question. A browser will open with the Google search terms: define "RESPONSE"

  • 3: BAD RESPONSE When selected, NA will be returned
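The three options can be pictured with a small base R sketch. The helper resolve_response and its arguments are hypothetical and not part of SemNetCleaner, and the Google URL format is an assumption; the real function prompts through an interactive menu and opens a browser rather than returning a URL.

```r
# Hypothetical helper: NOT part of SemNetCleaner, it only mirrors the three menu options
resolve_response <- function(response, choice, typed = NULL) {
  switch(
    as.character(choice),
    "1" = {
      # TYPE MY OWN: split on commas and trim white space around each response
      trimws(strsplit(typed, ",", fixed = TRUE)[[1]])
    },
    "2" = {
      # GOOGLE IT: build the search URL (assumed format); utils::browseURL() could open it
      query <- utils::URLencode(paste0('define "', response, '"'), reserved = TRUE)
      paste0("https://www.google.com/search?q=", query)
    },
    "3" = NA_character_,  # BAD RESPONSE: NA is returned
    stop("choice must be 1, 2, or 3")
  )
}

typed_result  <- resolve_response("rat", 1, typed = "rat, mouse")
google_result <- resolve_response("rat", 2)
bad_result    <- resolve_response("rat", 3)
```

Returning the URL instead of opening a browser keeps the sketch usable in non-interactive sessions.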

Examples

library(SemNetCleaner)

# Toy example: first 10 participants, dropping the first three columns
raw <- open.animals[1:10, -(1:3)]

# Clean and preprocess data
clean <- textcleaner(raw, partBY = "row", dictionary = "animals")

# Correct changes (run only in an interactive session, since a menu prompts for input)
if (interactive()) {
  corr.clean <- correct.changes(clean, incorrect = "rat", dictionary = "animals")
}
