textcleaner
Allows corrections to changes made by textcleaner.
Some changes may have been made by accident, some by the automated cleaning,
and others may simply need to be removed.
This function corrects any changes made in a cleaned textcleaner object.
correct.changes(textcleaner.obj, dictionary = NULL, incorrect)
textcleaner.obj
A textcleaner object.
dictionary
Character vector. Can be a vector of a corpus or any text for comparison. Dictionary to be used for more efficient text cleaning. Defaults to NULL, which will use general.dictionary.
incorrect
Character vector. A vector of incorrect response(s) to change. See the object spellcheck$auto in the textcleaner output.
This function returns a list containing the following textcleaner objects, which
have been corrected with the user-provided changes:
A matrix of responses where each row represents a participant
and each column represents a unique response. A response that a participant has provided is a '1'
and a response that a participant has not provided is a '0'.
A list containing two objects:
clean.resp
A response matrix that has been spell-checked and de-pluralized, with duplicates removed. This can be used as a final dataset for analyses (e.g., fluency of responses)
orig.resp
The original response matrix with white space before and after each response removed. All upper-case letters are also converted to lower case
spellcheck
A list containing three objects:
full
All responses regardless of spell-checking changes
auto
Only the incorrect responses that were changed during spell-check
removed
A list containing two objects:
rows
Identifies removed participants by their row (or column) location in the original data file
ids
Identifies removed participants by their ID (see argument data)
A list where each participant is a list index containing each
response that has been changed. Participants are identified by their ID (see argument data).
This can be used to replicate the cleaning process and to keep track of changes more generally.
Participants with NA did not have any changes from their original data,
and participants with missing data are removed (see removed$ids).
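To make the matrix of 1s and 0s among the returned objects concrete, the following base R sketch builds one from toy data. The participant IDs, the responses, and the object names (toy.resp, toy.binary) are made up for illustration; this is an assumption about the layout only, not how textcleaner constructs its output.

```r
# Hypothetical toy data: each list element holds one participant's cleaned responses
toy.resp <- list(
  p1 = c("cat", "dog"),
  p2 = c("dog", "rat", "cow"),
  p3 = c("cat")
)

# Unique responses become the columns of the matrix
uniq <- sort(unique(unlist(toy.resp)))

# 1 = the participant gave the response, 0 = the participant did not
toy.binary <- t(sapply(toy.resp, function(r) as.integer(uniq %in% r)))
colnames(toy.binary) <- uniq

toy.binary
```

Row sums of such a matrix give the number of unique responses per participant, which is one way a fluency count could be derived.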
This function is used to correct mistakes that occur
in the cleaning process during textcleaner.
Sometimes you are so deep into the text cleaning process
that, after accidentally hitting a '1' instead of a '2', it does
not make sense to stop and start the text cleaning process over. Rather,
when mistakes are made, a record can be kept and this function will
allow those mistakes to be amended.
Incorrect responses should be used as input. A menu will prompt the user for their decision on how to manage the incorrectly cleaned response. There are three potential options:
1: TYPE MY OWN
Allows the user to type their own response. If there are multiple responses,
separate them with commas. Quotation marks are not necessary.
2: GOOGLE IT
"Googles" the response in question.
A browser will open with the Google search terms: define "RESPONSE"
3: BAD RESPONSE
When selected, NA will be returned
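As an illustration of the input convention for option 1, here is a minimal base R sketch of how a typed entry with several comma-separated responses could be split into individual responses. The helper name split_typed_responses is hypothetical and is not part of the package; the package's internal handling may differ.

```r
# Hypothetical helper (not part of the package): split a typed entry on commas
split_typed_responses <- function(typed) {
  parts <- strsplit(typed, ",", fixed = TRUE)[[1]]
  parts <- trimws(parts)   # drop white space around each response
  parts[nzchar(parts)]     # drop empty entries, e.g. from doubled commas
}

# No quotation marks are needed around the individual responses
split_typed_responses("guinea pig, rat,  mouse")
```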
# NOT RUN {
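# Toy example (textcleaner, correct.changes, and open.animals come from SemNetCleaner)
library(SemNetCleaner)
raw <- open.animals[c(1:10), -c(1:3)]
# Clean and preprocess data
clean <- textcleaner(raw, partBY = "row", dictionary = "animals")
# Correct changes (interactive sessions only, because a menu prompts for input)
if (interactive()) {
  corr.clean <- correct.changes(clean, incorrect = "rat", dictionary = "animals")
}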
# }