abbyyR (version 0.3)

compareText: Compare Text

Description

This function calculates the edit distance between an OCR'd file and a human-transcribed (gold standard) file.

Usage

compareText(path_to_ocr = NULL, path_to_gold = NULL,
  remove_extra_space = TRUE, normalize = TRUE)

Arguments

path_to_ocr
path to file containing OCR'd text; required
path_to_gold
path to file containing human transcribed text; required
remove_extra_space
a logical flag indicating whether extra spaces should be removed from the OCR file; default is TRUE
normalize
a logical flag indicating whether the string distance should be normalized; without normalization, a longer document tends to contain more errors and therefore yields a larger distance; default is TRUE
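
To make the two preprocessing options concrete, here is a minimal sketch in base R. The helper names `squish_spaces` and `normalized_distance` are hypothetical and not part of abbyyR, and dividing by the length of the longer string is only one plausible normalization; the package may do something different.

```r
# Hypothetical helpers for illustration; not part of abbyyR.

# Collapse runs of whitespace to a single space and trim both ends.
squish_spaces <- function(x) {
  trimws(gsub("\\s+", " ", x))
}

# Levenshtein distance divided by the length of the longer string.
# This normalization is an assumption, not the package's documented method.
normalized_distance <- function(ocr, gold) {
  raw <- as.numeric(utils::adist(ocr, gold))
  longest <- max(nchar(ocr), nchar(gold))
  if (longest == 0) return(0)
  raw / longest
}

ocr  <- squish_spaces("The  quick brown  f0x ")
gold <- "The quick brown fox"

# One substitution ("0" for "o") over 19 characters.
normalized_distance(ocr, gold)
```

With this kind of scaling, a single misread character counts for less in a long document than in a short one, which is the concern the `normalize` argument describes.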

Value

  • Levenshtein distance
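
For readers unfamiliar with the measure, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below uses base R's `utils::adist` to show the idea; whether `compareText` relies on `adist` internally is an assumption, not something this page states.

```r
# utils::adist() returns a matrix of generalized Levenshtein distances.
d <- utils::adist("kitten", "sitting")

# Two substitutions (k -> s, e -> i) plus one insertion (g).
as.numeric(d)  # 3
```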

Examples

# Replace the placeholder paths with real files before running
library(abbyyR)
compareText(path_to_ocr = "path_to_ocr_file", path_to_gold = "path_to_gold_file",
  remove_extra_space = TRUE)