Access Abbyy Cloud OCR from R

Easily OCR images, barcodes, forms, and documents with machine-readable zones (e.g., passports) right from R. Get the results in a wide variety of formats, from plain text files to detailed XML with information such as bounding boxes.

The package provides access to the Abbyy Cloud OCR SDK API. Details about results of calls to the API can be found here.

Installation

To get the latest version on CRAN:

install.packages("abbyyR")

To get the current development version from GitHub:

# install.packages("devtools")
devtools::install_github("soodoku/abbyyR", build_vignettes = TRUE)

Using abbyyR

To get acquainted with some of the important functions, read the vignettes:

# Overview of the package
vignette("introduction", package = "abbyyR")
# Examples of some functions, shown with their output
vignette("example", package = "abbyyR")
# How to scrape text from a folder of images
vignette("wiscads", package = "abbyyR")
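
For orientation before reading the vignettes, here is a minimal sketch of what a single-file OCR call might look like. It relies on setapp and ocrFile from the function list; the argument names passed to ocrFile (file_path, output_dir, exportFormat) are assumptions to check against ?ocrFile, and the helper name ocr_one_file and the two environment variable names are hypothetical, chosen only for this example. Running it for real requires the package, network access, and valid Abbyy Cloud OCR SDK credentials.

```r
# Sketch only: nothing contacts the API until ocr_one_file() is called.
ocr_one_file <- function(path, out_dir = tempdir()) {
  # Fail early on local problems before touching the network
  if (!file.exists(path)) {
    stop("Input file not found: ", path)
  }
  if (!requireNamespace("abbyyR", quietly = TRUE)) {
    stop("Package 'abbyyR' is not installed")
  }

  # Hypothetical environment variable names used only in this example
  app_id   <- Sys.getenv("ABBYY_APP_ID")
  app_pass <- Sys.getenv("ABBYY_APP_PASSWORD")
  if (!nzchar(app_id) || !nzchar(app_pass)) {
    stop("Set ABBYY_APP_ID and ABBYY_APP_PASSWORD first")
  }

  # setapp: sets the application ID and password for later calls
  abbyyR::setapp(c(app_id, app_pass))

  # ocrFile: argument names are assumed here; see ?ocrFile
  abbyyR::ocrFile(file_path = path, output_dir = out_dir, exportFormat = "txt")
}

# Example call (not run):
# ocr_one_file("scans/page_001.png")
```

Keeping the credentials in environment variables rather than in the script avoids committing them to version control.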

The quality of the final output depends on the complexity of the layout, the image resolution, the font face, and other factors. To measure OCR quality, you can compute the edit distance to a "gold standard" hand-coded sample using recognize. To fix messy data with a quick edit-distance-based search and replace, you can use turbo search and replace.
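
To make the edit-distance idea concrete, here is a small sketch that uses only base R (utils::adist) rather than the recognize package. The helper name ocr_error_rate and the normalization by the longer string's length are choices made for this illustration, not part of abbyyR or recognize.

```r
# Normalized edit distance between OCR output and a hand-checked transcription.
# 0 means the strings are identical; values near 1 mean almost nothing matches.
ocr_error_rate <- function(ocr_text, gold_text) {
  stopifnot(length(ocr_text) == length(gold_text))

  # Levenshtein distance for each (ocr, gold) pair
  d <- mapply(
    function(a, b) drop(utils::adist(a, b)),
    ocr_text, gold_text,
    USE.NAMES = FALSE
  )

  # Scale by the longer string so results are comparable across documents
  len <- pmax(nchar(ocr_text), nchar(gold_text))
  ifelse(len == 0, 0, d / len)
}

# "He11o world" differs from "Hello world" by two substitutions
ocr_error_rate("He11o world", "Hello world")
```

A rate computed this way on a small hand-coded sample gives a rough sense of how much cleanup the rest of the output will need.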

License

Scripts are released under the MIT License.

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.

Monthly Downloads: 133
Version: 0.5.5
License: MIT + file LICENSE

Last Published: June 25th, 2019

Functions in abbyyR (0.5.5)

processPhotoId          Process Photo ID
processMRZ              Process MRZ: Extract data from Machine Readable Zone
processImage            Process Image
submitImage             Submit Image
processBusinessCard     Process Business Card
processCheckmarkField   Process Checkmark Field
ocrFile                 OCR File
getTaskStatus           Get Task Status
processDocument         Process Document
processFields           Process Fields
processTextField        Process Text Field
processBarcodeField     Process Bar Code Field
setapp                  Sets Application ID and Password
processRemoteImage      Process Remote Image
abbyy_POST              POST
listFinishedTasks       List Finished Tasks
abbyyR-package          abbyyR: R Client for the Abbyy Cloud OCR
deleteTask              Delete Task
getResults              Get Results
abbyy_check             Request Response Verification
abbyy_GET               Base POST and GET functions. Not exported.
getAppInfo              Get Application Info
listTasks               List Tasks