blink (version 1.1.0)

Record Linkage for Empirically Motivated Priors

Description

An implementation of the model in Steorts (2015), which performs Bayesian entity resolution for categorical and text data using any user-defined distance function. The package also provides precision and recall functions so that results can be compared with other methods, such as logistic regression, Bayesian additive regression trees (BART), or random forests. The experiments are reproducible and are illustrated in a simple vignette. License: GPL-3 + file LICENSE.
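As a rough illustration of the precision and recall mentioned above, the following base R sketch computes both from link counts using the standard definitions. The function name `precision_recall` and its arguments are hypothetical and are not blink's API.

```r
# Hypothetical helper (not part of blink): standard precision and recall
# computed from counts of correct, incorrect, and missing links.
precision_recall <- function(n_correct, n_incorrect, n_missing) {
  c(
    # Share of estimated links that are true links.
    precision = n_correct / (n_correct + n_incorrect),
    # Share of true links that were estimated.
    recall = n_correct / (n_correct + n_missing)
  )
}

# Made-up counts, for illustration only.
pr <- precision_recall(n_correct = 45, n_incorrect = 5, n_missing = 15)
print(pr)
```

Here "correct" means estimated and true, "incorrect" means estimated but not true, and "missing" means true but not estimated.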

Install

install.packages('blink')

Monthly Downloads

281

Version

1.1.0

License

GPL-3

Last Published

October 6th, 2020

Functions in blink (1.1.0)

mpmms

Function to compute a record's most probable maximal matching set (MPMMS) based on a Gibbs sampler. Note: it returns a list containing the MPMMS ($mpmms) and its probability ($prob).
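The idea can be sketched in base R under one assumption: the Gibbs output is stored as a matrix with one row per iteration and one column per record, where each entry labels the latent individual a record is linked to. The name `mpmms_sketch` and this storage layout are illustrative assumptions, not blink's actual interface.

```r
# Hypothetical sketch (not blink's implementation).
# lambda_draws: iterations x records matrix of latent-individual labels (assumed layout).
mpmms_sketch <- function(lambda_draws, i) {
  # For each iteration, encode the set of records sharing record i's label,
  # e.g. "1-3" for the set {1, 3}.
  keys <- apply(lambda_draws, 1, function(lambda) {
    paste(which(lambda == lambda[i]), collapse = "-")
  })
  freq <- table(keys)
  best <- names(freq)[which.max(freq)]
  list(
    # Most frequently sampled matching set for record i.
    mpmms = as.integer(strsplit(best, "-", fixed = TRUE)[[1]]),
    # Fraction of iterations in which that set was sampled.
    prob = as.numeric(max(freq)) / nrow(lambda_draws)
  )
}

# Made-up draws: 4 iterations, 4 records.
lambda_draws <- rbind(
  c(1, 2, 1, 3),
  c(1, 2, 1, 2),
  c(2, 2, 1, 3),
  c(4, 2, 4, 1)
)
res <- mpmms_sketch(lambda_draws, i = 1)
print(res)
```

In this toy input, record 1 is grouped with record 3 in three of the four iterations, so the sketch reports the set {1, 3} with probability 0.75.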

links.compare

Takes a set of pairwise links and identifies correct, incorrect, and missing links (correct = estimated and true, incorrect = estimated but not true, missing = true but not estimated).
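The three categories amount to set operations on pairs. The base R sketch below assumes links are given as lists of length-2 vectors; `pair_key` and `compare_links_sketch` are hypothetical names, and the input format expected by the package function may differ.

```r
# Hypothetical helpers (not blink's API).
# Encode each pair as an order-independent key, so c(2, 1) and c(1, 2) match.
pair_key <- function(pairs) {
  vapply(pairs, function(p) paste(sort(p), collapse = "-"), character(1))
}

compare_links_sketch <- function(estimated, truth) {
  est <- unique(pair_key(estimated))
  tru <- unique(pair_key(truth))
  list(
    correct   = intersect(est, tru),  # estimated and true
    incorrect = setdiff(est, tru),    # estimated but not true
    missing   = setdiff(tru, est)     # true but not estimated
  )
}

# Made-up links, for illustration only.
estimated <- list(c(1, 2), c(3, 4), c(5, 6))
truth     <- list(c(2, 1), c(3, 4), c(7, 8))
cmp <- compare_links_sketch(estimated, truth)
print(cmp)
```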

RLdata500

RLdata500

check_IDs

Checks whether two records that are estimated to be linked have the same IDs

mms

Function to compute a record's Maximal Matching Set (MMS) based on a single linkage structure
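As a minimal sketch, assume a single linkage structure is a vector whose i-th entry labels the latent individual that record i is linked to. A record's maximal matching set is then the set of all records sharing its label. The name `mms_sketch` and this vector representation are assumptions made for illustration.

```r
# Hypothetical sketch (not blink's implementation).
# lambda: one label per record; equal labels mean "same latent individual".
mms_sketch <- function(lambda, i) {
  which(lambda == lambda[i])
}

# Made-up linkage structure for six records.
lambda <- c(5, 2, 5, 7, 2, 5)

# Maximal matching set of record 1.
mms_1 <- mms_sketch(lambda, 1)

# All distinct maximal matching sets implied by this linkage structure.
all_mms <- unique(lapply(seq_along(lambda), function(i) mms_sketch(lambda, i)))
print(all_mms)
```

Records 1, 3, and 6 share a label here, so they form one set; record 4 is linked to no other record and forms a singleton.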

rl.gibbs

Gibbs sampler for empirically motivated Bayesian record linkage

links

Function that returns the shared MPMMSs, using a condition that is easier to code than the one in the JASA paper: it makes a list of vectors of estimated links by the "P(MPMMS) > 0.5" method. Note: the default settings return only MPMMSs with multiple members.
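One way to picture the "P(MPMMS) > 0.5" rule is to count how often each group of records appears together across Gibbs iterations and keep the groups seen in more than half of them. The sketch below assumes the draws are an iterations x records matrix of latent-individual labels; `links_sketch`, `threshold`, and `include_singles` are hypothetical names and do not describe the package's arguments.

```r
# Hypothetical sketch (not blink's implementation).
links_sketch <- function(lambda_draws, threshold = 0.5, include_singles = FALSE) {
  n_iter <- nrow(lambda_draws)
  # For every iteration, encode each group of co-linked records as a key like "1-3".
  keys <- unlist(lapply(seq_len(n_iter), function(s) {
    clusters <- split(seq_len(ncol(lambda_draws)), lambda_draws[s, ])
    vapply(clusters, paste, character(1), collapse = "-")
  }), use.names = FALSE)
  # Each group occurs at most once per iteration, so counts / n_iter is its
  # estimated posterior probability.
  freq <- table(keys)
  kept <- names(freq)[as.vector(freq) / n_iter > threshold]
  sets <- lapply(strsplit(kept, "-", fixed = TRUE), as.integer)
  # Mirror the documented default: drop sets with a single member.
  if (!include_singles) sets <- Filter(function(s) length(s) > 1, sets)
  sets
}

# Made-up draws: 4 iterations, 4 records.
lambda_draws <- rbind(
  c(1, 2, 1, 3),
  c(1, 2, 1, 2),
  c(2, 2, 1, 3),
  c(4, 2, 4, 1)
)
est_links <- links_sketch(lambda_draws)
print(est_links)
```

A set with probability above 0.5 is necessarily the most probable matching set for each of its members, which is why a single threshold suffices in this sketch.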

pairwise

Function to take a list of links that may contain 3-way, 4-way, etc. links and reduce it to pairwise links only (e.g., a 3-way link 12-45-78 is changed to the 2-way links 12-45, 12-78, and 45-78).
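The reduction described above can be sketched with `combn` from base R, which enumerates all 2-element combinations of a vector. The name `pairwise_sketch` is hypothetical, and the list-of-vectors input format is an assumption.

```r
# Hypothetical sketch (not blink's implementation).
pairwise_sketch <- function(links) {
  pairs <- lapply(links, function(x) {
    # Guard: combn() treats a single positive integer n as seq_len(n).
    if (length(x) < 2) return(list())
    combn(sort(x), 2, simplify = FALSE)
  })
  # Flatten the per-link lists of pairs and drop duplicates.
  unique(do.call(c, pairs))
}

# The 3-way link from the description, plus one 2-way link.
links <- list(c(12, 45, 78), c(3, 9))
pw <- pairwise_sketch(links)
print(pw)
```

The 3-way link 12-45-78 yields the pairs 12-45, 12-78, and 45-78, while the existing 2-way link passes through unchanged.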

identity.RLdata500

identity.RLdata500