fastLink (version 0.6.1)

matchesLink: matchesLink

Description

matchesLink produces two dataframes that store all the pairs sharing an agreement pattern whose Fellegi-Sunter weight falls within a given interval.
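
For intuition, the relationship between agreement patterns, Fellegi-Sunter weights, and posterior match probabilities can be sketched in base R. This is a minimal illustration under assumed values, not fastLink's implementation: the prior `p_match` and the per-field probabilities `m` and `u` are made-up numbers, and the two fields are assumed binary and conditionally independent.

```r
# Hypothetical parameters for two binary comparison fields (illustrative only).
p_match <- 0.1        # assumed prior probability that a pair is a match
m <- c(0.9, 0.8)      # assumed P(agree on field k | match)
u <- c(0.1, 0.2)      # assumed P(agree on field k | non-match)

# All agreement patterns (1 = agree, 0 = disagree); f1 varies fastest.
patterns <- as.matrix(expand.grid(f1 = 0:1, f2 = 0:1))

# Likelihood of each pattern given per-field agreement probabilities p,
# assuming the fields are conditionally independent.
lik <- function(p) apply(patterns, 1, function(g) prod(p^g * (1 - p)^(1 - g)))
lik_m <- lik(m)
lik_u <- lik(u)

# Fellegi-Sunter weight (log likelihood ratio) and posterior match probability.
weight <- log(lik_m / lik_u)
zeta <- p_match * lik_m / (p_match * lik_m + (1 - p_match) * lik_u)

cbind(patterns, weight = round(weight, 3), zeta = round(zeta, 3))
```

Because the posterior is a monotone function of the weight for a fixed prior, an interval on one corresponds to an interval on the other.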

Usage

matchesLink(gammalist, nobs.a, nobs.b, em, thresh, n.cores = NULL)

Value

matchesLink returns an nmatches x 2 matrix with the indices of the matched rows in dataset A and dataset B.
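
A result of this shape can be used to pull the linked records out of each dataset. The sketch below uses toy data and a hand-written index matrix in place of real matchesLink output; it assumes the first column indexes dataset A and the second indexes dataset B, in that order.

```r
# Toy stand-ins for dataset A and dataset B (hypothetical data).
dfA <- data.frame(firstname = c("ana", "ben", "cy"), stringsAsFactors = FALSE)
dfB <- data.frame(firstname = c("ben", "dee", "ana"), stringsAsFactors = FALSE)

# Hand-written stand-in for the result: an nmatches x 2 matrix of row indices.
# Assumed layout: column 1 indexes dataset A, column 2 indexes dataset B.
ml <- cbind(c(1, 2), c(3, 1))

# Subset each dataset by its column of indices, keeping data frame structure.
matched_a <- dfA[ml[, 1], , drop = FALSE]
matched_b <- dfB[ml[, 2], , drop = FALSE]

# Put the linked records side by side for inspection.
linked <- data.frame(a = matched_a$firstname, b = matched_b$firstname,
                     stringsAsFactors = FALSE)
linked
```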

Arguments

gammalist

A list of objects produced by either gammaKpar or gammaCKpar.

nobs.a

Number of observations in dataset A.

nobs.b

Number of observations in dataset B.

em

Parameters obtained from the Expectation-Maximization algorithm under the MAR assumption. These estimates are produced by emlinkMARmov.

thresh

Interval of posterior match probabilities (zeta) selecting the agreement patterns to examine more closely. Values range between 0 and 1. Can be a vector of length 1 (interval from the specified value to 1) or of length 2 (interval from the first specified value to the second).

n.cores

Number of cores to parallelize over. Default is NULL.
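
The two accepted forms of thresh can be illustrated with a small base R sketch. The helpers `thresh_interval` and `in_interval` are hypothetical names introduced here, the zeta values are made up, and treating both interval bounds as inclusive is an assumption rather than documented behavior.

```r
# Hypothetical helper: expand thresh into a (lower, upper) interval.
thresh_interval <- function(thresh) {
  stopifnot(length(thresh) %in% c(1, 2), all(thresh >= 0), all(thresh <= 1))
  if (length(thresh) == 1) c(thresh, 1) else c(thresh[1], thresh[2])
}

# Hypothetical helper: flag posteriors inside the interval.
# Assumption: both bounds are treated as inclusive.
in_interval <- function(zeta, thresh) {
  iv <- thresh_interval(thresh)
  zeta >= iv[1] & zeta <= iv[2]
}

# Made-up posterior match probabilities, one per agreement pattern.
zeta <- c(0.02, 0.40, 0.90, 0.97, 0.999)

keep_high <- which(in_interval(zeta, 0.95))           # length-1 form: 0.95 to 1
keep_band <- which(in_interval(zeta, c(0.85, 0.95)))  # length-2 form: 0.85 to 0.95
```

A length-2 thresh of this kind is useful for isolating borderline pairs for manual review, while the length-1 form keeps everything above a cutoff.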

Author

Ted Enamorado <ted.enamorado@gmail.com>, Ben Fifield <benfifield@gmail.com>, and Kosuke Imai

Examples

if (FALSE) {
## Calculate gammas
g1 <- gammaCKpar(dfA$firstname, dfB$firstname)
g2 <- gammaCKpar(dfA$middlename, dfB$middlename)
g3 <- gammaCKpar(dfA$lastname, dfB$lastname)
g4 <- gammaKpar(dfA$birthyear, dfB$birthyear)

## Run tableCounts
tc <- tableCounts(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB))

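## Run EM (emlinkMARmov is the function named under the `em` argument)
em <- emlinkMARmov(tc, nobs.a = nrow(dfA), nobs.b = nrow(dfB))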

## Get matches
ml <- matchesLink(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB),
                  em = em, thresh = .95)
}
