StratifiedSampling (version 0.4.1)

otmatch: Statistical Matching using Optimal transport

Description

This function computes a statistical matching between two complex survey samples, each with its own weighting scheme. It relies on the function transport from the package transport.

Usage

otmatch(
  X1,
  id1,
  X2,
  id2,
  w1,
  w2,
  dist_method = "Euclidean",
  transport_method = "shortsimplex",
  EPS = 1e-09
)

Value

A data.frame that contains the matching. The first two columns contain the unit identifiers of the two samples. The third column contains the final weights. All remaining columns are the matching variables.
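
As a rough illustration of this layout, the sketch below builds a toy data.frame by hand rather than calling otmatch. The column names id1, id2 and x are assumptions made for the illustration; only weight is a name that appears in the Examples section. If the final weights behave like a transport plan between w1 and w2, summing them per unit should give back each sample's harmonized weights.

```r
# Toy stand-in for the data.frame returned by otmatch().
# Column names id1, id2 and x are hypothetical; `weight` is the name used
# in the Examples section.
matched <- data.frame(
  id1    = c(1, 1, 2, 3),
  id2    = c(10, 11, 11, 12),
  weight = c(4, 6, 10, 10),
  x      = c(0.2, 0.2, -1.1, 0.7)   # one matching variable, for illustration
)

# Sum the final weights per unit of each sample.
w1_back <- tapply(matched$weight, matched$id1, sum)
w2_back <- tapply(matched$weight, matched$id2, sum)

w1_back
w2_back
```

Both sums add up to the same total, which is what one would expect when the two input weight vectors have been harmonized beforehand.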

Arguments

X1

A matrix, the matching variables of sample 1.

id1

A character or numeric vector that contains the labels of the units in sample 1.

X2

A matrix, the matching variables of sample 2.

id2

A character or numeric vector that contains the labels of the units in sample 2.

w1

A numeric vector that contains the weights of sample 1, harmonized by the function harmonize.

w2

A numeric vector that contains the weights of sample 2, harmonized by the function harmonize.

dist_method

A string that specifies the distance used by the function dist of the package proxy. Default "Euclidean".

transport_method

A string that specifies the method used by the function transport of the package transport. Default "shortsimplex".

EPS

A numeric scalar used as the threshold below which a value is rounded to 0.
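
To make the role of dist_method more concrete, the sketch below computes a Euclidean cross-distance matrix between the rows of two small matrices in base R. It is only an illustration of the kind of cost matrix a transport problem needs; the helper cross_dist is hypothetical and is not part of StratifiedSampling or proxy.

```r
set.seed(1)
X1 <- matrix(rnorm(6), nrow = 3)   # 3 units, 2 matching variables
X2 <- matrix(rnorm(8), nrow = 4)   # 4 units, 2 matching variables

# Hypothetical helper: Euclidean distance between every row of A and
# every row of B, using ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
cross_dist <- function(A, B) {
  sq <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  sqrt(pmax(sq, 0))   # guard against tiny negative values from rounding
}

D <- cross_dist(X1, X2)
dim(D)   # one row per unit of X1, one column per unit of X2
```

Entry D[i, j] is the cost of matching unit i of sample 1 with unit j of sample 2.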

Details

The method is described in detail in Raphaël Jauslin and Yves Tillé (2021) <arXiv:2105.08379>.
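
For intuition only, here is a minimal base-R sketch of optimal transport in the simplest setting: one matching variable, units already sorted by that variable, and equal total weights. In that case the north-west corner rule gives the monotone coupling, which is optimal for the cost |x - y|. This is not how otmatch works internally (it delegates to the transport package), and the function nw_corner and its eps argument are hypothetical names chosen for the illustration.

```r
# North-west corner rule on two weight vectors whose units are sorted by a
# single matching variable. Masses below `eps` are treated as zero.
nw_corner <- function(w1, w2, eps = 1e-9) {
  stopifnot(abs(sum(w1) - sum(w2)) < 1e-6)   # totals must agree
  i <- 1; j <- 1
  from <- integer(0); to <- integer(0); mass <- numeric(0)
  while (i <= length(w1) && j <= length(w2)) {
    m <- min(w1[i], w2[j])                   # ship as much as possible
    if (m > eps) {
      from <- c(from, i); to <- c(to, j); mass <- c(mass, m)
    }
    w1[i] <- w1[i] - m
    w2[j] <- w2[j] - m
    row_done <- w1[i] <= eps
    col_done <- w2[j] <= eps
    if (row_done) i <- i + 1                 # unit i of sample 1 is used up
    if (col_done) j <- j + 1                 # unit j of sample 2 is used up
  }
  data.frame(from = from, to = to, mass = mass)
}

x1 <- c(0, 1, 2);  w1 <- c(2, 2, 2)   # sample 1: sorted values and weights
x2 <- c(0.5, 1.5); w2 <- c(3, 3)      # sample 2: sorted values and weights

plan <- nw_corner(w1, w2)
plan

# Total transport cost under |x - y|
total_cost <- sum(plan$mass * abs(x1[plan$from] - x2[plan$to]))
```

Each row of plan says how much weight of a unit in sample 1 is paired with a unit in sample 2; summing mass by from or by to recovers w1 and w2.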

Examples

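#--- SET UP
library(StratifiedSampling)  # provides harmonize() and otmatch()

N=1000   # population size
p=5      # number of matching variables
X=array(rnorm(N*p),c(N,p))
EPS= 1e-9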

n1=100
n2=200

s1 = sampling::srswor(n1,N)
s2 = sampling::srswor(n2,N)


id1=(1:N)[s1==1]
id2=(1:N)[s2==1]

d1=rep(N/n1,n1)
d2=rep(N/n2,n2)

X1=X[s1==1,]
X2=X[s2==1,]

#--- HARMONIZATION

re=harmonize(X1,d1,id1,X2,d2,id2)
w1=re$w1
w2=re$w2

#--- STATISTICAL MATCHING WITH OT

object = otmatch(X1,id1,X2,id2,w1,w2)


#--- CHECK: the three weighted totals of the matching variables should agree
round(colSums(object$weight*object[,4:ncol(object)]),3)
round(colSums(w1*X1),3)
round(colSums(w2*X2),3)