Learn R Programming

StatMatch (version 1.2.0)

create.fused: Creates a matched (synthetic) dataset

Description

Creates a synthetic data frame after the statistical matching of two data sources at micro level.

Usage

create.fused(data.rec, data.don, mtc.ids, 
                z.vars, dup.x=FALSE, match.vars=NULL)

Arguments

data.rec
A matrix or data frame that plays the role of recipient in the statistical matching application.
data.don
A matrix or data frame that that plays the role of donor in the statistical matching application.
mtc.ids
A matrix with two columns. Each row must contain the name or the index of the recipient record (row) in data.don and the name or the index of the corresponding donor record (row) in data.don. Note that this type of matrix is re
z.vars
A character vector with the names of the variables available only in data.don that should be donated to data.rec.
dup.x
Logical. When TRUE the values of the matching variables in data.don are also donated to data.rec. The names of the matching variables have to be specified with the argument match.vars.
match.vars
A character vector with the names of the matching variables. It has to be specified only when dup.x=TRUE.

Value

  • The data frame data.rec with the z.vars filled in and, when dup.x=TRUE, with the values of the matching variables match.vars observed on the donor records.

Details

This function allows to create the synthetic (or fused) data set after the application of a statistical matching in a micro framework. For details see D'Orazio et al. (2006).

References

D'Orazio, M., Di Zio, M. and Scanu, M. (2006). Statistical Matching: Theory and Practice. Wiley, Chichester.

See Also

NND.hotdeck RANDwNND.hotdeck rankNND.hotdeck

Examples

Run this code
lab <- c(1:15, 51:65, 101:115)
iris.rec <- iris[lab, c(1:3,5)]  # recipient data.frame
iris.don <- iris[-lab, c(1:2,4:5)] # donor data.frame

# Now iris.rec and iris.don have the variables
# "Sepal.Length", "Sepal.Width"  and "Species"
# in common.
#  "Petal.Length" is available only in iris.rec
#  "Petal.Width"  is available only in iris.don

# find the closest donors using NND hot deck;
# distances are computed on "Sepal.Length" and "Sepal.Width"

out.NND <- NND.hotdeck(data.rec=iris.rec, data.don=iris.don,
            match.vars=c("Sepal.Length", "Sepal.Width"), 
            don.class="Species")

# create synthetic data.set, without the 
# duplication of the matching variables

fused.0 <- create.fused(data.rec=iris.rec, data.don=iris.don, 
            mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width")

# create synthetic data.set, with the "duplication" 
# of the matching variables

fused.1 <- create.fused(data.rec=iris.rec, data.don=iris.don,
            mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width",
            dup.x=TRUE, match.vars=c("Sepal.Length", "Sepal.Width"))

Run the code above in your browser using DataLab