simPop (version 2.1.3)

simComponents: Simulate components of continuous variables of population data

Description

Simulate components of continuous variables of population data by resampling fractions from survey data. The continuous variable to be split and any categorical conditioning variables need to be simulated beforehand.
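The idea of resampling fractions can be illustrated with a minimal base R sketch. It does not use simPop and is not the package's implementation: the toy data, the column names (total, compA, compB) and the unweighted, unconditional draw are all assumptions made for illustration.

```r
# Toy survey (hypothetical data): a total and two components that sum to it.
set.seed(42)
survey <- data.frame(
  total = c(100, 200, 300, 400),
  compA = c(60, 50, 300, 100),
  compB = c(40, 150, 0, 300)
)

# Fraction of the total that each component takes in every survey record.
fractions <- as.matrix(survey[, c("compA", "compB")]) / survey$total

# Toy population in which the total has already been simulated.
population <- data.frame(total = c(150, 250, 1000))

# Draw one survey record (donor) per population unit and apply its fractions.
# Assumption: simple unweighted draw, no conditioning variables.
donor <- sample.int(nrow(survey), nrow(population), replace = TRUE)
simulated <- fractions[donor, , drop = FALSE] * population$total
population <- cbind(population, simulated)

print(population)
```

Because each donor's fractions sum to one, the simulated components of every population unit add up to its total.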

Usage

simComponents(
  simPopObj,
  total = "netIncome",
  components = c("py010n", "py050n", "py090n", "py100n", "py110n", "py120n", "py130n",
    "py140n"),
  conditional = c(getCatName(total), "pl030"),
  replaceEmpty = c("sequential", "min"),
  seed
)

Value

An object of class simPopObj containing survey data as well as the simulated population data including the components of the continuous variable specified by total and components.

Arguments

simPopObj

an object of class simPopObj.

total

a character string specifying the continuous variable of the population data (dataP) that should be split into components. Currently, only one variable can be split at a time.

components

a character vector specifying the components in the survey data (dataS) that should be simulated for the population data.

conditional

an optional character vector specifying categorical conditioning variables for resampling. The fractions occurring in dataS are then drawn from the respective subsets defined by these variables.
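Conditioning restricts the donors of each population unit to survey records in the same subset. The following base R sketch shows that idea with a single conditioning variable; the variable name group, the toy values and the unweighted draw are assumptions for illustration and do not reflect simPop internals.

```r
set.seed(1)

# Toy survey (hypothetical): observed fractions per record and a category.
survey <- data.frame(
  group = c("low", "low", "high", "high"),
  fracA = c(0.9, 0.8, 0.3, 0.2)
)
survey$fracB <- 1 - survey$fracA

# Toy population with the category and the total already simulated.
population <- data.frame(
  group = c("low", "high", "high", "low", "low"),
  total = c(100, 500, 800, 120, 90)
)

# Survey row indices, grouped by the conditioning variable.
rows_by_group <- split(seq_len(nrow(survey)), survey$group)

# For every population unit, draw a donor row from the matching subset only.
donor <- vapply(population$group, function(g) {
  idx <- rows_by_group[[g]]
  idx[sample.int(length(idx), 1)]
}, integer(1), USE.NAMES = FALSE)

population$compA <- survey$fracA[donor] * population$total
population$compB <- survey$fracB[donor] * population$total
print(population)
```

With several conditioning variables, the subsets would be the cells of their cross-classification, and some cells may be empty in the sample; that case is what replaceEmpty addresses.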

replaceEmpty

a character string; if conditional specifies at least two conditioning variables, this determines how replacement cells for empty subsets in the sample are obtained. If "sequential", the conditioning variables are checked one after another, so that replacement cells have the same value in one conditioning variable and minimum Manhattan distance in the other conditioning variables. If no such cells exist, replacement cells with minimum overall Manhattan distance are selected. The minimum overall Manhattan distance is always used if this is "min" or if only one conditioning variable is given.
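The two strategies can be sketched in base R as follows. This is one reading of the description above, not the package's code: the helper names manhattan and find_replacement are hypothetical, and categories are assumed to be coded as consecutive integers so that distances between them are meaningful.

```r
# Non-empty cells found in the sample (hypothetical integer category codes).
nonempty <- data.frame(v1 = c(1L, 1L, 3L), v2 = c(1L, 2L, 3L))

# A cell that occurs in the population but has no records in the sample.
empty <- c(v1 = 3L, v2 = 1L)

# Manhattan distance from every candidate cell to the target cell.
manhattan <- function(cells, target) {
  rowSums(abs(sweep(as.matrix(cells), 2, target)))
}

# Hypothetical helper returning the replacement cell(s) for a target cell.
find_replacement <- function(cells, target, method = c("sequential", "min")) {
  method <- match.arg(method)
  d <- manhattan(cells, target)
  if (method == "sequential" && ncol(cells) > 1) {
    # Go through the variables in order; accept cells that agree with the
    # target in variable j and are closest in the remaining variables.
    for (j in seq_len(ncol(cells))) {
      same <- cells[[j]] == target[[j]]
      if (any(same)) {
        keep <- same & d == min(d[same])
        return(cells[keep, , drop = FALSE])
      }
    }
  }
  # Fallback, and the "min" strategy: minimum overall Manhattan distance.
  cells[d == min(d), , drop = FALSE]
}

seq_cells <- find_replacement(nonempty, empty, "sequential")
min_cells <- find_replacement(nonempty, empty, "min")
print(seq_cells)
print(min_cells)
```

In this toy setup the sequential strategy keeps only the cell that shares v1 with the empty cell, whereas the overall minimum admits every cell tied at the smallest distance.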

seed

optional; an integer value to be used as the seed of the random number generator, or an integer vector containing the state of the random number generator to be restored.
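The two accepted forms correspond to two standard ways of controlling R's random number generator. The base R sketch below shows both; how simComponents applies the seed internally is not documented here, so this only illustrates the distinction between a seed value and a saved generator state.

```r
# Form 1: a single integer used as the seed.
set.seed(123)

# Form 2: the full generator state, an integer vector stored in .Random.seed.
state <- .Random.seed
first <- runif(3)

# Restoring the saved state makes the next draws repeat exactly.
assign(".Random.seed", state, envir = globalenv())
second <- runif(3)

print(identical(first, second))
```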

Author

Stefan Kraft, Andreas Alfons and Bernhard Meindl

References

B. Meindl, M. Templ, A. Kowarik, O. Dupriez (2017) Simulation of Synthetic Populations for Survey Data Considering Auxiliary Information. Journal of Statistical Software, 79 (10), 1--38. doi:10.18637/jss.v079.i10

A. Alfons, M. Templ (2011) Simulation of close-to-reality population data for household surveys with application to EU-SILC. Statistical Methods & Applications, 20 (3), 383--407. doi:10.1080/02664763.2013.859237

See Also

simStructure, simCategorical, simContinuous, simEUSILC

Examples

data(eusilcS)
if (FALSE) {
## approx. 20 seconds computation time
inp <- specifyInput(data=eusilcS, hhid="db030", hhsize="hsize",
  strata="db040", weight="db090")
simPopObj <- simStructure(data=inp, method="direct",
  basicHHvars=c("age", "rb090", "hsize", "pl030", "pb220a"))
simPopObj <- simContinuous(simPopObj, additional = "netIncome",
  regModel = ~rb090+hsize+pl030+pb220a,
  method="multinom", upper=200000, equidist=FALSE, nr_cpus=1)

# categorize net income for use as conditioning variable
sIncome <- manageSimPopObj(simPopObj, var="netIncome", sample=TRUE, set=FALSE)
sWeight <- manageSimPopObj(simPopObj, var="rb050", sample=TRUE, set=FALSE)
pIncome <- manageSimPopObj(simPopObj, var="netIncome", sample=FALSE, set=FALSE)

breaks <- getBreaks(x=unlist(sIncome), w=unlist(sWeight), upper=Inf, equidist=FALSE)
simPopObj <- manageSimPopObj(simPopObj, var="netIncomeCat", sample=TRUE,
  set=TRUE, values=getCat(x=unlist(sIncome), breaks))
simPopObj <- manageSimPopObj(simPopObj, var="netIncomeCat", sample=FALSE,
  set=TRUE, values=getCat(x=unlist(pIncome), breaks))

# simulate net income components
simPopObj <- simComponents(simPopObj=simPopObj, total="netIncome",
  components=c("py010n","py050n","py090n","py100n","py110n","py120n","py130n","py140n"),
  conditional = c("netIncomeCat", "pl030"), replaceEmpty = "sequential", seed=1)

class(simPopObj)
}