
simFrame (version 0.5.4)

runSimulation: Run a simulation experiment

Description

Generic function for running a simulation experiment.

Usage

runSimulation(x, setup, nrep, control, contControl = NULL,
              NAControl = NULL, design = character(), fun, ...,
              SAE = FALSE)

runSim(...)

Arguments

x

a data.frame (for design-based simulation or simulation based on real data) or a control object for data generation inheriting from "VirtualDataControl" (for model-based simulation or mixed simulation designs).

setup

an object of class "SampleSetup", containing previously set up samples, or a control class for setting up samples inheriting from "VirtualSampleControl".

nrep

a non-negative integer giving the number of repetitions of the simulation experiment (for model-based simulation, mixed simulation designs or simulation based on real data).

control

a control object of class "SimControl".

contControl

an object of a class inheriting from "VirtualContControl", controlling contamination in the simulation experiment.

NAControl

an object of a class inheriting from "VirtualNAControl", controlling the insertion of missing values in the simulation experiment.

design

a character vector specifying variables (columns) to be used for splitting the data into domains. The simulations, including contamination and the insertion of missing values (unless SAE=TRUE), are then performed on every domain.

fun

a function to be applied in each simulation run.

...

for runSimulation, additional arguments to be passed to fun. For runSim, arguments to be passed to runSimulation.

SAE

a logical indicating whether small area estimation will be used in the simulation experiment.

Value

An object of class "SimResults".

Methods

x = "ANY", setup = "ANY", nrep = "ANY", control = "missing"

convenience wrapper that allows the slots of control to be supplied as arguments

x = "data.frame", setup = "missing", nrep = "missing", control = "SimControl"

run a simulation experiment based on real data without repetitions (probably useless, but for completeness).

x = "data.frame", setup = "missing", nrep = "numeric", control = "SimControl"

run a simulation experiment based on real data with repetitions.

x = "data.frame", setup = "SampleSetup", nrep = "missing", control = "SimControl"

run a design-based simulation experiment with previously set up samples.

x = "data.frame", setup = "VirtualSampleControl", nrep = "missing", control = "SimControl"

run a design-based simulation experiment.

x = "VirtualDataControl", setup = "missing", nrep = "missing", control = "SimControl"

run a model-based simulation experiment without repetitions (probably useless, but for completeness).

x = "VirtualDataControl", setup = "missing", nrep = "numeric", control = "SimControl"

run a model-based simulation experiment with repetitions.

x = "VirtualDataControl", setup = "VirtualSampleControl", nrep = "missing", control = "SimControl"

run a simulation experiment using a mixed simulation design without repetitions (probably useless, but for completeness).

x = "VirtualDataControl", setup = "VirtualSampleControl", nrep = "numeric", control = "SimControl"

run a simulation experiment using a mixed simulation design with repetitions.

Details

For convenience, the slots of control may be supplied as arguments.
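The two calling styles can be sketched as follows. This is a minimal sketch that assumes the simFrame package is installed and that the SimControl() constructor accepts the slot fun by name; the object names sc, ctrl, viaControl and viaArguments are chosen for illustration only, and the number of samples is kept small to shorten the run.

```r
## function for simulation runs (the argument must be called x)
sim <- function(x) c(mean = mean(x$eqIncome))

## the guard skips the sketch if simFrame is not installed
if (requireNamespace("simFrame", quietly = TRUE)) {
    library(simFrame)
    data(eusilcP)
    sc <- SampleControl(size = 500, k = 5)

    ## style 1: bundle the settings in a control object
    ctrl <- SimControl(fun = sim)
    viaControl <- runSimulation(eusilcP, sc, control = ctrl)

    ## style 2: supply the slots of control directly as arguments
    viaArguments <- runSimulation(eusilcP, sc, fun = sim)
}
```

The first style is convenient when the same settings are reused across several calls, while the second keeps one-off experiments short.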

There are some requirements for slot fun of the control object control. The function must return a numeric vector, or a list with the two components values (a numeric vector) and add (additional results of any class, e.g., statistical models). Note that the latter is computationally slightly more expensive. A data.frame is passed to fun in every simulation run. The corresponding argument must be called x. If comparisons with the original data need to be made, e.g., for evaluating the quality of imputation methods, the function should have an argument called orig. If different domains are used in the simulation, the indices of the current domain can be passed to the function via an argument called domain.
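The requirements on fun can be illustrated with a minimal sketch in base R. The function names simVector, simList and simImpute and the toy data are hypothetical; only the argument names x and orig and the list components values and add come from the description above. The functions are called by hand here, whereas in a real experiment they would be called by runSimulation.

```r
## toy data standing in for the data.frame passed to fun (hypothetical)
set.seed(1)
toy <- data.frame(value = rnorm(100))

## variant 1: return a named numeric vector
simVector <- function(x) {
    c(mean = mean(x$value), median = median(x$value))
}

## variant 2: return a list with components 'values' (numeric vector)
## and 'add' (additional results of any class, here a fitted model)
simList <- function(x) {
    fit <- lm(value ~ 1, data = x)
    list(values = c(intercept = unname(coef(fit))), add = fit)
}

## variant 3: compare with the original data via an argument called 'orig',
## here for a simple mean imputation
simImpute <- function(x, orig) {
    imputed <- x$value
    imputed[is.na(imputed)] <- mean(imputed, na.rm = TRUE)
    c(rmse = sqrt(mean((imputed - orig$value)^2)))
}

## call the functions by hand to check what they return
resVector <- simVector(toy)
resList <- simList(toy)
withNA <- toy
withNA$value[1:10] <- NA
resImpute <- simImpute(withNA, orig = toy)
```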

For small area estimation, keep the following points in mind. The design for splitting the data must be supplied and SAE must be set to TRUE. However, the data are not actually split into the specified domains. Instead, the whole data set (sample) is passed to fun, and contamination and missing values are likewise added to the whole data set (sample). Finally, the function must have a domain argument so that the current domain can be extracted from the whole data set (sample).
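A function suited to this setting might look like the sketch below, written in base R. It assumes that domain holds the row indices of the current domain, as described for the domain argument above. The names simSAE, whole and idx are hypothetical, and the loop over domains is emulated by hand rather than reproducing what the package does internally.

```r
## toy data with a grouping variable that defines the domains (hypothetical)
set.seed(1)
whole <- data.frame(group = sample(1:2, 200, replace = TRUE),
    value = rnorm(200))

## fun receives the whole data set plus the indices of the current domain
simSAE <- function(x, domain) {
    current <- x[domain, , drop = FALSE]  # extract the current domain
    c(domainMean = mean(current$value), overallMean = mean(x$value))
}

## emulate the calls for each domain by hand
idx <- split(seq_len(nrow(whole)), whole$group)
resSAE <- sapply(idx, function(i) simSAE(whole, domain = i))
```

Because the whole data set is available in every call, the function can combine domain-level and overall information, which is the point of setting SAE to TRUE.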

In every simulation run, fun is evaluated using try. Hence the results of the remaining runs are not lost if computations fail in some of the simulation runs.
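The general pattern of protecting each run with try can be sketched in base R as follows. This illustrates the idea only and is not the package's internal code; the names flaky, runs, failed and ok are hypothetical, and the use of silent = TRUE is an assumption made to keep the output quiet.

```r
## a function that fails for data containing missing values
flaky <- function(x) {
    if (any(is.na(x$value))) stop("missing values not supported")
    c(mean = mean(x$value))
}

## three simulated "runs", the second of which triggers an error
runs <- list(
    data.frame(value = c(1, 2, 3)),
    data.frame(value = c(1, NA, 3)),
    data.frame(value = c(4, 5, 6))
)

## try() turns an error into an object of class "try-error",
## so the loop continues with the next run
results <- lapply(runs, function(d) try(flaky(d), silent = TRUE))

## keep only the runs that succeeded
failed <- vapply(results, inherits, logical(1), what = "try-error")
ok <- do.call(rbind, results[!failed])
```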

runSim is a wrapper for runSimulation.

References

Alfons, A., Templ, M. and Filzmoser, P. (2010) An Object-Oriented Framework for Statistical Simulation: The R Package simFrame. Journal of Statistical Software, 37(3), 1-36. doi:10.18637/jss.v037.i03.

See Also

"SimControl", "SimResults", simBwplot, simDensityplot, simXyplot

Examples

library(simFrame)  # load the package
#### design-based simulation
set.seed(12345)  # for reproducibility
data(eusilcP)    # load data

## control objects for sampling and contamination
sc <- SampleControl(size = 500, k = 50)
cc <- DARContControl(target = "eqIncome", epsilon = 0.02,
    fun = function(x) x * 25)

## function for simulation runs
sim <- function(x) {
    c(mean = mean(x$eqIncome), trimmed = mean(x$eqIncome, trim = 0.02))
}

## run simulation and explore results
results <- runSimulation(eusilcP,
    sc, contControl = cc, fun = sim)
head(results)
aggregate(results)
tv <- mean(eusilcP$eqIncome)  # true population mean
plot(results, true = tv)



#### model-based simulation
set.seed(12345)  # for reproducibility

## function for generating data
rgnorm <- function(n, means) {
    group <- sample(1:2, n, replace=TRUE)
    data.frame(group=group, value=rnorm(n) + means[group])
}

## control objects for data generation and contamination
means <- c(0, 0.25)
dc <- DataControl(size = 500, distribution = rgnorm,
    dots = list(means = means))
cc <- DCARContControl(target = "value",
    epsilon = 0.02, dots = list(mean = 15))

## function for simulation runs
sim <- function(x) {
    c(mean = mean(x$value),
        trimmed = mean(x$value, trim = 0.02),
        median = median(x$value))
}

## run simulation and explore results
results <- runSimulation(dc, nrep = 50,
    contControl = cc, design = "group", fun = sim)
head(results)
aggregate(results)
plot(results, true = means)
