simFrame (version 0.5.4)

simSample: Set up multiple samples

Description

A convenience wrapper for setting up multiple samples using setup with control class "SampleControl".

Usage

simSample(x, design = character(), grouping = character(), 
          collect = FALSE, fun = srs, size = NULL, 
          prob = NULL, ..., k = 1)

Arguments

x

the data.frame to sample from.

design

a character, logical or numeric vector specifying variables (columns) to be used for stratified sampling.

grouping

a character string, single integer or logical vector specifying a grouping variable (column) to be used for sampling whole groups rather than individual observations.

collect

logical; if a grouping variable is specified and this is FALSE (the default), groups are sampled directly. If a grouping variable is specified and this is TRUE, individuals are sampled in a first step. In a second step, all individuals that belong to the same group as any of the sampled individuals are collected and added to the sample. If no grouping variable is specified, this is ignored.

fun

a function to be used for sampling (defaults to srs). It should return a vector containing the indices of the sampled items (observations or groups).

size

an optional non-negative integer giving the number of items (observations or groups) to sample. For stratified sampling, a vector of non-negative integers, each giving the number of items to sample from the corresponding stratum.

prob

an optional numeric vector giving the probability weights, or a character string or logical vector specifying a variable (column) that contains the probability weights.

...

additional arguments to be passed to fun.

k

a single positive integer giving the number of samples to be set up.
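
The two modes described for collect can be illustrated with a small base R sketch. It does not call simFrame and is not the package's implementation; the toy data frame pop and all variable names are hypothetical, chosen only to show the difference between sampling groups directly and collecting groups after sampling individuals.

```r
# Toy population (hypothetical): 6 individuals in 3 households.
pop <- data.frame(id = 1:6, hid = c(1, 1, 2, 2, 3, 3))

set.seed(1)

# Analogue of collect = FALSE: sample two households directly,
# then take every individual that belongs to them.
groups <- unique(pop$hid)
sampledGroups <- groups[sample.int(length(groups), 2)]
direct <- which(pop$hid %in% sampledGroups)

# Analogue of collect = TRUE: sample two individuals first,
# then collect everyone who shares a household with them.
first <- sample.int(nrow(pop), 2)
collected <- which(pop$hid %in% pop$hid[first])
```

In the first mode the number of sampled households is fixed. In the second mode it depends on whether the sampled individuals happen to share a household.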

Value

An object of class "SampleSetup".

Details

There are some restrictions on the argument names of the function supplied to fun. If it needs population data as input, the corresponding argument should be called x and should expect a data.frame. If the sampling method only needs the population size as input, the argument should be called N. Note that fun is not expected to have both x and N as arguments, and that using N is much faster for stratified sampling or group sampling. Furthermore, if the function has arguments for sample size and probability weights, they should be called size and prob, respectively. Note that a function with prob as its only argument is perfectly valid (for probability-proportional-to-size sampling). Further arguments of fun may be passed directly via the ... argument.
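
The naming conventions above can be sketched with two user-written sampling functions. Both are hypothetical illustrations in base R and are not part of simFrame: sysSample only needs the population size, so it takes N and size, while poissonSample has prob as its only argument and assumes that prob holds inclusion probabilities between 0 and 1.

```r
# Hypothetical systematic sampling function: needs only the population
# size (argument N) and the sample size (argument size).
sysSample <- function(N, size) {
  step <- N / size                         # sampling interval
  start <- runif(1, min = 0, max = step)   # random start within the first interval
  floor(start + step * (seq_len(size) - 1)) + 1   # indices between 1 and N
}

# Hypothetical Poisson-type sampling function with prob as its only
# argument (assumed here to be inclusion probabilities in [0, 1]).
poissonSample <- function(prob) {
  which(runif(length(prob)) < prob)        # indices of the selected items
}

set.seed(1)
idx1 <- sysSample(N = 100, size = 10)
idx2 <- poissonSample(prob = c(0, 0.5, 0.9, 1))
```

Both functions return a vector of indices, as fun is expected to do. With simFrame loaded, a function written this way could presumably be supplied as fun = sysSample in a call to simSample, but that call is not verified here.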

See Also

setup, "SampleControl", "SampleSetup"

Examples

# NOT RUN {
library(simFrame)  # provides simSample, draw and the eusilcP data
data(eusilcP)

## simple random sampling
srss <- simSample(eusilcP, size = 20, k = 4)
summary(srss)
draw(eusilcP[, c("id", "eqIncome")], srss, i = 1)

## group sampling
gss <- simSample(eusilcP, grouping = "hid", size = 10, k = 4)
summary(gss)
draw(eusilcP[, c("hid", "id", "eqIncome")], gss, i = 2)

## stratified simple random sampling
ssrss <- simSample(eusilcP, design = "region", 
    size = c(2, 5, 5, 3, 4, 5, 3, 5, 2), k = 4)
summary(ssrss)
draw(eusilcP[, c("id", "region", "eqIncome")], ssrss, i = 3)

## stratified group sampling
sgss <- simSample(eusilcP, design = "region", 
    grouping = "hid", size = c(2, 5, 5, 3, 4, 5, 3, 5, 2), k = 4)
summary(sgss)
draw(eusilcP[, c("hid", "id", "region", "eqIncome")], sgss, i = 4)
# }