SciencesPo (version 1.3.9)

stratified: Stratified Sampling

Description

A convenience function for sampling rows of a data.frame within strata defined by one or more grouping columns.

Usage

stratified(.data, group, size, select = NULL, replace = FALSE,
  both.sets = FALSE)

Arguments

.data
The data.frame from which the sample is desired.
group
The grouping factor(s), given as column names or column numbers; may be a list.
size
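The sample size. In the examples, a value below 1 (such as 0.1) requests that proportion of each group, a single whole number requests that many rows per group, and a vector supplies one count per group.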
select
An optional list restricting sampling to a specific group or groups, for example `list(A = c("AA", "BB"))`.
replace
Should sampling be with replacement?
both.sets
If TRUE, both `sample` and `.data` are returned.

Examples

# Generate a sample data.frame to play with

set.seed(51)
dat1 <- data.frame(
  ID = 1:100,
  A = sample(c("AA", "BB", "CC", "DD", "EE"), 100, replace = TRUE),
  B = rnorm(100),
  C = abs(round(rnorm(100), digits = 1)),
  D = sample(c("CA", "NY", "TX"), 100, replace = TRUE),
  E = sample(c("M", "F"), 100, replace = TRUE)
)

# Let's take a 10% sample from all -A- groups in dat1
stratified(dat1, "A", 0.1)

# Let's take a 10% sample from only the 'AA' and 'BB' groups of -A- in dat1
stratified(dat1, "A", 0.1, select = list(A = c("AA", "BB")))

# Let's take 5 samples from all -D- groups in dat1, with the grouping column
# specified by its column number (-D- is column 5)
stratified(dat1, group = 5, size = 5)

# Let's take a sample from all -A- groups in dat1, where we specify the
# number wanted from each group
stratified(dat1, "A", size = c(3, 5, 4, 5, 2))

# Use a two-column strata (-E- and -D-) but only interested in cases where
# -E- == 'M'
stratified(dat1, c("E", "D"), 0.15, select = list(E = "M"))
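
To make the idea behind proportional stratified sampling concrete, here is a minimal base-R sketch that does not use the package. The helper name `stratified_sketch`, its arguments, and the rule of rounding `prop * n` to the nearest whole number are assumptions made for illustration; the package's `stratified()` may be implemented differently and may round differently.

```r
# Hypothetical helper (not part of SciencesPo): draw a fixed proportion of
# rows from each level of one grouping column, using base R only.
stratified_sketch <- function(df, group, prop, replace = FALSE) {
  # Split the row indices by the values of the grouping column
  idx_by_group <- split(seq_len(nrow(df)), df[[group]])
  # Within each stratum, draw round(prop * n) row indices (assumed rule)
  picked <- lapply(idx_by_group, function(idx) {
    n <- round(prop * length(idx))
    idx[sample.int(length(idx), size = n, replace = replace)]
  })
  # Return the sampled rows in their original order
  df[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

# Toy data with exactly 20 rows in each of five groups
set.seed(51)
toy <- data.frame(ID = 1:100,
                  A = rep(c("AA", "BB", "CC", "DD", "EE"), each = 20))
out <- stratified_sketch(toy, "A", 0.1)
table(out$A)  # two rows from each of the five groups
```

Indexing with `sample.int()` rather than calling `sample()` on the index vector avoids the base-R pitfall where `sample(x)` treats a single number `x` as the range `1:x`.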
