corpora (version 0.6)

simulated.census: Simulated census data for examples and illustrations (corpora)

Description

This function generates a large simulated census data frame with body measurements (height, weight, shoe size) for male and female inhabitants of a highly fictitious country.

The generated data set is usually named FakeCensus (see code examples below) and is used for various exercises and illustrations in the SIGIL course.

Usage

simulated.census(N=502202, p.male=0.55, seed.rng=42)

Value

A data frame with N rows corresponding to inhabitants and the following columns:

height:

body height in cm

weight:

body weight in kg

shoe.size:

shoe size in Paris points (Continental European scale)

sex:

sex, either m or f
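The column layout described above can be inspected directly. The following is a minimal sketch, assuming the corpora package is installed; the reduced population size `N = 10000` and the variable name `census` are arbitrary choices made here to keep the example fast, not values from the package documentation.

```r
library(corpora)

# Smaller population than the default, purely to keep the example quick
census <- simulated.census(N = 10000)

str(census)          # column names and storage types
table(census$sex)    # number of rows per sex

# Mean of each body measurement, split by sex
aggregate(cbind(height, weight, shoe.size) ~ sex, data = census, FUN = mean)
```

`aggregate()` is used here only because it is part of base R; any grouping tool would serve equally well.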

Arguments

N

population size, i.e. number of inhabitants of the fictitious country

p.male

proportion of males in the country

seed.rng

seed for the random number generator, so data sets with the same parameters (N, p.male, etc.) are reproducible
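The effect of the arguments can be checked with a short experiment. This sketch assumes the corpora package is installed; the population size of 1000, the seed values 1 and 2, and the proportion 0.2 are illustrative choices, and the variable names are hypothetical.

```r
library(corpora)

# Same parameters and same seed: the documentation says these are reproducible
first.run  <- simulated.census(N = 1000, seed.rng = 1)
second.run <- simulated.census(N = 1000, seed.rng = 1)
identical(first.run, second.run)

# A different seed should give a different data set
other.seed <- simulated.census(N = 1000, seed.rng = 2)
identical(first.run, other.seed)

# p.male changes the share of males; the observed share should be near 0.2
mostly.female <- simulated.census(N = 1000, p.male = 0.2)
mean(mostly.female$sex == "m")
```

Fixing `seed.rng` is what lets course exercises refer to specific values in FakeCensus; change it only when an independent sample is wanted.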

Author

Stephanie Evert (https://purl.org/stephanie.evert)

Details

The default population size corresponds to the estimated population of Luxembourg on 1 January 2010 (according to https://en.wikipedia.org/wiki/Luxembourg).

Further parameters of the simulation (standard deviation, correlations, non-linearity) will be exposed as function arguments in future releases.
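Until those parameters are exposed, their effect can only be estimated from the generated data. The sketch below, which assumes the corpora package is installed and uses an arbitrary `N = 10000`, computes empirical standard deviations and correlations with base R functions; it makes no claim about the values used internally by the simulation.

```r
library(corpora)

census <- simulated.census(N = 10000)
measurements <- census[, c("height", "weight", "shoe.size")]

# Empirical standard deviation of each measurement
sapply(measurements, sd)

# Pairwise Pearson correlations between the measurements
cor(measurements)

# Rank correlation is less sensitive to a non-linear (but monotonic) relationship
cor(census$height, census$weight, method = "spearman")
```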

Examples

FakeCensus <- simulated.census()
summary(FakeCensus)

# some consistency checks
stopifnot(nrow(FakeCensus) == 502202)
stopifnot(!any(is.na(FakeCensus$height) | is.na(FakeCensus$weight) | is.na(FakeCensus$shoe.size)))
stopifnot(abs(mean(FakeCensus$height[FakeCensus$sex == "m"]) - 180) < 0.1)
