SciencesPo (version 1.3.9)

galton: Galton's Family Data on Human Stature.

Description

A reproduction of the data set Galton used in his 1885 paper on the relationship between parents' heights and those of their children. Galton introduced the concept of correlation only a few years later, in 1888. He suggested the use of the regression line and was the first to describe the phenomenon of regression toward the mean, which he related to his experiments on the size of seeds in successive generations of peas. This dataset contains the following columns:

  • parent: the parents' average height
  • child: the child's height

Usage

data(galton)

Arguments

encoding

UTF-8

format

A data.frame object with two variables (parent and child) and nrow(SciencesPo::galton) observations.
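The dimensions and column names can be checked directly once the package is installed. The following is a minimal sketch, assuming SciencesPo is available locally; the helper name load_galton is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of SciencesPo): returns the galton data frame,
# or NULL when the package is not installed.
load_galton <- function() {
  if (!requireNamespace("SciencesPo", quietly = TRUE)) {
    return(NULL)
  }
  env <- new.env()
  # Load the data set into a private environment so nothing in the
  # global workspace is overwritten.
  utils::data("galton", package = "SciencesPo", envir = env)
  env$galton
}

galton <- load_galton()
if (!is.null(galton)) {
  print(dim(galton))      # number of observations and variables
  print(names(galton))    # per the Description: parent, child
  print(summary(galton))  # quick look at the range of heights
}
```

If the package is missing, the helper returns NULL and nothing is printed, so the snippet is safe to run either way.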

Details

Regression analysis is the statistical method most often used in political science research. Most scholars are interested in identifying causal effects from non-experimental data, and regression is the standard method for doing so. The term regression was coined by Sir Francis Galton in the mid-1880s while he was investigating the relationship between the body size of fathers and sons. He thereby invented regression analysis by estimating $S_S = 85.7 + 0.56\,S_F$, where $S_F$ is the size of the father and $S_S$ the size of the son. Because the slope 0.56 is less than 1, differences among fathers are only partly reproduced among their sons: the size of the son regresses towards the mean.
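To make the shrinkage concrete, the sketch below plugs a few father sizes into the equation quoted in the paragraph above. The function name predict_son and the input values are illustrative assumptions; only the coefficients 85.7 and 0.56 come from the text, and the units are whatever units the equation was estimated in.

```r
# Prediction from the equation quoted in the Details text.
# predict_son is a hypothetical name used only for this illustration.
predict_son <- function(father_size) {
  85.7 + 0.56 * father_size
}

# Illustrative father sizes (assumed values, not taken from the data set).
fathers <- c(170, 180, 190)
sons <- predict_son(fathers)

print(data.frame(father = fathers, predicted_son = sons))

# Gaps between consecutive fathers versus gaps between their predicted sons.
print(diff(fathers))
print(diff(sons))
```

Under this equation, a 10-unit gap between two fathers corresponds to a predicted gap of only 5.6 units between their sons, which is the sense in which the sons' sizes regress towards the mean.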

References

Francis Galton (1886) Regression Towards Mediocrity in Hereditary Stature. The Journal of the Anthropological Institute of Great Britain and Ireland, Vol. 15, pp. 246-263.