correlation (version 0.6.1)

simulate_simpson: Simpson's paradox dataset simulation

Description

Simulates a dataset exhibiting Simpson's paradox (also called the Yule-Simpson effect), a phenomenon in probability and statistics in which a trend that appears in several different groups of data disappears or reverses when the groups are combined.
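
To make the effect concrete, here is a minimal base R sketch that builds such data by hand. It is an illustration only, not the implementation used by simulate_simpson(); the names make_group and shift and the chosen offset are assumptions made for this example.

```r
# Illustrative sketch in base R; NOT the implementation of simulate_simpson().
set.seed(42)

n <- 100       # observations per group
r <- 0.5       # target within-group correlation
groups <- 3    # number of groups
shift <- 3     # hypothetical offset between group centres

make_group <- function(i) {
  x <- rnorm(n)
  # y is built so that it correlates with x at roughly r within the group
  y <- r * x + sqrt(1 - r^2) * rnorm(n)
  # Move the group centre up on x and down on y, so groups line up along
  # a downward diagonal even though each group trends upward internally
  data.frame(x = x + shift * i, y = y - shift * i, group = paste0("G_", i))
}

dat <- do.call(rbind, lapply(seq_len(groups), make_group))

# Correlation inside each group versus correlation over the pooled data
within <- sapply(split(dat, dat$group), function(d) cor(d$x, d$y))
pooled <- cor(dat$x, dat$y)

print(round(within, 2))  # positive in every group
print(round(pooled, 2))  # negative once the groups are combined
```

The within-group trend is positive, but because the group centres move in opposite directions on the two axes, the pooled correlation comes out negative.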

Usage

simulate_simpson(n = 100, r = 0.5, groups = 3, difference = 1)

Arguments

n

The number of observations to generate for each group.

r

A value or vector corresponding to the desired correlation coefficients.

groups

Number of groups.

difference

Difference between groups.

Value

A dataset. The example below uses it as a data frame with the columns V1, V2 and Group.

Examples

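library(correlation)
library(ggplot2)

# Simulate 5 groups of 100 observations, each with a within-group correlation of 0.5
data <- simulate_simpson(n = 100, groups = 5, r = 0.5)

# Per-group regression lines (coloured) versus the pooled regression line
ggplot(data, aes(x = V1, y = V2)) +
  geom_point(aes(color = Group)) +
  geom_smooth(aes(color = Group), method = "lm") +
  geom_smooth(method = "lm")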
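
The plot can be complemented by a numeric check that compares the correlation inside each group with the correlation over the pooled data. This is a sketch that assumes the returned data has the columns V1, V2 and Group, as the plotting example suggests; the variable names within and pooled are chosen here for illustration.

```r
library(correlation)

data <- simulate_simpson(n = 100, groups = 5, r = 0.5)

# Correlation between V1 and V2 computed separately for each group
within <- sapply(split(data, data$Group), function(d) cor(d$V1, d$V2))

# Correlation between V1 and V2 ignoring group membership
pooled <- cor(data$V1, data$V2)

print(within)
print(pooled)
```

If the simulation behaves as described, the per-group values should sit near the requested r, while the pooled value should differ in size or sign.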