fastR2 (version 1.2.4)

AirlineArrival: Airline On-Time Arrival Data

Description

Flights categorized by destination city, airline, and whether or not the flight was on time.

Format

A data frame with 11000 observations on the following 3 variables.

airport

a factor with levels LosAngeles, Phoenix, SanDiego, SanFrancisco, Seattle

result

a factor with levels Delayed, OnTime

airline

a factor with levels Alaska, AmericaWest

References

These and similar data appear in many textbooks under the topic of Simpson's paradox.

Examples

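library(dplyr)      # group_by(), summarise(), mutate(), filter(), %>%
library(ggformula)  # gf_line(), gf_point(), gf_hline(), gf_labs()
library(mosaic)     # formula-based tally(); attached after dplyr so it masks dplyr::tally()
library(fastR2)     # provides AirlineArrival

# Percent of flights by airline within each result category
tally(
  airline ~ result, data = AirlineArrival,
  format = "percent", margins = TRUE)

# Percent delayed / on time for each airline at each airport
tally(
  result ~ airline + airport,
  data = AirlineArrival, format = "percent", margins = TRUE)

# Percent delayed for each airline at each airport
AirlineArrival2 <-
  AirlineArrival %>%
  group_by(airport, airline, result) %>%
  summarise(count = n(), .groups = "drop") %>%
  group_by(airport, airline) %>%
  mutate(total = sum(count), percent = count / total * 100) %>%
  filter(result == "Delayed")

# Percent delayed for each airline over all airports combined
AirlineArrival3 <-
  AirlineArrival %>%
  group_by(airline, result) %>%
  summarise(count = n(), .groups = "drop") %>%
  group_by(airline) %>%
  mutate(total = sum(count), percent = count / total * 100) %>%
  filter(result == "Delayed")

# Per-airport delay rates (solid lines, points sized by number of flights)
# with each airline's overall delay rate as a dashed horizontal line
gf_line(percent ~ airport, color = ~ airline, group = ~ airline,
        data = AirlineArrival2) %>%
  gf_point(percent ~ airport, color = ~ airline, size = ~ total,
           data = AirlineArrival2) %>%
  gf_hline(yintercept = ~ percent, color = ~ airline,
           data = AirlineArrival3, linetype = "dashed") %>%
  gf_labs(y = "percent delayed")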
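
The comparison behind Simpson's paradox can also be checked without the tidyverse pipeline. The following is a minimal base R sketch, assuming only that fastR2 is installed. The object names overall and by_airport are illustrative and not part of the package.

```r
library(fastR2)  # provides the AirlineArrival data frame

# Proportion of delayed and on-time flights per airline, ignoring airport.
# margin = 1 makes each airline's row sum to 1.
overall <- prop.table(
  xtabs(~ airline + result, data = AirlineArrival),
  margin = 1
)
overall[, "Delayed"]

# The same proportions computed separately for each airline/airport pair.
# margin = c(1, 2) normalises over result within each airline and airport.
by_airport <- prop.table(
  xtabs(~ airline + airport + result, data = AirlineArrival),
  margin = c(1, 2)
)
by_airport[, , "Delayed"]
```

Comparing the two printed results shows whether the airline with the lower delay rate at each airport is also the one with the lower delay rate overall. A mismatch between the two rankings is what the paradox refers to.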
