Data frame consisting of the following variables:
Data are aggregated into half seasons: the Halfseason variable indicates whether the observation falls in the first or second half of the season in a given year. Only players with more than 10 at bats in any half season are included, and only players with more than three half seasons are represented. The transformed batting average is \(\arcsin(\sqrt{(H + 1/4)/(AB + 1/2)})\), where H is hits and AB is at bats. Only regular-season data are included. R programs to extract the data from the original sources are available on request.
Name
IdNum
Year
Halfseason
Pitcher
HA transformed batting average
AB at bats
H hits
BB walks
YOB year of birth
age age of the player
agesq age squared
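The transformation and inclusion rules described for this data set can be illustrated with a minimal sketch. It is not the authors' extraction code (which is in R and available on request). The records are made-up values, the helper names `transformed_average` and `select_records` are hypothetical, and the filter reflects one reading of the inclusion rule: keep half seasons with more than 10 at bats, then keep players with more than three such half seasons.

```python
import math
from collections import Counter


def transformed_average(hits, at_bats):
    """Return arcsin(sqrt((H + 1/4) / (AB + 1/2))) for one half season."""
    return math.asin(math.sqrt((hits + 0.25) / (at_bats + 0.5)))


def select_records(records, min_ab=10, min_half_seasons=3):
    """Assumed reading of the inclusion rule (thresholds are strict)."""
    # Keep half seasons with more than `min_ab` at bats.
    eligible = [r for r in records if r["AB"] > min_ab]
    # Count qualifying half seasons per player.
    counts = Counter(r["IdNum"] for r in eligible)
    # Keep players with more than `min_half_seasons` qualifying half seasons.
    return [r for r in eligible if counts[r["IdNum"]] > min_half_seasons]


# Hypothetical half-season records; values are invented for illustration.
records = [
    {"IdNum": 1, "Year": 2010, "Halfseason": 1, "AB": 250, "H": 70},
    {"IdNum": 1, "Year": 2010, "Halfseason": 2, "AB": 240, "H": 66},
    {"IdNum": 1, "Year": 2011, "Halfseason": 1, "AB": 260, "H": 80},
    {"IdNum": 1, "Year": 2011, "Halfseason": 2, "AB": 230, "H": 61},
    {"IdNum": 2, "Year": 2010, "Halfseason": 1, "AB": 120, "H": 30},
    {"IdNum": 2, "Year": 2010, "Halfseason": 2, "AB": 8, "H": 1},
    {"IdNum": 2, "Year": 2011, "Halfseason": 1, "AB": 95, "H": 22},
    {"IdNum": 2, "Year": 2011, "Halfseason": 2, "AB": 101, "H": 27},
]

# Attach the transformed average HA to each retained record.
selected = [
    {**r, "HA": transformed_average(r["H"], r["AB"])}
    for r in select_records(records)
]

for r in selected:
    print(r["IdNum"], r["Year"], r["Halfseason"], round(r["HA"], 4))
```

In this toy input, player 2 has only three half seasons with more than 10 at bats, so under the assumed reading of the rule that player is dropped entirely. The added constants 1/4 and 1/2 keep the argument of the square root strictly between 0 and 1, so the transform is defined even when H is 0 or H equals AB.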
Gu, Jiaying and Roger Koenker (2015). Empirical Bayesball Remixed: Empirical Bayes Methods for Longitudinal Data. Journal of Applied Econometrics, forthcoming.