Stock and Watson (2007) provide several subsets created from March Current Population Surveys (CPS) with data on the relationship of earnings and education over several years.
data("CPSSW9204")
data("CPSSW9298")
data("CPSSW04")
data("CPSSW3")
data("CPSSW8")
data("CPSSWEducation")
CPSSW9298
: A data frame containing 13,501 observations on 5 variables.
CPSSW9204
: A data frame containing 15,588 observations on 5 variables.
CPSSW04
: A data frame containing 7,986 observations on 4 variables.
CPSSW3
: A data frame containing 20,999 observations on 3 variables.
CPSSW8
: A data frame containing 61,395 observations on 5 variables.
CPSSWEducation
: A data frame containing 2,950 observations on 4 variables.
year
: factor indicating year.
earnings
: average hourly earnings (sum of annual pretax wages, salaries, tips, and bonuses, divided by the number of hours worked annually).
education
: number of years of education.
degree
: factor indicating highest educational degree ("bachelor" or "highschool").
gender
: factor indicating gender.
age
: age in years.
region
: factor indicating region of residence ("Northeast", "Midwest", "South", "West").
Each month the Bureau of Labor Statistics in the US Department of Labor conducts the Current Population Survey (CPS), which provides data on labor force characteristics of the population, including the level of employment, unemployment, and earnings. Approximately 65,000 randomly selected US households are surveyed each month. The sample is chosen by randomly selecting addresses from a database. Details can be found in the Handbook of Labor Statistics and on the Bureau of Labor Statistics website (http://www.bls.gov/).
The survey conducted each March is more detailed than in other months and asks questions about earnings during the previous year. The data sets contain data for 2004 (from the March 2005 survey), and some also cover earlier years (back to 1992).
If education is given, it is for full-time workers, defined as workers employed more than 35 hours per week for at least 48 weeks in the previous year. Data are provided for workers whose highest educational achievement is either a high school diploma or a bachelor's degree.
Earnings for years earlier than 2004 were adjusted for inflation by putting them in 2004 USD using the Consumer Price Index (CPI). From 1992 to 2004, the price of the CPI market basket rose by 34.6%. To make earnings in 1992 and 2004 comparable, 1992 earnings are inflated by the amount of overall CPI price inflation, by multiplying 1992 earnings by 1.346 to put them into 2004 dollars.
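The adjustment described above is a single multiplication, which can be sketched as follows. The earnings values here are hypothetical and only illustrate the arithmetic. According to the text, the data sets themselves are already expressed in 2004 USD, so this step should not be applied to them again.

```r
# Hypothetical nominal hourly earnings (illustrative values, not CPS data).
nominal <- data.frame(
  year = factor(c("1992", "1992", "2004", "2004")),
  earnings = c(10.00, 12.50, 15.00, 20.00)
)

# Cumulative CPI price increase from 1992 to 2004, as stated in the text.
cpi_increase <- 0.346
inflator <- 1 + cpi_increase  # 1.346

# Inflate only the 1992 observations; 2004 values are already in 2004 USD.
real <- nominal
is_1992 <- real$year == "1992"
real$earnings[is_1992] <- real$earnings[is_1992] * inflator

print(real)
```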
CPSSW9204 provides the distribution of earnings in the US in 1992 and 2004 for college-educated full-time workers aged 25-34.
CPSSW04 is a subset of CPSSW9204 and provides the distribution of earnings in the US in 2004 for college-educated full-time workers aged 25-34.
CPSSWEducation is similar (but not a true subset) and contains the distribution of earnings in the US in 2004 for college-educated full-time workers aged 29-30.
CPSSW8 contains a larger sample with workers aged 21-64, additionally providing information about the region of residence.
CPSSW9298 is similar to CPSSW9204, providing data from 1992 and 1998 (with the 1992 subsets not being exactly identical).
CPSSW3 provides trends (from 1992 to 2004) in hourly earnings in the US of working college graduates aged 25-34 (in 2004 USD).
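As a rough sketch of how the trend data might be summarized numerically, the helper below computes mean hourly earnings by year and gender. The function name `mean_earnings` is hypothetical and not part of any package. It assumes a data frame with columns `year`, `gender`, and `earnings`, and it touches the real data only if the AER package happens to be installed.

```r
# Hypothetical helper: mean hourly earnings for each year/gender cell.
# Expects columns `year`, `gender`, and `earnings`.
mean_earnings <- function(d) {
  aggregate(earnings ~ year + gender, data = d, FUN = mean)
}

# Apply it to CPSSW3 only when the AER package is available.
if (requireNamespace("AER", quietly = TRUE)) {
  data("CPSSW3", package = "AER")
  print(mean_earnings(CPSSW3))
}
```

The result is a data frame with one row per year and gender combination that occurs in the input.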
Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed. Boston: Addison Wesley.
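# NOT RUN {
## AER provides the data sets and attaches lmtest (coeftest) and sandwich
library("AER")

## trends in mean hourly earnings by year and gender
data("CPSSW3")
with(CPSSW3, interaction.plot(year, gender, earnings))

## Stock and Watson, p. 165
data("CPSSWEducation")
plot(earnings ~ education, data = CPSSWEducation)
fm <- lm(earnings ~ education, data = CPSSWEducation)
## coefficient tests using the sandwich covariance estimator
coeftest(fm, vcov = sandwich)
## add the fitted regression line to the scatterplot
abline(fm)
# }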