UsingR (version 2.0-7)

batting: Batting statistics for 2002 baseball season

Description

This dataset contains batting statistics for the 2002 baseball season. The data allows you to compute batting averages, on-base percentages, and other statistics of interest to baseball fans. Only players with more than 100 at bats for a team during the year are included. The data is excerpted with permission from the Lahman baseball database at http://www.seanlahman.com/.

Usage

data(batting)

Format

A data frame with 438 observations on the following 22 variables.

playerID

a coded player identifier. Those familiar with the players should be able to find their favorites.

yearID

a numeric vector. Always 2002 in this dataset.

stintID

a numeric vector. Player's stint (order of appearances within a season).

teamID

a factor giving the player's team

lgID

a factor with levels AL NL

G

number of games played

AB

number of at bats

R

number of runs

H

number of hits

DOUBLE

number of doubles. "2B" in original database.

TRIPLE

number of triples. "3B" in original database.

HR

number of home runs

RBI

number of runs batted in

SB

number of stolen bases

CS

number of times caught stealing

BB

number of bases on balls (walks)

SO

number of strikeouts

IBB

number of intentional walks

HBP

number of times hit by a pitch

SH

number of sacrifice hits

SF

number of sacrifice flies

GIDP

number of times grounded into a double play
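
Because rows are recorded per player and stint, a player who changed teams during the season may appear in more than one row. The following is a minimal sketch of one way to combine stints into season totals. The data frame toy and its values are invented for illustration and only mimic the column names above; the same call should work with data = batting.

```r
# Toy rows standing in for `batting`; the values are invented, not real data
toy <- data.frame(
  playerID = c("aaa01", "aaa01", "bbb02"),
  stintID  = c(1, 2, 1),
  AB       = c(200, 150, 400),
  H        = c(60, 45, 100),
  stringsAsFactors = FALSE
)

# Sum the counting statistics over each player's stints
season <- aggregate(cbind(AB, H) ~ playerID, data = toy, FUN = sum)

# Compute the rate from the summed counts, not by averaging per-stint rates
season$BA <- season$H / season$AB
print(season)
```

Rates such as the batting average are computed after summing, since averaging the per-stint rates would weight a short stint as heavily as a long one.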

Details

Baseball fans are “statistics” crazy. They love to talk about things like RBIs, BAs, and OBPs, and to do so they need the numbers. This data comes from the Lahman baseball database at http://www.seanlahman.com/. The complete database covers all of baseball, not just the 2002 season presented here.
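
As an illustration of the kind of comparison fans make, the sketch below computes a batting average for each league from total hits and total at bats. The data frame toy is hypothetical and its values are invented; it only reuses the column names lgID, AB, and H documented above, so batting could be substituted for it.

```r
# Toy rows standing in for `batting`; the values are invented, not real data
toy <- data.frame(
  lgID = factor(c("AL", "AL", "NL", "NL")),
  AB   = c(500, 100, 400, 200),
  H    = c(150, 20, 100, 60)
)

# Total hits and at bats within each league
hits    <- tapply(toy$H,  toy$lgID, sum)
at_bats <- tapply(toy$AB, toy$lgID, sum)

# League batting average: total hits divided by total at bats
league_BA <- hits / at_bats
print(round(league_BA, 3))
```

Dividing the totals weights each player by at bats, which generally differs from the mean of the individual batting averages.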

References

In addition to the data set above, the book Curve Ball, by Albert, J. and Bennett, J., Copernicus Books, gives an extensive statistical analysis of baseball.

See https://www.baseball-almanac.com/stats.shtml for definitions of common baseball statistics.

Examples

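library(UsingR)   # provides the batting data set
data(batting)
# with() evaluates each expression inside the data frame, so attach() is not needed
BA <- with(batting, H / AB)   # batting average
OBP <- with(batting, (H + BB + HBP) / (AB + BB + HBP + SF))   # on-base "percentage"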
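
The same formulas can be wrapped in a small helper and used to rank players. This is only a sketch: add_rates is a hypothetical function, not part of UsingR, and toy is an invented data frame that mimics the relevant columns of batting.

```r
# Hypothetical helper (not part of UsingR): adds BA and OBP columns
# to a data frame that has the columns H, AB, BB, HBP and SF
add_rates <- function(d) {
  d$BA  <- d$H / d$AB
  d$OBP <- (d$H + d$BB + d$HBP) / (d$AB + d$BB + d$HBP + d$SF)
  d
}

# Toy rows standing in for `batting`; the values are invented, not real data
toy <- data.frame(
  playerID = c("aaa01", "bbb02", "ccc03"),
  AB  = c(500, 400, 300),
  H   = c(150, 100, 96),
  BB  = c(50, 80, 20),
  HBP = c(5, 0, 4),
  SF  = c(5, 0, 6),
  stringsAsFactors = FALSE
)

rated <- add_rates(toy)

# Sort from highest to lowest batting average
top <- rated[order(rated$BA, decreasing = TRUE), c("playerID", "BA", "OBP")]
print(top)
```

With the real data, add_rates(batting) should work in the same way, since the helper only relies on the column names documented above.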
