Lahman (version 12.0-0)

Lahman-package: Sean Lahman's Baseball Database

Description

This database contains pitching, hitting, and fielding statistics for Major League Baseball from 1871 through 2023. It includes data from the two current leagues (American and National), the four other "major" leagues (American Association, Union Association, Players League, and Federal League), and the National Association of 1871-1875.

This database was created by Sean Lahman, who pioneered the effort to make baseball statistics freely available to the general public. What started as a one-man effort in 1994 has grown tremendously, and a team of researchers has since combined its efforts to make this the largest and most accurate source for baseball statistics available anywhere.

This database, in the form of an R package, offers a variety of interesting challenges and opportunities for data processing and visualization in R.

In the current version, the examples make extensive use of the dplyr package for data manipulation (tabulation, queries, summaries, merging, etc.), reflecting the original relational database design, and of the ggplot2 package for graphics.
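As a minimal sketch of that dplyr-plus-ggplot2 style (not one of the package's own examples), the following totals home runs per season from the Batting table and plots the result. The object names hr_by_year and hr_plot are illustrative, and the sketch assumes the Lahman, dplyr, and ggplot2 packages are installed.

```r
library(Lahman)
library(dplyr)
library(ggplot2)

# Summary step: total home runs per season across all players.
hr_by_year <- Batting %>%
  group_by(yearID) %>%
  summarise(total_HR = sum(HR, na.rm = TRUE), .groups = "drop")

# Graphics step: line plot of the yearly totals.
hr_plot <- ggplot(hr_by_year, aes(x = yearID, y = total_HR)) +
  geom_line() +
  labs(x = "Season", y = "Total home runs")

print(hr_plot)
```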

Author

Michael Friendly, Dennis Murphy, Chris Dalzell, Martin Monkman

Maintainer: Chris Dalzell <cdalzell@gmail.com>

Details

Package: Lahman
Type: Package
Version: 12.0-0
Date: 2024-08-24
License: GPL version 2 or newer
LazyLoad: yes
LazyData: yes

The main form of this database is a relational database in Microsoft Access format. The design follows these general principles: Each player is assigned a unique code (playerID). All of the information in different tables relating to that player is tagged with his playerID. The playerIDs are linked to names and birthdates in the People table. Similar links exist among other tables via analogous *ID variables.
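The playerID linkage described above can be sketched as a join between Batting and People. This is an illustrative example under the assumption that dplyr is installed; the object names career_hr and career_named are made up for the sketch.

```r
library(Lahman)
library(dplyr)

# Career home-run totals per playerID from the Batting table.
career_hr <- Batting %>%
  group_by(playerID) %>%
  summarise(career_HR = sum(HR, na.rm = TRUE), .groups = "drop")

# Attach names and birth years from People via the shared playerID key.
career_named <- career_hr %>%
  inner_join(
    People %>% select(playerID, nameFirst, nameLast, birthYear),
    by = "playerID"
  ) %>%
  arrange(desc(career_HR))

head(career_named, 5)
```

The same pattern should carry over to other tables that share an *ID variable, such as joining on teamID or yearID.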

The database is composed of the following main tables:

People

Player names, dates of birth, death and other biographical info

Batting

batting statistics

Pitching

pitching statistics

Fielding

fielding statistics

A collection of other tables is also provided:

Teams:

Teams - yearly stats and standings
TeamsHalf - split season data for teams
TeamsFranchises - franchise information

Post-season play:

BattingPost - post-season batting statistics
PitchingPost - post-season pitching statistics
FieldingPost - post-season fielding data
SeriesPost - post-season series information

Awards:

AwardsManagers - awards won by managers
AwardsPlayers - awards won by players
AwardsShareManagers - award voting for manager awards
AwardsSharePlayers - award voting for player awards

Hall of Fame: links to People via hofID

HallOfFame - Hall of Fame voting data

Other tables:

AllstarFull - All-Star game appearances
Managers - managerial statistics
FieldingOF - outfield position data
ManagersHalf - split season data for managers
Salaries - player salary data
Appearances - data on player appearances
Schools - information on schools players attended
CollegePlaying - information on schools players attended, by player and year

Variable label tables are provided for some of the tables:

battingLabels, pitchingLabels, fieldingLabels
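A short sketch of how a label table might be used to look up descriptions of column names. It assumes each label table is a data frame with columns named variable and label; label_for is a hypothetical helper written for this example, not a function from the package.

```r
library(Lahman)

# Inspect the structure; assumed columns are `variable` and `label`.
str(battingLabels)

# Hypothetical helper: return the label for each requested variable name,
# or NA where the name is not found in the label table.
label_for <- function(var, labels = battingLabels) {
  labels$label[match(var, labels$variable)]
}

label_for(c("HR", "RBI"))
```

Passing labels = pitchingLabels or labels = fieldingLabels should work the same way if those tables share the assumed layout.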