The get_estimates() function requests data from the US Census Bureau's Population Estimates Program (PEP) datasets. The PEP datasets are defined by the US Census Bureau as follows: "The Census Bureau's Population Estimates Program (PEP) produces estimates of the population for the United States, its states, counties, cities, and towns, as well as for the Commonwealth of Puerto Rico and its municipios. Demographic components of population change (births, deaths, and migration) are produced at the national, state, and county levels of geography. Additionally, housing unit estimates are produced for the nation, states, and counties. PEP annually utilizes current data on births, deaths, and migration to calculate population change since the most recent decennial census and produce a time series of estimates of population, demographic components of change, and housing units. The annual time series of estimates begins with the most recent decennial census data and extends to the vintage year. As each vintage of estimates includes all years since the most recent decennial census, the latest vintage of data available supersedes all previously-produced estimates for those dates."
get_estimates(
geography = c("us", "region", "division", "state", "county", "county subdivision",
"place/balance (or part)", "place", "consolidated city", "place (or part)",
"metropolitan statistical area/micropolitan statistical area", "cbsa",
"metropolitan division", "combined statistical area"),
product = NULL,
variables = NULL,
breakdown = NULL,
breakdown_labels = FALSE,
vintage = 2022,
year = vintage,
state = NULL,
county = NULL,
time_series = FALSE,
output = "tidy",
geometry = FALSE,
keep_geo_vars = FALSE,
shift_geo = FALSE,
key = NULL,
show_call = FALSE,
...
)
A tibble, or sf tibble, of population estimates data
The geography of your data. Available geographies for the most recent data vintage are listed here. "cbsa" may be used as an alias for "metropolitan statistical area/micropolitan statistical area".
The data product (optional). "population", "components", "housing", and "characteristics" are supported. For 2020 and later, the only supported product is "characteristics".
A character string or vector of character strings of requested variables. For years 2020 and later, use variables = "all" to request all available variables.
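As a minimal sketch of how the geography, product, and vintage arguments fit together, the call below requests state-level estimates from the 2019 vintage. It assumes tidycensus is installed and that a Census API key is stored in the CENSUS_API_KEY environment variable; the `can_query` guard is an illustrative addition, not part of the package.

```r
# Illustrative guard (not part of tidycensus): only query when the package
# is installed and an API key is available in the environment.
can_query <- requireNamespace("tidycensus", quietly = TRUE) &&
  nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_query) {
  # State-level estimates from the "population" product, 2019 vintage
  state_pop <- tidycensus::get_estimates(
    geography = "state",
    product = "population",
    vintage = 2019
  )
  print(head(state_pop))
}
```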
The population breakdown used when product = "characteristics". Acceptable values are "AGEGROUP", "RACE", "SEX", and "HISP" (for Hispanic/Not Hispanic). These values can be combined in a vector, returning population estimates in the value column for all combinations of these breakdowns. For years 2020 and later, "AGE" is also available for single-year age when using geography = "state".
Whether or not to label breakdown elements returned when product = "characteristics". Defaults to FALSE.
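The breakdown arguments can be sketched as follows. The `valid_breakdowns` vector and the `invalid_breakdowns()` helper are hypothetical conveniences for checking a request against the values listed above before sending it; they are not part of tidycensus. The guarded call assumes the package is installed and an API key is stored in CENSUS_API_KEY.

```r
# Breakdown values documented for product = "characteristics"
# ("AGE" is omitted here because it only applies to 2020 and later
# with geography = "state").
valid_breakdowns <- c("AGEGROUP", "RACE", "SEX", "HISP")

# Hypothetical helper: returns requested values that are not documented
invalid_breakdowns <- function(requested) {
  setdiff(requested, valid_breakdowns)
}

requested <- c("SEX", "HISP")
stopifnot(length(invalid_breakdowns(requested)) == 0)

# Illustrative guard (not part of tidycensus)
can_query <- requireNamespace("tidycensus", quietly = TRUE) &&
  nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_query) {
  # Estimates for every SEX x HISP combination in Arizona, with labels
  az_chars <- tidycensus::get_estimates(
    geography = "state",
    product = "characteristics",
    breakdown = requested,
    breakdown_labels = TRUE,
    state = "AZ",
    vintage = 2019
  )
  print(head(az_chars))
}
```

Checking the request locally first is a design choice for faster feedback; the function itself may report unsupported values differently.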
It is recommended to use the most recent vintage available for a given decennial series (so, year = 2019 for the 2010s, and year = 2023 for the 2020s). Will default to 2022 until the full PEP for 2023 is released.
The data year (defaults to the vintage requested). Use time_series = TRUE to access time-series estimates.
The state for which you are requesting data. State names, postal codes, and FIPS codes are accepted. Defaults to NULL.
The county for which you are requesting data. County names and FIPS codes are accepted. Must be combined with a value supplied to `state`. Defaults to NULL.
If TRUE, the function will return a time series of observations back to the decennial Census
of 2010. The returned column is either "DATE", representing a particular estimate date, or "PERIOD",
representing a time period (e.g. births between 2016 and 2017), and contains integers representing
those values. Integer to date or period mapping is available at
https://www.census.gov/data/developers/data-sets/popest-popproj/popest/popest-vars/2019.html.
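A time-series request might look like the sketch below, which asks for county-level estimates in Texas from the 2019 vintage and then inspects whichever time index column comes back. It assumes tidycensus is installed and an API key is stored in CENSUS_API_KEY; the guard and the column lookup are illustrative additions.

```r
# Illustrative guard (not part of tidycensus)
can_query <- requireNamespace("tidycensus", quietly = TRUE) &&
  nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_query) {
  county_ts <- tidycensus::get_estimates(
    geography = "county",
    state = "TX",
    product = "population",
    vintage = 2019,
    time_series = TRUE
  )

  # The time index is stored in either DATE or PERIOD as integer codes
  time_col <- intersect(c("DATE", "PERIOD"), names(county_ts))
  if (length(time_col) == 1) {
    print(sort(unique(county_ts[[time_col]])))
  }
}
```

The integer codes printed at the end still need to be mapped to dates or periods using the Census page linked above.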
One of "tidy" (the default), in which each row represents an enumeration unit-variable combination, or "wide", in which each row represents an enumeration unit and the variables are in the columns.
If FALSE (the default), returns a regular tibble of population estimates data. If TRUE, uses the tigris package to return an sf tibble with simple feature geometry in the `geometry` column.
If TRUE, keeps all the variables from the Census shapefile obtained by tigris. Defaults to FALSE.
(deprecated) If TRUE, returns geometry with Alaska and Hawaii shifted for thematic mapping of the entire US. As of May 2021, we recommend using tigris::shift_geometry() instead.
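The recommended replacement for the deprecated shifting option can be sketched as a two-step workflow: request geometry, then pass the result to tigris::shift_geometry(). This assumes tidycensus and tigris are both installed and an API key is stored in CENSUS_API_KEY; the guard is an illustrative addition.

```r
# Illustrative guard (not part of tidycensus): geometry support needs tigris
can_query <- requireNamespace("tidycensus", quietly = TRUE) &&
  requireNamespace("tigris", quietly = TRUE) &&
  nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_query) {
  # Request an sf tibble with state boundaries attached
  states_sf <- tidycensus::get_estimates(
    geography = "state",
    product = "population",
    vintage = 2019,
    geometry = TRUE
  )

  # Reposition Alaska and Hawaii (and Puerto Rico, if present) for mapping
  states_shifted <- tigris::shift_geometry(states_sf)
  print(class(states_shifted))
}
```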
Your Census API key.
Obtain one at https://api.census.gov/data/key_signup.html. Can be stored
in your .Renviron with census_api_key("YOUR KEY", install = TRUE).
If TRUE, displays the call made to the Census API. This can be very useful in debugging and determining whether error messages returned are due to tidycensus or the Census API. Copy the API call into a browser to see what is returned by the API directly. Defaults to FALSE.
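For debugging, the key and the call display option can be combined as in the sketch below. It assumes the key was previously stored with census_api_key(), so that it is available in the CENSUS_API_KEY environment variable; the `api_key` and `can_query` names are illustrative.

```r
# Read the stored key; an empty string means no key has been installed yet
api_key <- Sys.getenv("CENSUS_API_KEY")

# Illustrative guard (not part of tidycensus)
can_query <- requireNamespace("tidycensus", quietly = TRUE) && nzchar(api_key)

if (can_query) {
  # show_call = TRUE displays the API request, which can then be pasted
  # into a browser to inspect the raw response
  state_pop <- tidycensus::get_estimates(
    geography = "state",
    product = "population",
    vintage = 2019,
    key = api_key,
    show_call = TRUE
  )
}
```

Passing the key explicitly is optional when it is already stored; it is shown here only to make the dependency visible.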
Other keyword arguments.
get_estimates() requests data from the Population Estimates API for years 2019 and earlier; however, the Population Estimates are no longer supported on the API as of 2020. For recent years, get_estimates() reads a flat file from the Census website and parses it. This means that arguments and output for 2020 and later datasets may differ slightly from datasets acquired for 2019 and earlier.
As of April 2022, variables available for 2020 and later datasets are as follows: ESTIMATESBASE, POPESTIMATE, NPOPCHG, BIRTHS, DEATHS, NATURALCHG, INTERNATIONALMIG, DOMESTICMIG, NETMIG, RESIDUAL, GQESTIMATESBASE, GQESTIMATES, RBIRTH, RDEATH, RNATURALCHG, RINTERNATIONALMIG, RDOMESTICMIG, and RNETMIG.
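For 2020 and later vintages, a request can be checked against the variable list above before it is sent. In the sketch below, `pep_2020s_vars`, `requested`, and `run_query` are illustrative names, and the check itself is not something tidycensus requires. The download step needs network access, so it is limited here to interactive sessions with the package installed.

```r
# Variables documented as available for 2020 and later datasets
pep_2020s_vars <- c(
  "ESTIMATESBASE", "POPESTIMATE", "NPOPCHG", "BIRTHS", "DEATHS",
  "NATURALCHG", "INTERNATIONALMIG", "DOMESTICMIG", "NETMIG", "RESIDUAL",
  "GQESTIMATESBASE", "GQESTIMATES", "RBIRTH", "RDEATH", "RNATURALCHG",
  "RINTERNATIONALMIG", "RDOMESTICMIG", "RNETMIG"
)

# Confirm every requested variable appears in the documented list
requested <- c("POPESTIMATE", "BIRTHS", "DEATHS")
unknown <- setdiff(requested, pep_2020s_vars)
stopifnot(length(unknown) == 0)

# Illustrative guard (not part of tidycensus)
run_query <- interactive() && requireNamespace("tidycensus", quietly = TRUE)

if (run_query) {
  # 2022 vintage: data are read from a flat file rather than the API
  state_change <- tidycensus::get_estimates(
    geography = "state",
    variables = requested,
    vintage = 2022
  )
  print(head(state_change))
}
```

Using variables = "all" instead of a named subset would request every available variable for the vintage.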