zi_aggregate: Aggregate ZCTAs to Three-digit ZCTAs

Description

This function takes input ZCTA data and aggregates it to three-digit areas, which are considerably larger. These regions are sometimes used in American health care contexts for publishing geographic identifiers.

Usage

zi_aggregate(.data, year, extensive = NULL, intensive = NULL,
    intensive_method = "mean", survey, output = "tidy", zcta = NULL,
    key = NULL)

Value

A tibble containing all aggregated data requested in either "tidy" or "wide" format.

Arguments

.data: A tidy set of demographic data containing one or more variables that should be aggregated to three-digit ZCTAs. This data frame or tibble should contain all five-digit ZCTAs within the three-digit ZCTAs that you plan to use for aggregating data. See Details below for formatting requirements.
year: A four-digit numeric scalar for year. zippeR currently supports data from 2010 to 2022. Different survey products are available for different years. See the survey parameter for more details.
extensive: A character scalar or vector listing all extensive (i.e. count data) variables you wish to aggregate. These will be summed. For American Community Survey data, the margin of error will be calculated by taking the square root of the summed, squared margins of error for each five-digit ZCTA within a given three-digit ZCTA.
intensive: A character scalar or vector listing all intensive (i.e. ratio, percent, or median data) variables you wish to aggregate. These will be combined using the approach listed for intensive_method.
intensive_method: A character scalar; either "mean" (default) or "median". In either case, a weighted approach is used, in which the total population of each five-digit ZCTA determines that ZCTA's weight. For American Community Survey data, this is applied to the margin of error as well.
survey: A character scalar representing the Census product. It can be either a Decennial Census product (either "sf1" or "sf3") or an American Community Survey product (either "acs1", "acs3", or "acs5"). For Decennial Census calls, only the 2010 Census is available. In addition, if a variable cannot be found in "sf1", the function will look in "sf3". Also note that "acs3" was discontinued after 2013.
output: A character scalar; one of "tidy" (long output) or "wide" depending on the type of data format you want. If you are planning to join these data with geometric data, "wide" is the strongly encouraged format.
zcta: An optional vector of ZCTAs that demographic data are requested for. If this is NULL, data will be returned for all ZCTAs. If a vector is supplied, only data for those requested ZCTAs will be returned. The vector can be created with zi_get_geometry(). If that call used style = "zcta5", this vector should be made up of five-digit GEOID values. If it used style = "zcta3", this vector should be made up of three-digit ZCTA3 values.
key: A Census API key, which can be obtained at https://api.census.gov/data/key_signup.html. This can be omitted if tidycensus::census_api_key() has been used to write your key to your .Renviron file. You can check whether an API key has been written to .Renviron by using Sys.getenv("CENSUS_API_KEY").
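
The arithmetic described for the extensive and intensive arguments can be sketched in base R. This is only an illustration under stated assumptions, not zippeR's internal code: the toy data frame, its column names (pop, pop_moe, income), and the rule that a three-digit ZCTA is the first three characters of GEOID are all assumptions made for the example, and the values are invented rather than real Census estimates.

```r
# Toy five-digit ZCTA data (invented values; column names are hypothetical)
toy <- data.frame(
  GEOID   = c("63101", "63102", "63103", "65201", "65202"),
  pop     = c(1000, 3000, 6000, 2000, 2000),        # extensive (count) estimate
  pop_moe = c(30, 40, 120, 60, 80),                 # margin of error for pop
  income  = c(40000, 50000, 60000, 30000, 50000)    # intensive (median) estimate
)

# Assumed grouping rule: three-digit area = first three characters of GEOID
toy$ZCTA3 <- substr(toy$GEOID, 1, 3)

# Extensive variables: sum the estimates within each three-digit area
pop_sum <- tapply(toy$pop, toy$ZCTA3, sum)

# Margin of error: square root of the summed, squared margins of error
pop_moe <- tapply(toy$pop_moe, toy$ZCTA3, function(m) sqrt(sum(m^2)))

# Intensive variables with the "mean" method: population-weighted mean
income_wmean <- sapply(
  split(toy, toy$ZCTA3),
  function(d) stats::weighted.mean(d$income, w = d$pop)
)

# Assemble one row per three-digit area
result <- data.frame(
  ZCTA3   = names(pop_sum),
  pop     = as.numeric(pop_sum),
  pop_moe = as.numeric(pop_moe),
  income  = as.numeric(income_wmean)
)
result
```

With these toy values, area 631 has a summed population of 10000, a margin of error of 130, and a weighted mean income of 55000; area 652 has 4000, 100, and 40000. The "median" method and the weighting of intensive margins of error are not shown here.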

Examples

# load sample demographic data
mo22_demos <- zi_mo_pop

  # the above data can be replicated with the following code:
  # zi_get_demographics(year = 2022, variables = c("B01003_001", "B19013_001"),
  #   survey = "acs5")

# load sample geometric data
mo22_zcta3 <- zi_mo_zcta3

  # the above data can be replicated with the following code:
  # zi_get_geometry(year = 2022, style = "zcta3", state = "MO",
  #   method = "intersect")

# aggregate a single variable
zi_aggregate(mo22_demos, year = 2020, extensive = "B01003_001", survey = "acs5",
  zcta = mo22_zcta3$ZCTA3)

# \donttest{
# aggregate multiple variables, outputting wide data
zi_aggregate(mo22_demos, year = 2020,
  extensive = "B01003_001", intensive = "B19013_001", survey = "acs5",
  zcta = mo22_zcta3$ZCTA3, output = "wide")
# }
