wb_data: Download Data from the World Bank API

Description

This function downloads the requested information using the World Bank API.

Usage

wb_data(
  indicator,
  country = "countries_only",
  start_date,
  end_date,
  return_wide = TRUE,
  mrv,
  mrnev,
  cache,
  freq,
  gapfill = FALSE,
  date_as_class_date = FALSE,
  lang
)

Arguments

indicator

Character vector of indicator codes. These codes correspond to the indicator_id column from the indicators tibble of wb_cache(), wb_cachelist, or the result of running wb_indicators() directly.

country

Character vector of country, region, or special value codes for the locations you want to return data for. Permissible values can be found in the countries tibble in wb_cachelist or by running wb_countries() directly. Specifically, these are the values listed in the iso3c, iso2c, country, region, admin_region, and income_level fields and in all of the region_*, admin_region_*, and income_level_* columns, as well as the following special values:

"countries_only" (Default)
"regions_only"
"admin_regions_only"
"income_levels_only"
"aggregates_only"
"all"

start_date

Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1". This also accepts a special value of "YTD", useful for more frequently updated subannual indicators.
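The subannual formats described above can be built from ordinary Date values. The following is a minimal sketch in base R; to_wb_month() and to_wb_quarter() are hypothetical helper names used only for illustration and are not part of the package.

```r
# Hypothetical helpers (not part of wbstats): turn a Date into the
# monthly ("2016M01") or quarterly ("2016Q1") strings described above.
to_wb_month <- function(d) {
  format(d, "%YM%m")  # four digit year, a literal "M", two digit month
}

to_wb_quarter <- function(d) {
  month <- as.integer(format(d, "%m"))
  paste0(format(d, "%Y"), "Q", (month - 1) %/% 3 + 1)
}

d <- as.Date("2016-01-15")
to_wb_month(d)    # "2016M01"
to_wb_quarter(d)  # "2016Q1"
```

The resulting strings can then be passed as start_date or end_date, together with a matching freq, for an indicator that is published at that frequency.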

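end_date

Numeric or character. Accepts the same formats as start_date: a four digit year if numeric, or for subannual data a string such as "2016M12" (monthly) or "2016Q4" (quarterly).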

return_wide

Logical. If TRUE, data is returned in a wide format instead of long, with a column named for each indicator_id. If the indicator argument is a named vector, the names() given to the indicator are used as the column names. To make this transformation possible, the indicator column that provides the human readable description is dropped, but it is kept as a column label. Default is TRUE.
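To make the long versus wide distinction concrete, here is a small sketch in base R on a toy data frame. The values are made up, the column layout is an assumption about what a long result roughly looks like, and the reshape() call only illustrates the shape of the output; it is not how the package performs the transformation.

```r
# Toy long-format data; the values are made up for illustration.
long <- data.frame(
  iso3c = rep("BRA", 4),
  date = c(2006, 2006, 2007, 2007),
  indicator_id = rep(c("SP.POP.TOTL", "NY.GDP.MKTP.CD"), 2),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Wide format: one row per location and date, one column per indicator_id.
wide <- reshape(long, idvar = c("iso3c", "date"),
                timevar = "indicator_id", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))

# With a named indicator vector, the names become the column names.
my_indicators <- c(pop = "SP.POP.TOTL", gdp = "NY.GDP.MKTP.CD")
names(wide)[match(my_indicators, names(wide))] <- names(my_indicators)

names(wide)  # "iso3c" "date" "pop" "gdp"
```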

mrv

Numeric. The number of Most Recent Values to return. A replacement for start_date and end_date, this number represents the number of observations you wish to return, starting from the most recent date of collection. This may include missing values. Useful in conjunction with freq.

mrnev

Numeric. The number of Most Recent Non Empty Values to return. A replacement for start_date and end_date, similar in behavior to mrv but excludes locations with missing values. Useful in conjunction with freq.
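The difference between mrv and mrnev can be pictured on a single toy series. This sketch is base R only, uses made-up values, and reflects one reading of the two descriptions above (most recent periods versus most recent periods that have a value); it does not call the API, and the actual behavior for a given indicator should be checked against real results.

```r
# Toy annual series, oldest to newest; NA marks a year with no observation.
series <- c("2016" = 10, "2017" = 11, "2018" = NA, "2019" = 12, "2020" = NA)

# mrv-like: the 3 most recent periods, missing values included.
mrv_like <- tail(series, 3)

# mrnev-like: the 3 most recent periods that actually have a value.
mrnev_like <- tail(series[!is.na(series)], 3)

names(mrv_like)    # "2018" "2019" "2020"
names(mrnev_like)  # "2016" "2017" "2019"
```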

cache

List of tibbles returned from wb_cache(). If omitted, wb_cachelist is used.

freq

Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values. Useful for querying high frequency data.

gapfill

Logical. If TRUE, fills in missing values by carrying forward the last available value until the next available period (the maximum number of periods backtracked is limited by the mrv number). Default is FALSE.
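Carrying the last available value forward can be sketched in base R as follows. The carry_forward() helper is hypothetical and only illustrates the idea on a toy vector; the actual filling is done by the API, and this sketch ignores the mrv limit mentioned above.

```r
# Hypothetical helper: replace each NA with the last non-missing value
# that precedes it. Leading NAs stay NA because nothing precedes them.
carry_forward <- function(x) {
  idx <- cumsum(!is.na(x))        # how many non-missing values seen so far
  c(NA, x[!is.na(x)])[idx + 1]    # position 1 holds NA for "none seen yet"
}

carry_forward(c(NA, 1, NA, NA, 4, NA))
# NA  1  1  1  4  4
```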

date_as_class_date

Logical. If TRUE, the date field is returned as class Date, which is useful when working with non-annual data or data at mixed resolutions. Default is FALSE.

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support your requested language, it will return NA by default.

Value

A tibble of all available requested data.

Details

`obs_status` column

Indicates the observation status for a given location, indicator, and date combination. For example, "F" in the response indicates that the observation status for that data point is "forecast".
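One common use of this column is to separate forecasts from observed values. The sketch below uses a toy data frame with made-up values; it assumes that rows without a special status carry NA in obs_status, which should be verified against an actual result.

```r
# Toy result; values are made up. Assumption: rows with no special
# observation status have NA in obs_status.
df <- data.frame(
  iso3c = c("BRA", "BRA", "BRA"),
  date = c(2019, 2020, 2021),
  value = c(1.0, 1.1, 1.2),
  obs_status = c(NA, NA, "F"),
  stringsAsFactors = FALSE
)

# Keep everything that is not flagged as a forecast.
observed <- df[is.na(df$obs_status) | df$obs_status != "F", ]
observed$date  # 2019 2020
```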

Examples

# NOT RUN {
# GDP for all countries for all available dates
df_gdp <- wb_data("NY.GDP.MKTP.CD")

# Brazilian GDP for all available dates
df_brazil <- wb_data("NY.GDP.MKTP.CD", country = "br")

# Brazilian GDP for 2006
df_brazil_1 <- wb_data("NY.GDP.MKTP.CD", country = "brazil", start_date = 2006)

# Brazilian GDP for 2006-2010
df_brazil_2 <- wb_data("NY.GDP.MKTP.CD", country = "BRA",
                       start_date = 2006, end_date = 2010)

# Population, GDP, Unemployment Rate, Birth Rate (per 1000 people)
my_indicators <- c("SP.POP.TOTL",
                   "NY.GDP.MKTP.CD",
                   "SL.UEM.TOTL.ZS",
                   "SP.DYN.CBRT.IN")

df <- wb_data(my_indicators)

# you can pass multiple country ids of different types
# Albania (iso2c), Georgia (iso3c), and Mongolia
my_countries <- c("AL", "Geo", "mongolia")
df <- wb_data(my_indicators, country = my_countries,
              start_date = 2005, end_date = 2007)

# same data as above, but in long format
df_long <- wb_data(my_indicators, country = my_countries,
                   start_date = 2005, end_date = 2007,
                   return_wide = FALSE)

# regional population totals
# regions correspond to the region column in wb_cachelist$countries
df_region <- wb_data("SP.POP.TOTL", country = "regions_only",
                     start_date = 2010, end_date = 2014)

# a specific region
df_world <- wb_data("SP.POP.TOTL", country = "world",
                    start_date = 2010, end_date = 2014)

# if the indicator is part of a named vector the name will be the column name
my_indicators <- c("pop" = "SP.POP.TOTL",
                   "gdp" = "NY.GDP.MKTP.CD",
                   "unemployment_rate" = "SL.UEM.TOTL.ZS",
                   "birth_rate" = "SP.DYN.CBRT.IN")

df_names <- wb_data(my_indicators, country = "world",
                    start_date = 2010, end_date = 2014)

# custom names are ignored if returning in long format
df_names_long <- wb_data(my_indicators, country = "world",
                         start_date = 2010, end_date = 2014,
                         return_wide = FALSE)

# same as above but in Bulgarian
# note that not all indicators have translations for all languages
df_names_long_bg <- wb_data(my_indicators, country = "world",
                            start_date = 2010, end_date = 2014,
                            return_wide = FALSE, lang = "bg")
# }
