
cbsodataR (version 1.1)

get_data-deprecated: Get data from Statistics Netherlands (CBS)

Description

This function is deprecated; use cbs_get_data() instead.

Usage

get_data(
  id,
  ...,
  recode = TRUE,
  use_column_title = recode,
  dir = tempdir(),
  base_url = getOption("cbsodataR.base_url", BASE_URL)
)

Value

data.frame with the requested data. Note that a csv copy of the data is stored in dir.

Arguments

id

Identifier of the table; it can be found with cbs_get_datasets().

...

Optional filter statements; see Details.

recode

recodes all codes in the code columns with their Title as found in the metadata

use_column_title

not used.

dir

Directory where the table should be downloaded. Defaults to a temporary directory.

base_url

Optionally specify a different server. Useful for third-party data services implementing the same protocol.

Copyright use

The content of CBS opendata is subject to Creative Commons Attribution (CC BY 4.0). This means that the re-use of the content is permitted, provided Statistics Netherlands is cited as the source. For more information see: https://www.cbs.nl/en-gb/about-us/website/copyright

Details

To reduce download time, the data can optionally be filtered on category values; this is advisable for large tables (> 100k records).

The filter is specified with (see examples below):

  • <column_name> = <values> in which <values> is a character vector. Rows with values that are not part of the character vector are not returned. Note that the values have to be values from the $Key column of the corresponding metadata. These may contain trailing spaces...

  • <column_name> = has_substring(x) in which x is a character vector. Rows with values that do not have a substring that is in x are not returned. Useful substrings are "JJ", "KW", "MM" for Periods (years, quarters, months) and "PV", "CR" and "GM" for Regions (provinces, corops, municipalities).

  • <column_name> = eq(<values>) | has_substring(x), which combines the two statements above.
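
The filter rules above depend on the exact strings in the $Key column, so it can help to inspect those keys before building a query. The following is a minimal offline sketch: periods_meta is a hypothetical, hand-made stand-in for one dimension table of the metadata (such as the Periods element that cbs_get_meta("7196ENG") is assumed to return), and its rows, as well as the key "A01 ", are illustrative rather than downloaded from CBS.

```r
# Hypothetical stand-in for one dimension table of the metadata.
# The rows are made up for illustration; real keys come from cbs_get_meta().
periods_meta <- data.frame(
  Key   = c("2000JJ00", "2000KW01", "2000MM03", "2001MM12"),
  Title = c("2000", "2000 1st quarter", "2000 March", "2001 December"),
  stringsAsFactors = FALSE
)

# Exact filter: only values that occur literally in $Key can match,
# so check the wanted values against the keys first.
wanted <- c("2000MM03", "2001MM12")
all_present <- all(wanted %in% periods_meta$Key)

# Substring filter: preview locally which keys contain "MM",
# mimicking what a monthly has_substring("MM") filter is meant to select.
monthly_keys <- periods_meta$Key[grepl("MM", periods_meta$Key, fixed = TRUE)]

# Keys may carry trailing spaces, so copy them from $Key instead of typing them.
padded_key <- "A01 "  # made-up key with a trailing space
same_after_trim <- identical(trimws(padded_key), padded_key)
```

Copying filter values from the metadata rather than typing them by hand avoids silent mismatches caused by trailing spaces.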

By default the columns will be converted to their type (typed=TRUE). CBS uses multiple types of missing values (unknown, suppressed, not measured, missing): users wanting all these nuances can use typed=FALSE, which results in character columns.
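
As a minimal sketch of an untyped download, assuming cbs_get_data() accepts the typed argument mentioned in the text (it is not part of the deprecated get_data() signature shown under Usage), the call can be wrapped in a small helper. The name get_cpi_untyped is hypothetical; the table id and category code are taken from the examples on this page.

```r
# Hypothetical helper: defining it needs neither the package nor a connection;
# calling it requires cbsodataR and network access.
get_cpi_untyped <- function(period = "2000MM03") {
  cbsodataR::cbs_get_data(
    id      = "7196ENG",  # table id taken from the examples
    Periods = period,     # exact key filter
    CPI     = "000000",   # category code for total
    typed   = FALSE       # assumption: columns are returned as character
  )
}

# Usage (not run here because it needs network access):
# cpi_raw <- get_cpi_untyped("2000MM03")
# str(cpi_raw)
```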

See Also

cbs_get_meta(), cbs_download_data()

Other data retrieval: cbs_add_date_column(), cbs_add_label_columns(), cbs_add_unit_column(), cbs_download_data(), cbs_extract_table_id(), cbs_get_data_from_link()

Other query: eq(), has_substring()

Examples

if (FALSE) {
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = "2000MM03"     # March 2000
            , CPI     = "000000"       # Category code for total 
            )

# useful substrings:
## Periods: "JJ": years, "KW": quarters, "MM": months
## Regions: "NL", "PV": provinces, "GM": municipalities
  
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = has_substring("JJ")     # all years
            , CPI     = "000000"       # Category code for total 
            )

cbs_get_data( id      = "7196ENG"      # table id
            , Periods = c("2000MM03","2001MM12")     # March 2000 and Dec 2001
            , CPI     = "000000"       # Category code for total 
            )

# combine either this
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = has_substring("JJ") | "2000MM01" # all years and Jan 2000
            , CPI     = "000000"       # Category code for total 
            )

# or this: note the "eq" function
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = eq("2000MM01") | has_substring("JJ") # Jan 2000 and all years
            , CPI     = "000000"       # Category code for total 
            )
}
