GDELTtools (version 1.2)

GetGDELT: Download and subset GDELT data

Description

Download the GDELT files necessary for a data set, import them, filter on various criteria, and return a data.frame.

Usage

GetGDELT(start.date, end.date = start.date, filter,
  local.folder = tempdir(), max.local.mb = Inf, allow.wildcards = FALSE,
  use.regex = FALSE, data.url.root = "http://data.gdeltproject.org/events/",
  verbose = TRUE)

Arguments

start.date

character, just about any human-readable form of the earliest date to include.

end.date

character, just about any human-readable form of the latest date to include.

filter

list, named list encoding the values to include for specified fields. See Details.

local.folder

character, if specified, where downloaded files will be saved.

max.local.mb

numeric, the maximum size in MB of the downloaded files that will be retained.

allow.wildcards

logical, must be TRUE to use * in filter to specify 'any character(s)'.

use.regex

logical, if TRUE then filter will be processed as a regular expression.

data.url.root

character, URL for the folder with GDELT data files.

verbose

logical, if TRUE then indications of progress will be displayed.
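
The difference between exact matching, allow.wildcards, and use.regex can be illustrated in base R. This is only a sketch of the three matching styles, not the package's internal code: the codes vector is made up, and translating * with glob2rx() is an assumption about one reasonable way to implement wildcards.

```r
# Made-up field values, for illustration only.
codes <- c("US", "USCA", "USNY", "NI")

# Exact matching (both options FALSE): only the literal value is kept.
exact <- codes[codes %in% "US"]

# Wildcard matching: utils::glob2rx() turns "US*" into an anchored regex,
# so * stands for 'any character(s)'.
wild <- codes[grepl(glob2rx("US*"), codes)]

# Regular-expression matching: the pattern is used as written.
rx <- codes[grepl("^US..$", codes)]

print(exact)  # "US"
print(wild)   # "US" "USCA" "USNY"
print(rx)     # "USCA" "USNY"
```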

Value

data.frame

Filtering Results

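filter is a named list. Each name is a GDELT field (for example, ActionGeo_CountryCode), and the corresponding element is a vector of the values to include for that field. The Examples section shows a filter on two fields.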
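
To make the shape of a filter concrete, the sketch below applies one to a toy data.frame in base R. The apply_filter() helper and the events table are hypothetical. The sketch also assumes that values within a field are alternatives (OR) and that a row must satisfy every field (AND); it is not taken from the package source.

```r
# Toy stand-in for GDELT rows; the real table has many more columns.
events <- data.frame(
  ActionGeo_CountryCode = c("US", "US", "NI", "FR"),
  ActionGeo_ADM1Code    = c("US", "USCA", "NI", "FR"),
  stringsAsFactors = FALSE
)

# Same shape as the filter used in the Examples section.
test.filter <- list(ActionGeo_ADM1Code = c("NI", "US"),
                    ActionGeo_CountryCode = "US")

# Hypothetical helper: keep a row only if, for every named field,
# its value is one of the listed values (assumed semantics).
apply_filter <- function(df, filter) {
  keep <- rep(TRUE, nrow(df))
  for (field in names(filter)) {
    keep <- keep & df[[field]] %in% filter[[field]]
  }
  df[keep, , drop = FALSE]
}

kept <- apply_filter(events, test.filter)
print(kept)
```

Under these assumptions only the first row survives, because it is the only one whose ADM1 code is "NI" or "US" and whose country code is "US".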

Details

If local.folder is not specified then downloaded files are stored in tempdir(). If a needed file has already been downloaded to local.folder then this file is used instead of being downloaded. This can greatly speed up future
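
The reuse of already-downloaded files can be pictured with a small base R sketch. fetch_cached() and folder_size_mb() are hypothetical helpers written for illustration, not functions exported by GDELTtools, and the way the package enforces max.local.mb may differ.

```r
# Hypothetical helper: download a file only if it is not already
# present in local.folder, and return its local path either way.
fetch_cached <- function(file.name, local.folder = tempdir(),
                         data.url.root = "http://data.gdeltproject.org/events/") {
  if (!dir.exists(local.folder)) dir.create(local.folder, recursive = TRUE)
  local.path <- file.path(local.folder, file.name)
  if (!file.exists(local.path)) {
    download.file(paste0(data.url.root, file.name), local.path,
                  mode = "wb", quiet = TRUE)
  }
  local.path
}

# Hypothetical helper: total size of the files in a folder, in MB,
# which could be compared against a limit such as max.local.mb.
folder_size_mb <- function(local.folder) {
  files <- list.files(local.folder, full.names = TRUE)
  sum(file.info(files)$size) / 1024^2
}
```

Calling fetch_cached() twice with the same file name would trigger at most one download; the second call finds the file on disk and returns immediately.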

Dates are parsed with dateParse in the TimeWarp package. Years must be given with four digits.
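
The four-digit-year requirement is easier to appreciate with an example. The sketch below uses base R's as.Date() rather than TimeWarp's dateParse, so it only illustrates the general ambiguity of two-digit years and says nothing about how dateParse itself behaves.

```r
# A four-digit year is unambiguous.
start.date <- format(as.Date("01/05/1979", format = "%m/%d/%Y"))

# A two-digit year has to be guessed: with %y, base R maps 00-68 to 20xx
# and 69-99 to 19xx, so "68" becomes 2068 rather than 1968.
ambiguous <- format(as.Date("01/05/68", format = "%m/%d/%y"))

print(start.date)  # "1979-01-05"
print(ambiguous)   # "2068-01-05"
```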

References

GDELT: Global Data on Events, Location and Tone, 1979-2012. Presented at the 2013 meeting of the International Studies Association in San Francisco, CA. http://www.gdeltproject.org/

Examples

# NOT RUN {
test.filter <- list(ActionGeo_ADM1Code=c("NI", "US"), ActionGeo_CountryCode="US")
test.results <- GetGDELT(start.date="1979-01-01", end.date="1979-12-31",
  filter=test.filter)
table(test.results$ActionGeo_ADM1Code)
table(test.results$ActionGeo_CountryCode)
# }
# NOT RUN {
# Specify a local folder to store the downloaded files
test.results <- GetGDELT(start.date="1979-01-01", end.date="1979-12-31",
                         filter=test.filter,
                         local.folder="~/gdeltdata",
                         max.local.mb=500)
# }
