
repmis (version 0.5)

source_data: Load plain-text data and RData from a URL (either http or https)

Description

source_data loads plain-text or RDATA formatted data stored at a URL (both http and https) into R.

Usage

source_data(url, rdata, sha1 = NULL, cache = FALSE, clearCache = FALSE, sep = "auto", header = "auto", stringsAsFactors = FALSE, envir = parent.frame(), ...)

Arguments

url
The data's URL. To distinguish plain-text data from RDATA, the URL must end in a distinguishing file extension.
rdata
logical. Whether or not the data set is an .RDATA file. If not specified, then source_data will attempt to determine whether or not the file is an .RDATA file from the URL's extension.
sha1
Character string of the file's SHA-1 hash, generated by source_data. Note if you are using data stored using Git, this is not the file's commit SHA-1 hash.
cache
logical. Whether or not to cache the data so that it is not downloaded every time the function is called.
clearCache
logical. Whether or not to clear the downloaded data from the cache.
sep
The separator method for the plain-text data. For example, to load comma-separated values data (CSV) use sep = ",". To load tab-separated values data (TSV) use sep = "\t". Only relevant for plain-text data.
header
Logical, whether or not the first line of the file is the header (i.e. variable names).
stringsAsFactors
logical. Convert all character columns to factors?
envir
the environment where the data should be loaded.
...
additional arguments passed to fread or load as relevant.
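
The extension-based detection described for the rdata argument can be sketched as follows. This is only an illustration: is_rdata_url is a hypothetical helper, not a function in repmis, and the set of extensions it accepts (.RData and .rda, case-insensitive) is an assumption rather than the package's documented behaviour.

```r
# Hypothetical helper, NOT part of repmis: guess whether a URL points to an
# RData file by looking at its file extension.
# Assumption: ".RData" and ".rda" (any case) count as RData extensions.
is_rdata_url <- function(url) {
  # Drop any query string or fragment before checking the extension
  path <- sub("[?#].*$", "", url)
  grepl("\\.(rdata|rda)$", path, ignore.case = TRUE)
}

# example.com URLs are placeholders used only to exercise the helper
is_rdata_url("https://example.com/data/survey.RData")
is_rdata_url("https://example.com/data/survey.csv")
```

If the URL has no such extension, for example a shortened link, setting rdata explicitly avoids relying on this kind of guess.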

Value

a data frame

Source

Originally based on source_url from Hadley Wickham's devtools package.

Details

Loads plain-text data (e.g. CSV, TSV) or RDATA from a URL. Works with both HTTP and HTTPS sites. Note: the URL you give for the url argument must be for the RAW version of the file. The function should work to download plain-text data from any secure URL (https), though I have not verified this.

From the source_url documentation: "If a SHA-1 hash is specified with the sha1 argument, then this function will check the SHA-1 hash of the downloaded file to make sure it matches the expected value, and throw an error if it does not match. If the SHA-1 hash is not specified, it will print a message displaying the hash of the downloaded file. The purpose of this is to improve security when running remotely-hosted code; if you have a hash of the file, you can be sure that it has not changed."
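
The hash check quoted above can be sketched locally with the digest package. This is a minimal sketch, not the code used inside source_data: check_sha1 is a hypothetical helper, and a small temporary file stands in for the downloaded data so that the example runs without network access.

```r
library(digest)

# Hypothetical helper, NOT part of repmis: compare a file's SHA-1 hash with
# an expected value, or report the hash when none is given.
check_sha1 <- function(path, sha1 = NULL) {
  # file = TRUE makes digest() hash the file's contents, not the path string
  file_sha1 <- digest(path, algo = "sha1", file = TRUE)
  if (is.null(sha1)) {
    message("SHA-1 hash of file is ", file_sha1)
  } else if (!identical(file_sha1, tolower(sha1))) {
    stop("SHA-1 hash of file (", file_sha1,
         ") does not match expected value (", sha1, ")")
  }
  invisible(file_sha1)
}

# Stand-in for a downloaded file: the three bytes "abc" with no newline
tmp <- tempfile()
writeBin(charToRaw("abc"), tmp)

check_sha1(tmp)  # no expected hash: prints the hash as a message
check_sha1(tmp, sha1 = "a9993e364706816aba3e25717850c26c9cd0d89d")  # matches
```

A typical workflow is to call the function once without sha1, record the reported hash, and pass it on later calls so that any change to the remote file raises an error.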

See Also

httr, fread, and load

Examples

## Not run: 
# Download electoral disproportionality data stored on GitHub
# Note: Using shortened URL created by bitly
DisData <- source_data("http://bit.ly/156oQ7a")

# Check to see if SHA-1 hash matches downloaded file
DisDataHash <- source_data("http://bit.ly/Ss6zDO",
   sha1 = "dc8110d6dff32f682bd2f2fdbacb89e37b94f95d")
## End(Not run)
