Retrieve the headers for a URL for a supported protocol such as http://, ftp://, https:// and ftps://. An optional function, not supported on all platforms.
curlGetHeaders(url, redirect = TRUE, verify = TRUE)
url: character string specifying the URL.
redirect: logical: should redirections be followed?
verify: logical: should certificates be verified as valid and applying to that host?
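For illustration, a minimal sketch (both calls need internet access, and www.r-project.org is merely an example host):

h  <- curlGetHeaders("https://www.r-project.org/")                   ## follow redirections
h2 <- curlGetHeaders("https://www.r-project.org/", redirect = FALSE) ## first response only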
A character vector with integer attribute "status" (the last-received ‘status’ code). If redirection occurs this will include the headers for all the URLs visited.
For the interpretation of ‘status’ codes see https://en.wikipedia.org/wiki/List_of_HTTP_status_codes and https://en.wikipedia.org/wiki/List_of_FTP_server_return_codes. A successful FTP connection will usually have status 250 or 350.
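For example, the last status code can be read off the attribute (a sketch; the host is illustrative):

h <- curlGetHeaders("https://www.r-project.org/")
attr(h, "status")   ## e.g. 200 for a successful HTTP request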
This reports what curl -I -L or curl -I would report. For an ftp:// URL the ‘headers’ are a record of the conversation between client and server before data transfer.
Only 500 header lines will be reported: there is a limit of 20 redirections so this should suffice (and even 20 would indicate problems).
It uses getOption("timeout") for the connection timeout: that defaults to 60 seconds. As this cannot be interrupted you may want to consider a shorter value.
To see all the details of the interaction with the server(s) set options(internet.info = 1).
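A sketch combining both settings (the values chosen are illustrative):

old <- options(timeout = 10, internet.info = 1)  ## shorter timeout, verbose tracing
h <- curlGetHeaders("https://www.r-project.org/")
options(old)                                     ## restore the previous values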
HTTP[S] servers are allowed to refuse requests to read the headers and some do: this will result in a status of 405.
For possible issues with secure URLs (especially on Windows) see download.file.
There is a security risk in not verifying certificates, but as only the headers are captured it is slight. Usually looking at the URL in a browser will reveal what the problem is (and it may well be machine-specific).
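For example, one might cautiously retry a failing host without verification to see whether the certificate is the problem (a sketch; the host is purely illustrative):

try(curlGetHeaders("https://expired.badssl.com/"))            ## errors if the certificate is invalid
curlGetHeaders("https://expired.badssl.com/", verify = FALSE) ## skips the certificate check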
capabilities("libcurl")
to see if this is supported.
The options HTTPUserAgent and timeout are used.
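A sketch of checking for support and supplying a custom agent string (the value is illustrative):

if (capabilities("libcurl")) {
    old <- options(HTTPUserAgent = "my-script/0.1")
    h <- curlGetHeaders("https://www.r-project.org/")
    options(old)
}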
## a not-always-available site:
curlGetHeaders("ftps://test.rebex.net/readme.txt")