
Rcrawler (version 0.1.9-1)

RobotParser: Fetch and parse robots.txt

Description

This function fetches and parses the robots.txt file of the website specified in the first argument and returns the list of corresponding rules.

Usage

RobotParser(website, useragent)

Arguments

website

character, the URL of the website whose rules are to be extracted.

useragent

character, the user agent of the crawler.

Value

Returns a list of three elements: the first is a character vector of Disallowed directories, and the third is a Boolean value that is TRUE if the crawler's user agent is blocked by robots.txt.

Examples

# NOT RUN {
RobotParser("http://www.glofile.com", "AgentX")
# Returns the robots.txt rules and checks whether AgentX is blocked.
# }
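
Building on the example above, the following is a minimal sketch of how the returned list might be used to decide whether to crawl a site. It assumes, as described under Value, that the first element holds the Disallowed rules and the third element is the blocked flag; the variable names are illustrative.

library(Rcrawler)

# Fetch and parse robots.txt for the site, identifying as AgentX.
rules <- RobotParser("http://www.glofile.com", "AgentX")

# Assumption: element 1 = Disallowed directories, element 3 = blocked flag.
disallowed <- rules[[1]]
blocked <- rules[[3]]

if (isTRUE(blocked)) {
  message("AgentX is blocked by robots.txt; do not crawl this site.")
} else {
  message("Disallowed paths: ", paste(disallowed, collapse = ", "))
}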
