tm.plugin.webmining (version 1.3)

WebSource: Read web content and the content it links to from feed URLs.

Description

WebSource is derived from Source. In addition to calling the base Source constructor function, it retrieves the specified feedurls and pre-parses their content with the parser function. The fields $Content, $Feedurls, $Parser and $CurlOpts are then added to the Source object.
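
The steps described above can be illustrated with a small base R sketch. This is not the package's implementation: the function name sketch_web_source, the fetch argument and the line_parser helper are hypothetical, and the download step is replaced by a function argument so that the example runs offline. Only the field names ($Content, $Feedurls, $Parser, $CurlOpts) are taken from the description.

```r
# Illustrative stand-in for the described steps; NOT the package's code.
# 'fetch' is a hypothetical replacement for the download step.
sketch_web_source <- function(feedurls, parser, fetch = identity,
                              curlOpts = list(), class = "WebXMLSource") {
  # Step 1: retrieve the raw content of every feed
  raw <- lapply(feedurls, fetch)
  # Step 2: pre-parse each feed into chunks and flatten to one list
  content <- unlist(lapply(raw, parser), recursive = FALSE)
  # Step 3: attach the fields named in the description
  structure(
    list(Content = content, Feedurls = feedurls,
         Parser = parser, CurlOpts = curlOpts),
    class = c(class, "Source")
  )
}

# Offline usage: with fetch = identity the strings act as feed content.
# line_parser (hypothetical) splits the content into one chunk per line.
line_parser <- function(x) as.list(strsplit(x, "\n", fixed = TRUE)[[1]])
src <- sketch_web_source(c("a\nb", "c"), parser = line_parser)
length(src$Content)  # 3 chunks: "a", "b", "c"
```

In the sketch, the parser decides how many content elements a single feed contributes, which is why the number of chunks (3) differs from the number of feeds (2).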

Usage

WebSource(feedurls, class = "WebXMLSource", reader, parser,
  encoding = "UTF-8",
  curlOpts = curlOptions(followlocation = TRUE, maxconnects = 20,
    maxredirs = 10, timeout = 30, connecttimeout = 30),
  postFUN = NULL, retrieveFeedURL = TRUE, ...)

Arguments

feedurls
URLs of the feeds to be retrieved
class
class label to be assigned to Source object, defaults to "WebXMLSource"
reader
function to be used to read content, see also readWeb
parser
function used to split feed content into chunks; returns a list of content elements
encoding
specifies default encoding, defaults to 'UTF-8'
curlOpts
a named list or CURLOptions object identifying the curl options for the handle. Call listCurlOptions() to list all available curl options.
postFUN
function saved in WebSource object and called to retrieve full text content from feed urls
retrieveFeedURL
logical; specifies whether the feedurls should be downloaded first.
...
additional parameters passed to WebSource object/structure
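
As a minimal sketch of what a parser function might look like, the base R example below splits raw feed text into one chunk per item and returns the chunks as a list. The function name split_items, the sample feed string and the choice of a regular expression are assumptions made for illustration; a real parser would more likely use an XML or JSON library.

```r
# Hypothetical parser: takes the raw text of one feed and returns a
# list with one element per <item>...</item> block.
split_items <- function(raw) {
  # (?s) lets '.' match newlines; '.*?' keeps each match to one item
  m <- gregexpr("(?s)<item>.*?</item>", raw, perl = TRUE)
  as.list(regmatches(raw, m)[[1]])
}

# Made-up sample feed content
feed <- paste0(
  "<rss><channel>",
  "<item><title>A</title></item>",
  "<item><title>B</title></item>",
  "</channel></rss>"
)

chunks <- split_items(feed)
length(chunks)  # 2
chunks[[1]]     # "<item><title>A</title></item>"
```

Returning a list, even when nothing matches (an empty list in this sketch), keeps the result consistent with the "list of content elements" contract described for the parser argument.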

Value

An object of class WebSource