WebSource

feedurls

class label to be assigned to <code>Source</code> object, defaults to "WebXMLSource"

class

function to be used to read content, see also <code><a rd-options="" href="/link/readWeb?package=tm.plugin.webmining&version=1.3" data-mini-rdoc="tm.plugin.webmining::readWeb">readWeb</a></code>

reader

function to be used to split feed content into chunks, returns list of content elements

parser
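
As a rough illustration of what a parser of this kind does, the sketch below splits a small RSS string into one chunk per `<item>` element using only base R. The function name `split_items` and the sample feed are hypothetical and not part of the package; a production parser would normally use an XML library rather than regular expressions.

```r
# Hypothetical parser sketch: split raw feed text into a list of <item> chunks.
# Regex-based for illustration only; not robust against nested or malformed XML.
split_items <- function(feed_text) {
  # (?s) lets '.' match newlines; '.*?' keeps each match to a single item
  hits <- gregexpr("(?s)<item>.*?</item>", feed_text, perl = TRUE)
  as.list(regmatches(feed_text, hits)[[1]])
}

# Made-up sample feed with two items
feed <- paste0(
  "<rss><channel>",
  "<item><title>First</title></item>",
  "<item><title>Second</title></item>",
  "</channel></rss>"
)

chunks <- split_items(feed)
length(chunks)
```

The return value is a plain list with one content element per feed entry, which matches the shape described for the parser argument.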

specifies default encoding, defaults to 'UTF-8'

encoding

a named list or CURLOptions object identifying the curl options for the handle. Call <code>listCurlOptions()</code> to list all available curl options.

curlOpts
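
A minimal sketch of the named-list form, assuming a few common curl option names (`followlocation`, `maxredirs`, `timeout`). The values are illustrative only and are not the package defaults.

```r
# Assumed option values for illustration; not the package defaults.
curl_opts <- list(
  followlocation = TRUE,  # follow HTTP redirects
  maxredirs = 10,         # cap the number of redirects followed
  timeout = 30            # give up on a transfer after 30 seconds
)

names(curl_opts)
```

A list like this would be supplied as the `curlOpts` argument.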

function saved in WebSource object and called to retrieve full text content from feed urls

postFUN

logical; specifies whether the feedurls should be downloaded first.

retrieveFeedURL

additional parameters passed to <code>WebSource</code> object/structure


WebSource is derived from <code><a rd-options="tm" href="/link/Source?package=tm.plugin.webmining&version=1.3&to=tm" data-mini-rdoc="tm::Source">Source</a></code>. In addition to calling the
base <code><a rd-options="tm" href="/link/Source?package=tm.plugin.webmining&version=1.3&to=tm" data-mini-rdoc="tm::Source">Source</a></code> constructor function, it also retrieves the specified
feedurls and pre-parses the content with the parser function.
Finally, the fields <code>$Content</code>, <code>$Feedurls</code>, <code>$Parser</code> and <code>$CurlOpts</code> are
added to the <code>Source</code> object.
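
The flow described above (retrieve, pre-parse, attach fields) can be sketched in base R as follows. This is a hypothetical stand-in, not the package's implementation: `sketch_web_source`, `fake_fetch` and `semicolon_parser` are invented names, the download step is replaced by an injected function so the example runs offline, and the result is a plain list rather than a real `Source` object.

```r
# Hypothetical sketch of the constructor flow; NOT the package's implementation.
# `fetch` is an injected stand-in for the download step.
sketch_web_source <- function(feedurls, parser, fetch, curlOpts = list()) {
  raw <- lapply(feedurls, fetch)              # retrieve every feed URL
  content <- do.call(c, lapply(raw, parser))  # pre-parse, then flatten into one list of chunks
  list(Content = content, Feedurls = feedurls,
       Parser = parser, CurlOpts = curlOpts)
}

# Stand-ins: a fake download and a parser that splits on semicolons
fake_fetch <- function(url) "a;b;c"
semicolon_parser <- function(text) as.list(strsplit(text, ";", fixed = TRUE)[[1]])

src <- sketch_web_source(
  c("http://example.org/feed1", "http://example.org/feed2"),
  parser = semicolon_parser,
  fetch = fake_fetch
)
names(src)
```

Keeping the parser and curl options on the returned object mirrors the documented fields, so later steps can reuse them without passing them again.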


Facilitates text retrieval from feed
formats such as XML (RSS, ATOM) and JSON. Direct retrieval from
HTML is also supported. Because most (news) feeds include only a small
fraction of the original text, tm.plugin.webmining also retrieves
and extracts the text from the original source.

WebSource: Read Web Content and respective Link Content from feedurls.
