UrlData-class: UrlData -- unified access to WWW resources
Description
This class provides the infrastructure to scrape the web
with an Extract, Transform, Load (ETL) approach.
Details
The slots template, map.lst, map.fct are used to
map resources to URL addresses. The extract.fct
slot downloads the data, the transform.fct slot
transforms it. Using the scrape mechanism
inherited from Xdata, it is possible to store the data in
a local database. The slot scrape.lst
defines the resources and storage parameters.
In most cases, it is not necessary to subclass
UrlData. The slots can be set through the
urldata function, which allows each step
of the process to be customized.
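
The division of labour between the slots can be illustrated with a minimal base R sketch. It does not call the package: the names template, map_lst, resolve, extract_fct, transform_fct and query_resource are hypothetical stand-ins for the roles of the slots described above, and a local CSV file replaces a web resource so the example runs offline.

```r
# Illustrative only: plain base R functions that mirror the roles of the
# template, map.lst, extract.fct and transform.fct slots. Every name below
# is an assumption made for this sketch, not part of the package.

# A small local CSV file stands in for a web resource.
src_dir <- tempfile("urldata_sketch_")
dir.create(src_dir)
write.csv(
  data.frame(city = c("Berlin", "Paris"), temp_c = c(10, 20)),
  file.path(src_dir, "weather.csv"),
  row.names = FALSE
)

# template + map.lst: turn a resource name into an address.
template <- "%s.csv"
map_lst <- list(Weather = "weather")
resolve <- function(resource) {
  file.path(src_dir, sprintf(template, map_lst[[resource]]))
}

# extract.fct: fetch the raw data from the resolved address.
extract_fct <- function(address) {
  read.csv(address, stringsAsFactors = FALSE)
}

# transform.fct: reshape or enrich the raw data.
transform_fct <- function(df) {
  df$temp_f <- df$temp_c * 9 / 5 + 32
  df
}

# Querying a resource runs the three steps in order.
query_resource <- function(resource) {
  transform_fct(extract_fct(resolve(resource)))
}

result <- query_resource("Weather")
print(result)
```

Keeping each step a separate function is what makes it possible to swap one step (for example, a different parser in the extract step) without touching the others, which is the kind of customization the slots are meant to allow.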