dbApply.MySQLResultSet(rs, INDEX, FUN,
begin, group.begin, new.record, end,
batchSize = 100, maxBatch = 1e5, ...,
simplify = FALSE)
dbExec
).batchSize
.FUN
.dbApply
This function is meant to handle large amounts of data from the DBMS
somewhat gracefully by bringing manageable chunks into R (about
batchSize records at a time, but no more than maxBatch); the idea is
that the data from individual groups can be handled by R, but not all
the groups at the same time.
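
The chunk-by-chunk idea can be illustrated without a database. The sketch below is not the RMySQL implementation: chunkedApply is a hypothetical helper that walks over an in-memory data frame in slices of batchSize rows, as a stand-in for repeated fetches, and calls FUN once all rows of a group have been seen. It assumes the rows are already sorted by the INDEX column, as an "order by" clause would guarantee.

```r
# Hypothetical helper, NOT part of RMySQL. It mimics fetching a sorted
# result set in chunks and applying FUN once each group is complete.
chunkedApply <- function(data, INDEX, FUN, batchSize = 100) {
  out <- list()
  pending <- data[0, , drop = FALSE]   # rows of the group still being read
  pos <- 0
  n <- nrow(data)
  while (pos < n) {
    # "fetch" the next chunk of at most batchSize rows
    idx <- seq(pos + 1, min(pos + batchSize, n))
    chunk <- rbind(pending, data[idx, , drop = FALSE])
    pos <- max(idx)
    keys <- as.character(chunk[[INDEX]])
    last <- keys[length(keys)]
    # every group except the last one in the chunk is known to be complete
    done <- chunk[keys != last, , drop = FALSE]
    doneKeys <- as.character(done[[INDEX]])
    for (g in unique(doneKeys)) {
      out[[g]] <- FUN(done[doneKeys == g, , drop = FALSE], g)
    }
    # the last group may continue in the next chunk, so carry it over
    pending <- chunk[keys == last, , drop = FALSE]
  }
  # the final group is complete once the data are exhausted
  if (nrow(pending) > 0) {
    g <- as.character(pending[[INDEX]][1])
    out[[g]] <- FUN(pending, g)
  }
  out
}

# Made-up data: three agents with 5, 12 and 3 records, sorted by Agent
set.seed(1)
pseudo <- data.frame(Agent = rep(c("a", "b", "c"), times = c(5, 12, 3)),
                     DATA  = rnorm(20))
res <- chunkedApply(pseudo, INDEX = "Agent",
                    FUN = function(x, grp) quantile(x$DATA, names = FALSE),
                    batchSize = 4)
```

Only the rows of the group currently being read are carried from one chunk to the next, so memory use in this sketch is bounded by the largest group plus one batch rather than by the whole result set.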
The implementation allows us to register R functions that get invoked
when certain fetching events occur. These include the "begin" event
(no records have yet been fetched), "begin.group" (the record just
fetched belongs to a new group), "new record" (every fetched record
generates this event), "group.end" (the record just fetched was the
last row of the current group), and "end" (the very last record from
the result set). Awk and Perl programmers will find this paradigm very
familiar (although SAP's ABAP language is closer to what we're doing).

See also: MySQL, dbExec, fetch.

## compute quantiles for each network agent
con <- dbConnect(MySQL(), group = "vitalAnalysis")
rs <- dbExec(con,
             "select Agent, ip_addr, DATA from pseudo_data order by Agent")
out <- dbApply(rs, INDEX = "Agent",
               FUN = function(x, grp) quantile(x$DATA, names = FALSE))
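
The fetching events described earlier can also be traced without a database connection. The following is a rough sketch under stated assumptions, not RMySQL code: applyWithEvents, the toy data frame, and the arguments passed to each callback are all hypothetical, chosen only to show the order in which the events would fire on data sorted by the grouping column.

```r
# Hypothetical sketch, NOT the RMySQL implementation. The callback
# signatures used here are assumptions made for illustration.
applyWithEvents <- function(data, INDEX, FUN, begin = NULL,
                            group.begin = NULL, new.record = NULL,
                            end = NULL) {
  if (!is.null(begin)) begin()                 # no records fetched yet
  out <- list()
  keys <- as.character(data[[INDEX]])
  n <- nrow(data)
  start <- 1
  for (i in seq_len(n)) {
    if (i == 1 || keys[i] != keys[i - 1]) {    # first row of a new group
      start <- i
      if (!is.null(group.begin)) group.begin(keys[i])
    }
    if (!is.null(new.record)) new.record(data[i, , drop = FALSE])
    if (i == n || keys[i] != keys[i + 1]) {    # last row of the current group
      out[[keys[i]]] <- FUN(data[start:i, , drop = FALSE], keys[i])
    }
  }
  if (!is.null(end)) end()                     # result set exhausted
  out
}

# Record the order of events in an environment so callbacks can append to it
rec <- new.env()
rec$events <- character(0)
note <- function(label) rec$events <- c(rec$events, label)

toy <- data.frame(Agent = c("a", "a", "b"), DATA = c(1, 3, 10))
res <- applyWithEvents(toy, "Agent",
  FUN = function(x, grp) { note(paste("group.end", grp)); mean(x$DATA) },
  begin = function() note("begin"),
  group.begin = function(grp) note(paste("group.begin", grp)),
  new.record = function(r) note("new.record"),
  end = function() note("end"))

print(rec$events)
```

In this sketch FUN plays the role of the group-end handler, which matches the usage above, where no separate group-end argument appears in the signature.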