mysqlDBApply(res, INDEX, FUN = stop("must specify FUN"),
             begin = NULL,
             group.begin = NULL,
             new.record = NULL,
             end = NULL,
             batchSize = 100, maxBatch = 1e6,
             ..., simplify = TRUE)
Arguments:

  res          a MySQL result set, as produced by dbSendQuery.
  INDEX        the name (or position) of the field whose values define the groups.
  FUN          a function applied to each group; it receives the group's records as a
               data.frame and the group label, plus any additional arguments passed
               through "...".
  begin, group.begin, new.record, end
               optional callback functions invoked at the corresponding fetching
               events (see below).
  batchSize    the number of records to fetch per batch.
  maxBatch     the maximum number of records per batch.
This function is meant to handle, somewhat gracefully, large amounts of data from the DBMS by bringing them into R in manageable chunks (about batchSize records at a time, but never more than maxBatch); the idea is that the data from individual groups can be handled by R, but not all the groups at the same time.
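The following is only a sketch of the chunked-fetching idea, not the package's actual implementation: rows are pulled roughly batchSize records at a time with fetch(), and each group is handed to a user function once its last row has arrived. The connection group "vitalAnalysis" and the pseudo_data table are the same hypothetical names used in the example at the end of this page.

library(RMySQL)   # dbConnect(), dbSendQuery(), fetch(), dbHasCompleted() come from RMySQL/DBI

con <- dbConnect(MySQL(), group = "vitalAnalysis")          # hypothetical DSN group
res <- dbSendQuery(con,
    "select Agent, DATA from pseudo_data order by Agent")   # rows must be ordered by the group field

FUN <- function(x, grp) quantile(x$DATA, names = FALSE)     # per-group summary

batchSize <- 100
pending <- NULL    # rows of the group(s) still being assembled
out <- list()

while (!dbHasCompleted(res)) {
  chunk <- fetch(res, n = batchSize)         # at most batchSize records per round trip
  if (nrow(chunk) == 0) break
  pending <- rbind(pending, chunk)
  groups <- unique(pending$Agent)
  ## because the result set is ordered by Agent, every group in the buffer
  ## except the last one is complete and can be processed and released
  for (g in head(groups, -1)) {
    rows <- pending[pending$Agent == g, , drop = FALSE]
    out[[as.character(g)]] <- FUN(rows, g)
    pending <- pending[pending$Agent != g, , drop = FALSE]
  }
}
if (!is.null(pending) && nrow(pending) > 0) {  # flush the final group
  out[[as.character(pending$Agent[1])]] <- FUN(pending, pending$Agent[1])
}
dbClearResult(res)
dbDisconnect(con)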
The MySQL implementation, mysqlDBApply, allows us to register R functions that get invoked when certain fetching events occur. These are the "begin" event (no records have been fetched yet), "group.begin" (the record just fetched belongs to a new group), "new.record" (every fetched record generates this event), "group.end" (the record just fetched was the last row of the current group), and "end" (the very last record of the result set). Awk and Perl programmers will find this paradigm very familiar (although SAP's ABAP language is closer to what we are doing).
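As a sketch of this event-callback interface, the call below registers a small function for each event. The callback signatures are assumptions (begin and end taking no arguments, group.begin receiving the group label, new.record receiving the freshly fetched record), as are the connection group and table names.

library(RMySQL)

con <- dbConnect(MySQL(), group = "vitalAnalysis")          # hypothetical DSN group
res <- dbSendQuery(con,
    "select Agent, DATA from pseudo_data order by Agent")   # hypothetical table

out <- mysqlDBApply(res, INDEX = "Agent",
    FUN         = function(x, grp) quantile(x$DATA, names = FALSE),
    begin       = function() message("starting to fetch"),    # assumed: no arguments
    group.begin = function(grp) message("new group: ", grp),  # assumed: gets the group label
    new.record  = function(rec) invisible(NULL),               # assumed: gets the fetched record
    end         = function() message("finished"),              # assumed: no arguments
    batchSize   = 500)

dbClearResult(res)
dbDisconnect(con)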
See also: MySQL, dbSendQuery, fetch.

## compute quantiles for each network agent
con <- dbConnect(MySQL(), group = "vitalAnalysis")
res <- dbSendQuery(con,
    "select Agent, ip_addr, DATA from pseudo_data order by Agent")
out <- dbApply(res, INDEX = "Agent",
    FUN = function(x, grp) quantile(x$DATA, names = FALSE))
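The example leaves the result set and the connection open; assuming the usual DBI cleanup calls, a complete session would end with:

dbClearResult(res)   # release the result set
dbDisconnect(con)    # close the MySQL connection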