RCC: Functions to talk to an Rserve instance (new version)

Description

Rserve is a server providing R functionality via sockets. The following functions allow another R session to start new Rserve sessions and evaluate commands.

Usage

RS.connect(host = NULL, port = 6311L, tls = FALSE, verify = TRUE,
           proxy.target = NULL, proxy.wait = TRUE, chain, key, ca)
RS.login(rsc, user, password, pubkey, authkey)
RS.eval(rsc, x, wait = TRUE, lazy = TRUE)
RS.eval.qap(rsc, x, wait = TRUE)
RS.collect(rsc, timeout = Inf, detail = FALSE, qap = FALSE)
RS.close(rsc)
RS.assign(rsc, name, value, wait = TRUE)
RS.switch(rsc, protocol = "TLS", verify = TRUE, chain, key, ca)
RS.authkey(rsc, type = "rsa-authkey")
RS.server.eval(rsc, text)
RS.server.source(rsc, filename)
RS.server.shutdown(rsc)
RS.oobCallbacks(rsc, send, msg)

Arguments

host: host to connect to or socket path or NULL for local host
port: TCP port to connect to or 0 if unix socket is to be used
tls: if TRUE then SSL/TLS encrypted connection is started
verify: logical, if FALSE no verification of the server certificate is done, otherwise the certificate is verified and the function will fail with an error if it is not valid.
chain: string, optional, path to a file in PEM format that contains client certificate and its chain. The client certificate must be first in the chain.
key: string, optional, path to a file in PEM format containing the private key for the client certificate. If a client certificate is necessary for the connection, both chain and key must be set.
ca: string, optional, path to a file holding any additional certificate authority (CA) certificates (including intermediate certificates) in PEM format that are required for the verification of the server certificate. Only relevant if verify=TRUE.
proxy.target: proxy target (string) in the form <host>:<port> to be used when connecting to a non-transparent proxy that requires target designation. Not used when connected to transparent proxies or directly to Rserve instances. Note that literal IPv6 addresses must be quoted in [].
proxy.wait: if TRUE then the proxy will wait (indefinitely) if the target is unavailable because its load is too high; if FALSE then the proxy is instructed to close the connection in that case instead
rsc: Rserve connection as obtained from RS.connect
user: username for authentication (mandatory)
password: password for authentication
pubkey: public key for authentication
authkey: authkey (as obtained from RS.authkey) for secure authentication
x: expression to evaluate
wait: if TRUE then the result is delivered synchronously, if FALSE then NULL is returned instead and the result can be collected later with RS.collect
lazy: if TRUE then the passed expression is not evaluated locally but passed for remote evaluation (as if quoted, modulo substitution). Otherwise it is evaluated locally first and the result is passed for remote evaluation.
timeout: numeric, timeout (in seconds) to wait before giving up
detail: if TRUE then the result payload is returned in a list with elements value (unserialized result value of the command, where applicable) and rsc (connection which returned this result), which makes it possible to identify the source of the result and to distinguish a timeout from a NULL value. Otherwise the returned value is just the payload value of the result.
name: string, name of the symbol to assign to
value: value to assign. If missing, name is assumed to be a symbol: its evaluated value is used as the value and the symbol name is used as the name.
protocol: protocol to switch to (string)
type: type of the authentication to perform (string)
send: callback function for OOB_SEND
msg: callback function for OOB_MSG
text: string that will be parsed and evaluated on the server side
filename: name of the file (on the server!) to source
qap: logical, if TRUE then the result is assumed to be in QAP encoding (native Rserve protocol), otherwise it is assumed to be using R serialization.

Author

Simon Urbanek

Parallel use

Rserve connections can currently be used in parallel via mcparallel or mclapply if certain conditions are met. First, only clear (non-TLS) connections are eligible for parallel use, and there may be no OOB commands. Given that, it is legal to use a connection in a forked process as long as the request is sent and the result is collected in the same process and no other process uses the connection in the meantime. However, connections can only be created in the parent session (except if the connection is created and subsequently closed in the child process).

One possible use is to initiate connections to a cluster and perform operations in parallel. For example:

    library(RSclient)
    library(parallel)
    ## try to connect to 50 different nodes
    ## cannot parallelize this - must be in the parent process
    c <- lapply(paste("machine", 1:50, sep=''),
                function(name) try(RS.connect(name), silent=TRUE))
    ## keep only successful connections
    c <- c[sapply(c, class) == "RserveConnection"]
    ## login to all machines in parallel (using RSA secured login)
    unlist(mclapply(c,
           function(c) RS.login(c, "user", "password", authkey=RS.authkey(c)),
           mc.cores=length(c)))
    ## do parallel work ...
    ## pre-load some "job" function to all nodes
    unlist(mclapply(c, function(c) RS.assign(c, job), mc.cores=length(c)))
    ## etc. etc. then call it in parallel on all nodes ...
    mclapply(c, function(c) RS.eval(c, job()), mc.cores=length(c))
    
    ## close all
    sapply(c, RS.close)

Details

RS.connect creates a connection to an Rserve instance. The returned handle is to be used in all subsequent calls to client functions. The session associated with the connection is alive until closed via RS.close.

RS.close closes the Rserve connection.

RS.login performs authentication with the Rserve. The user entry is mandatory and at least one of password, pubkey and authkey must be provided. Typical secure authentication is performed with RS.login(rsc, "username", "password", authkey=RS.authkey(rsc)) which ensures that the authentication request is encrypted and cannot be spoofed. When using TLS connections RS.authkey is not necessary as the connection is already encrypted.

RS.eval evaluates the supplied expression remotely.
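
The effect of the lazy argument can be sketched as follows. The helper names make_sum_call and remote_sum are hypothetical and not part of RSclient; rsc is assumed to be an open connection obtained from RS.connect.

```r
# Hypothetical helper: build the call sum(<data>) locally. The vector is
# embedded in the call object itself, so no remote variable is required.
make_sum_call <- function(x) as.call(list(quote(sum), x))

# Hypothetical helper: with lazy = FALSE, RS.eval evaluates its argument
# locally (producing the call above) and passes the result on for remote
# evaluation.
remote_sum <- function(rsc, x) {
  RSclient::RS.eval(rsc, make_sum_call(x), lazy = FALSE)
}

# By contrast, RSclient::RS.eval(rsc, sum(x)) with the default lazy = TRUE
# sends the unevaluated expression, so x is looked up on the server.
```

The non-lazy form is useful when the data lives only in the client session and should not leave a binding behind on the server.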

RS.eval.qap behaves like RS.eval(..., lazy=FALSE), but uses the Rserve QAP serialization of R objects instead of the native R serialization.
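
Because RS.eval.qap does not quote its argument, the expression has to be quoted by the caller. A minimal sketch, in which the helper name remote_pid is hypothetical and rsc is assumed to be an open connection:

```r
# Hypothetical helper: ask the server process for its PID using QAP encoding.
# quote() keeps Sys.getpid() from being evaluated in the client session.
remote_pid <- function(rsc) {
  RSclient::RS.eval.qap(rsc, quote(Sys.getpid()))
}
```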

RS.collect collects results from RS.eval(..., wait = FALSE) calls. Note that in this case rsc can be either one connection or a list of connections.
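
An asynchronous round trip might look like the sketch below. The helper name eval_async is hypothetical, and the timeout check rests on an assumption (flagged in the code) about what RS.collect returns when no result arrives in time.

```r
# Hypothetical helper: start a remote computation, keep working locally,
# then fetch the result.
eval_async <- function(rsc, timeout = 10) {
  # Returns immediately; with the default lazy = TRUE the expression is
  # evaluated on the server, not in this session.
  RSclient::RS.eval(rsc, mean(rnorm(1e6)), wait = FALSE)

  # ... other local work could happen here ...

  res <- RSclient::RS.collect(rsc, timeout = timeout, detail = TRUE)
  # Assumption: anything other than a list means no result arrived in time.
  if (!is.list(res)) stop("no result within ", timeout, " seconds")
  res$value
}
```

Using detail = TRUE here keeps a legitimate NULL result distinguishable from a missing one.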

RS.assign assigns a value to the remote global workspace.
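
The two calling forms of RS.assign can be sketched as follows. The helper name push_threshold and the variable names are hypothetical; rsc is assumed to be an open connection.

```r
# Hypothetical helper: create two remote bindings and read them back.
push_threshold <- function(rsc) {
  threshold <- 0.05
  # Explicit form: the remote variable is called alpha.
  RSclient::RS.assign(rsc, "alpha", threshold)
  # Symbol form: value is omitted, so the remote variable is called threshold.
  RSclient::RS.assign(rsc, threshold)
  # Evaluated remotely; both names should now resolve on the server.
  RSclient::RS.eval(rsc, c(alpha, threshold))
}
```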

RS.switch attempts to switch the protocol currently used for communication with Rserve. Currently the only supported protocol switch is from plain QAP1 to TLS secured (encrypted) QAP1.

RS.oobCallbacks sets or retrieves the callback functions associated with OOB_SEND and OOB_MSG out-of-band commands. If neither send nor msg is specified then RS.oobCallbacks simply returns the current callback functions, otherwise it replaces the existing ones. Both functions have the form function(code, payload) where code is the OOB sub-code (scalar integer) and payload is the content passed in the OOB command. For OOB_SEND the result of the callback is discarded, for OOB_MSG the result is encoded and sent back to the server. Note that OOB commands in this client are only processed when waiting for the response to another command (typically RS.eval). OOB commands must be explicitly enabled in the server in order to be used (they are disabled by default).
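
A minimal sketch of a callback pair with the function(code, payload) form described above. The names oob_send, oob_msg and install_oob_callbacks are hypothetical, and the reply format is purely illustrative.

```r
# Hypothetical OOB_SEND callback: report the sub-code. Its return value is
# discarded by the client.
oob_send <- function(code, payload) {
  message("OOB_SEND with sub-code ", code)
  invisible(NULL)
}

# Hypothetical OOB_MSG callback: whatever it returns is encoded and sent
# back to the server.
oob_msg <- function(code, payload) {
  paste("ack", code)
}

# Hypothetical helper: install both callbacks and return the previous ones.
install_oob_callbacks <- function(rsc) {
  old <- RSclient::RS.oobCallbacks(rsc)  # neither send nor msg: query only
  RSclient::RS.oobCallbacks(rsc, send = oob_send, msg = oob_msg)
  invisible(old)
}
```

The callbacks only fire while the client is waiting on another command, so a subsequent RS.eval call is what gives them a chance to run.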

RS.server.eval, RS.server.source and RS.server.shutdown are "control commands" which are enqueued to be processed by the server asynchronously. They return TRUE on success, which means the command was enqueued; it does not mean that the server has processed the command. All control commands affect only future connections, they do NOT affect any already established client connection (including the current one). RS.server.eval parses and evaluates the given code in the server instance, RS.server.source sources the given file in the server (the path is interpreted by the server, it is not the local path of the client!) and RS.server.shutdown attempts a clean shutdown of the server. Note that control commands are disabled by default and must be enabled in Rserve either in the configuration file with control enable or on the command line with --RS-enable-control (the latter only works with Rserve 1.7 and higher). If Rserve is configured with authentication enabled then only admin users can issue control commands (see Rserve documentation for details).
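
Issuing a control command could look like the sketch below. The helper names digits_command and set_server_digits are hypothetical, and the example assumes that control commands are enabled on the server and that the current user is allowed to issue them.

```r
# Hypothetical helper: build the code string to be parsed on the server.
digits_command <- function(digits) {
  sprintf("options(digits = %d)", as.integer(digits))
}

# Hypothetical helper: enqueue the option change for future connections.
set_server_digits <- function(rsc, digits = 10L) {
  ok <- RSclient::RS.server.eval(rsc, digits_command(digits))
  # TRUE only means the command was enqueued, not that it has been processed.
  if (!isTRUE(ok)) warning("control command was not enqueued")
  invisible(ok)
}
```

Since already established connections are unaffected, the change would only be visible from a connection opened after the server has processed the command.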

Examples

if (FALSE) {
  c <- RS.connect()
  RS.eval(c, data(stackloss))
  RS.eval(c, library(MASS))
  RS.eval(c, rlm(stack.loss ~ ., stackloss)$coeff)
  RS.eval(c, getwd())
  x <- rnorm(1e5)
  ## this sends the contents of x to the remote side and runs `sum` on
  ## it without actually creating the binding x on the remote side
  RS.eval(c, as.call(list(quote(sum), x)), lazy=FALSE)
  RS.close(c)
}
