Crunch Variables reside on the server, allowing you to work with
datasets that are too big to bring into memory on your machine. Many
functions, such as max(), mean(), and crtabs(), translate your commands
into API queries and return only the result. However, not every operation
you'll want to perform has been implemented on the Crunch servers. If you
need to do something beyond what is currently supported, you can bring a
variable's data into R with as.vector(ds$var) and work with it like any
other R vector.
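As a minimal sketch of that workflow, the helper below pulls one variable into local memory and applies a calculation to it with base R. The helper name local_trimmed_mean, the demo data, and the variable name "age" are all hypothetical, and the trimmed mean is used only as an illustrative stand-in for an operation assumed not to be available server-side. With the crunch package attached, ds would be a dataset returned by loadDataset(); a plain data.frame is used here so the sketch can be tried offline.

```r
# Hypothetical helper, not part of crunch: fetch a variable's values
# and compute a trimmed mean locally.
# `ds` is assumed to be a CrunchDataset (or, for offline use, a data.frame).
local_trimmed_mean <- function(ds, varname, trim = 0.1) {
    # as.vector() is the step that moves the data into local memory
    values <- as.vector(ds[[varname]])
    mean(values, trim = trim, na.rm = TRUE)
}

# Offline stand-in for a dataset, for illustration only
demo <- data.frame(age = c(21, 35, 35, 42, 58, 99))
local_trimmed_mean(demo, "age", trim = 0.2)
```

Because the whole variable is transferred, this approach is best kept for variables that fit comfortably in memory.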
# S4 method for CrunchVariable
as.vector(x, mode = "any")

# S4 method for CrunchExpr
as.vector(x, mode = "any")
Returns an R vector of the type corresponding to the Variable. For example, a CategoricalVariable yields a factor by default, a NumericVariable yields a numeric vector, and so on.
x: a CrunchVariable
mode: for Categorical variables, one of "factor" (the default, which
returns the values as a factor); "numeric" (which returns the numeric
values); or "id" (which returns the category ids). If "id", values
corresponding to missing categories are returned as the underlying integer
codes; i.e., the R representation will not have any NA elements. Otherwise,
missing categories are all returned as NA. For non-Categorical variables,
the mode argument is ignored.
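The three mode values described above can be compared side by side with a small wrapper. This is only a sketch: fetch_all_modes is a hypothetical helper, and var is assumed to be a Crunch CategoricalVariable (for example ds$gender, a made-up variable name) from a dataset loaded in a session where the crunch package is attached and you are logged in.

```r
# Hypothetical helper, not part of crunch: fetch one categorical
# variable in each of the three documented representations.
# Assumes library(crunch) is attached so that as.vector() dispatches
# to the CrunchVariable method.
fetch_all_modes <- function(var) {
    list(
        factor  = as.vector(var, mode = "factor"),   # category names as a factor
        numeric = as.vector(var, mode = "numeric"),  # categories' numeric values
        id      = as.vector(var, mode = "id")        # category ids, no NA
    )
}

# Example call (requires a Crunch session, so it is left commented out):
# modes <- fetch_all_modes(ds$gender)
# str(modes)
```

Comparing the "id" element with the other two is a quick way to see which rows fall into missing categories, since only "id" keeps their integer codes.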
as.vector transfers data from Crunch to a local R session. Note:
as.vector returns the vector in the row order of the dataset. If filters
are set that specify an order that is different from the row order of the
dataset, the results will ignore that order. If you need the vector ordered
in that way, use syntax like as.vector(ds$var)[c(10, 5, 2)] instead.
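The reordering idiom above amounts to fetching first and indexing second. A minimal sketch, assuming a hypothetical helper named fetch_in_order and using a plain numeric vector as an offline stand-in for a Crunch variable:

```r
# Hypothetical helper, not part of crunch: fetch the values in dataset
# row order, then reorder them locally with ordinary R indexing.
fetch_in_order <- function(var, rows) {
    values <- as.vector(var)  # comes back in the dataset's row order
    values[rows]              # apply the desired order on the R side
}

# Offline stand-in for ds$var, for illustration only
demo_var <- seq(100, 1000, by = 100)
fetch_in_order(demo_var, c(10, 5, 2))
```

The index vector is applied after the transfer, so it selects positions in the fetched vector and does not change what is requested from the server.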
See also: as.data.frame for another interface for (lazily) fetching data
from the server as needed; exportDataset() for pulling all of the data
from a dataset.