
collapse (version 1.3.1)

ftransform: Fast Transform and Compute Columns on a Data Frame

Description

ftransform is a much faster update of base::transform for data frames. It returns the data frame with new columns computed and/or existing columns modified or deleted. settransform does all of that by reference, i.e. it modifies the data frame in the global environment. fcompute computes new columns from the columns of a data frame and returns only the computed columns.

Usage

# Modify and return 'data.frame'
ftransform(X, …)
tfm(X, …)            # Shortcut for ftransform

# Modify 'data.frame' by reference
settransform(X, …)
settfm(X, …)         # Shortcut for settransform

# Replace modified columns in a 'data.frame'
ftransform(X) <- value
tfm(X) <- value      # Shortcut for ftransform<-

# Compute and return new 'data.frame' from existing one
fcompute(X, …)

Arguments

X

a data frame or named list of columns.

…

further arguments of the form column = value. The value can be a combination of other columns, a scalar value, or NULL, which deletes the column. Alternatively, a single list can be placed here, which is treated the same as a list of column = value arguments.

value

a named list of replacements. It is treated like an evaluated list of column = value arguments.

Value

The modified data frame X, or, for fcompute, a new data frame with the columns computed on X. All attributes of X are preserved.

Details

The arguments to ftransform are tagged vector expressions, which are evaluated in the data frame X. The tags are matched against names(X), and for those that match, the values replace the corresponding variable in X, whereas the others are appended to X. It is also possible to delete columns by assigning NULL to them, i.e. ftransform(data, colk = NULL) removes colk from the data.
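The matching rules above can be sketched on a small toy data frame. The data frame d and its column names are hypothetical and serve only as illustration; the snippet assumes the collapse package is installed.

```r
library(collapse)

# Hypothetical toy data, not part of the package
d <- data.frame(a = 1:3, b = c(10, 20, 30))

res <- ftransform(d,
                  a  = a * 2,   # tag matches names(d): column 'a' is replaced
                  b1 = b + 1,   # tag does not match: column 'b1' is appended
                  b  = NULL)    # NULL deletes column 'b'

names(res)   # 'a' and 'b1' remain, 'b' is gone
names(d)     # the input data frame itself is left unchanged
```

Because ftransform returns a modified copy, the result has to be assigned to a name if it is needed later.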

Since collapse v1.3.0, it is also possible to pass a single list to …, i.e. ftransform(data, newdata) or ftransform(data, fmean(list(col1mean = col1, col2mean = col2), drop = FALSE)) etc. This list will be treated the same as a list of tagged vector expressions. See Examples.
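A minimal sketch of the list form, again on hypothetical toy data (d, newcols and all column names are made up for illustration, and collapse v1.3.0 or later is assumed):

```r
library(collapse)

# Hypothetical toy data
d <- data.frame(v = c(1, 3, 5), w = c(2, 4, 6))

# A single unnamed list is unpacked into column = value pairs;
# its elements are still evaluated inside d.
res1 <- ftransform(d, list(v2 = v * 2, w2 = w + 1))

# A data frame is also a list: columns whose names match ('v') replace
# the existing ones, the others ('z') are appended.
newcols <- data.frame(v = c(0, 0, 0), z = 7:9)
res2 <- ftransform(d, newcols)

names(res1)
names(res2)
```

The list form is convenient when the new columns come out of another function call as a list or data frame, since they do not have to be spelled out one by one.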

The function settransform does all of that by reference, but uses base R's copy-on-modify semantics, which is equivalent to replacing the data with <-. It is therefore still memory efficient, but the data will have a different memory address after each call of settransform.
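The equivalence with <- described above can be illustrated as follows. This is only a sketch on hypothetical toy data (d1 and d2 are made-up names), run at top level with collapse installed:

```r
library(collapse)

# Two identical hypothetical data frames
d1 <- data.frame(a = 1:3, b = c(10, 20, 30))
d2 <- d1

# settransform updates d1 under its own name; no assignment is written
settransform(d1, total = a + b, b = NULL)

# The explicit form that the text describes as equivalent
d2 <- ftransform(d2, total = a + b, b = NULL)

# Compare the two results
identical(d1, d2)
```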

Finally, fcompute works just like ftransform, but returns only the changed / computed columns without modifying or appending to the data in X.
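The difference between the two functions is easiest to see side by side. The sketch below uses a hypothetical data frame d and made-up column names, and assumes collapse is installed:

```r
library(collapse)

# Hypothetical toy data
d <- data.frame(a = 1:3, b = c(10, 20, 30))

full <- ftransform(d, a2 = a * 2, b = b / 10)  # all columns of d, plus a2
part <- fcompute(d, a2 = a * 2, b = b / 10)    # only a2 and the modified b

names(full)
names(part)
```

fcompute is the natural choice when only the derived columns are wanted, for example to bind them to a different data frame afterwards.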

See Also

with, within, Data Frame Manipulation, Collapse Overview

Examples

# NOT RUN {
## ftransform modifies and returns a data.frame
head(ftransform(airquality, Ozone = -Ozone))
head(ftransform(airquality, new = -Ozone, Temp = (Temp-32)/1.8))
head(ftransform(airquality, new = -Ozone, new2 = 1, Temp = NULL))  # Deleting Temp
head(ftransform(airquality, Ozone = NULL, Temp = NULL))            # Deleting columns

# This computes the median and standard-deviation of Ozone in each month
head(ftransform(airquality,
                   Ozone_Month_median = fmedian(Ozone, Month, TRA = "replace_fill"),
                   Ozone_Month_sd = fsd(Ozone, Month, TRA = "replace_fill")))

# Grouping by month and above/below average temperature in each month
head(ftransform(airquality, Ozone_Month_high_median =
     fmedian(Ozone, list(Month, Temp > fbetween(Temp, Month)), TRA = "replace_fill")))

## Since v1.3.0 one can pass a list of columns, and there is a replacement method
head(ftransform(airquality, STD(airquality, cols = 1:3)))  # Could use magrittr::`%<>%`
ftransform(airquality) <- fscale(get_vars(airquality, 1:3))
rm(airquality)

# This feature also allows flexible grouped operations that create multiple new columns
head(ftransform(airquality,
   fmedian(list(Wind_Month_median = Wind,
                Ozone_Month_median = Ozone), Month, TRA = "replace_fill")))

# This performs 2 different multi-column grouped operations (need c() to make it one list)
head(ftransform(airquality, c(fmedian(list(Wind_Day_median = Wind,
                                      Ozone_Day_median = Ozone), Day, TRA = "replace_fill"),
                         fsd(list(Wind_Month_sd = Wind,
                                  Ozone_Month_sd = Ozone), Month, TRA = "replace_fill"))))

## settransform works like ftransform but modifies a data frame in the global environment
airquality_c <- airquality
settransform(airquality_c, Ratio = Ozone / Temp, Ozone = NULL, Temp = NULL)
settransform(airquality_c, STD(airquality_c, cols = 1:2))    # Passing a list also works
head(airquality_c)
rm(airquality_c)

## fcompute only returns the modified / computed columns
head(fcompute(airquality, Ozone = -Ozone))
head(fcompute(airquality, new = -Ozone, Temp = (Temp-32)/1.8))
head(fcompute(airquality, new = -Ozone, new2 = 1))

# }
