smooth.construct.dt.smooth.spec: Domain Transformation basis constructor

Description

The dt basis allows any of the standard mgcv (or user-defined) bases to be applied to a transformed version of the original terms. Smooths may be of any number of terms. Transformations are specified by supplying a function of any or all of the original terms. "by" variables are not transformed.

Usage

# S3 method for dt.smooth.spec
smooth.construct(object, data, knots)

Value

An object of class "dt.smooth". This will contain all the elements associated with the smooth.construct object from the inner smooth (defined by xt$bs), in addition to an xt element used by the Predict.matrix method.

Arguments

object: a smooth specification object, generated by s(), te(), ti(), or t2(), with bs="dt"
data: a list containing just the data (including any by variable) required by this term, with names corresponding to object$term (and object$by). The by variable is the last element.
knots: a list containing any knots supplied for basis setup - in same order and with same names as data. Can be NULL.

Transformation Functions

Let nterms = length(object$term). The tf element can take one of the following forms:

a function of nargs arguments, where nargs <= nterms. If nterms > 1, it is assumed that this function will be applied to the first term of object$term. If all argument names of the function are term names, then those arguments will correspond to those terms; otherwise, they will correspond to the first nargs terms in object$term.
a character string corresponding to one of the built-in transformations (listed below).
a list of length ntfuncs, where ntfuncs <= nterms, containing either the functions or character strings described above. If this list is named with term names, then the transformation functions will be applied to those terms; otherwise, they will be applied to the first ntfuncs terms in object$term.
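
As an illustration of the three forms, the sketch below builds one candidate value of each kind for the tf element, using base R only. The term names s and t and the object names tf_fun, tf_chr, and tf_list are hypothetical, chosen for this example; the final line reproduces the name-matching check described above, not the constructor's actual code.

```r
# Hypothetical term names for a smooth of two terms.
terms <- c("s", "t")

# Form 1: a single function. Its argument names are all term names,
# so the arguments would be matched to those terms by name.
tf_fun <- function(s, t) s - t

# Form 2: a character string naming a built-in transformation.
tf_chr <- "s-t"

# Form 3: a list of functions and/or character strings. Naming the
# list with term names assigns each transformation to that term.
tf_list <- list(t = "log", s = function(s) sqrt(s))

# The check described in the text: are all argument names term names?
all(names(formals(tf_fun)) %in% terms)
```

Any one of these objects would then be supplied as the tf element of the xt list.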

The following character strings are recognized as built-in transformations:

"log": log transformation (univariate)
"ecdf": empirical cumulative distribution function (univariate)
"linear01": linearly rescale from 0 to 1 (univariate)
"s-t": first term ("s") minus the second term ("t") (bivariate)
"s/t": first term ("s") divided by the second term ("t") (bivariate)
"QTransform": performs a time-specific ecdf transformation for a bivariate smooth, where time is indicated by the first term, and x by the second term. Primarily for use with refund::af.
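
For orientation, the simpler built-in transformations correspond roughly to the base R expressions below, evaluated on a small made-up data set. This is a sketch of what the names mean, not the package's internal definitions, which may differ in detail; the vectors x, s, and t are hypothetical.

```r
# Made-up data for illustration.
x <- c(2, 4, 6, 10)
s <- c(5, 6, 8)
t <- c(1, 2, 4)

x_log  <- log(x)                              # "log"
x_ecdf <- ecdf(x)(x)                          # "ecdf": 0.25 0.50 0.75 1.00
x_lin  <- (x - min(x)) / (max(x) - min(x))    # "linear01": 0.00 0.25 0.50 1.00
st_dif <- s - t                               # "s-t": 4 4 4
st_div <- s / t                               # "s/t": 5 3 2
```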

Some transformations rely on a fixed "pivot point" based on the data used to fit the model, e.g. quantiles (such as the min or max) of this data. When making predictions based on these transformations, the transformation function will need to know what the pivot points are, based on the original (not prediction) data. In order to accomplish this, we allow the user to specify that they want their transformation function to refer to the original data (as opposed to whatever the "current" data is). This is done by appending a zero ("0") to the argument name.

For example, suppose you want to scale the term linearly so that the data used to define the basis ranges from 0 to 1. The wrong way to define this transformation function: function(x) {(x - min(x))/(max(x) - min(x))}. This function will result in incorrect predictions if the range of data for which predictions are being made is not the same as the range of data that was used to define the basis. The proper way to define this function: function(x) {(x - min(x0))/(max(x0) - min(x0))}. By referring to x0 instead of x, you are indicating that you want to use the original data instead of the current data. It may seem strange to refer to a variable that is not one of the arguments, but the "dt" constructor explicitly places these variables in the environment of the transformation function to make them available.
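
The difference between the two definitions can be seen in a small base R sketch. The code below imitates the mechanism by hand, assigning x0 into the function's environment; it is an assumption-laden illustration rather than the constructor's actual code, and the names naive01, rescale01, x_fit, and x_new are hypothetical.

```r
# Hypothetical fitting data and prediction data.
x_fit <- c(10, 20, 30)
x_new <- c(15, 40)

# Wrong: the pivot points are recomputed from whatever data is passed in.
naive01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Right: x0 refers to the original (fitting) data.
rescale01 <- function(x) (x - min(x0)) / (max(x0) - min(x0))

# Imitate the constructor by placing x0 in the function's environment.
env <- new.env(parent = environment(rescale01))
assign("x0", x_fit, envir = env)
environment(rescale01) <- env

naive01(x_new)    # 0 1: rescaled to the range of the new data
rescale01(x_new)  # 0.25 1.50: rescaled relative to the fitting data
```

On the fitting data itself both functions agree; they diverge only when the current data differs from the original data.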

Author

Jonathan Gellar

Details

object should be created with an xt argument. For non-tensor-product smooths, this will be a list with the following elements:

tf (required): a function or character string (or list of functions and/or character strings) defining the coordinate transformations; see the Transformation Functions section for details.
bs (optional): character string indicating the bs for the basis applied to the transformed coordinates; if empty, the appropriate defaults are used.
basistype (optional): character string indicating the type of bivariate basis used. Options include "s" (the default), "te", "ti", and "t2", which correspond to s(), te(), ti(), and t2().
... (optional): for tensor product smooths, additional arguments to the function specified by basistype that are not available in s() can be included here, e.g. d, np, etc.

For tensor product smooths, we recommend using s() to set up the basis, and specifying the tensor product using xt$basistype as described above. If the basis is set up using te(), then the variables in object$term will be split up, meaning all transformation functions would have to be univariate.
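
A minimal sketch of an xt list following this recommendation is given below. The choice of "s-t" as the transformation and "ps" as the inner basis is an assumption made for illustration, as are the object name xt_tensor and the term names s and t in the comment.

```r
# xt list for a tensor product basis on transformed coordinates.
xt_tensor <- list(
  tf = "s-t",        # built-in bivariate transformation
  bs = "ps",         # basis applied to the transformed coordinates (assumed)
  basistype = "te"   # build the inner basis as a tensor product
)

# With mgcv loaded, this would be passed through s() rather than te(), e.g.
#   s(s, t, bs = "dt", xt = xt_tensor)
# so that both terms remain available to the transformation function.
```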