The dt basis allows any of the standard mgcv (or user-defined) bases to be applied to a transformed version of the original terms. Smooths may be of any number of terms. Transformations are specified by supplying a function of any or all of the original terms. "by" variables are not transformed.
# S3 method for dt.smooth.spec
smooth.construct(object, data, knots)
An object of class "dt.smooth". This will contain all the elements associated with the smooth.construct object from the inner smooth (defined by xt$bs), in addition to an xt element used by the Predict.matrix method.
object: a smooth specification object, generated by s(), te(), ti(), or t2(), with bs="dt"
data: a list containing just the data (including any by variable) required by this term, with names corresponding to object$term (and object$by). The by variable is the last element.
knots: a list containing any knots supplied for basis setup, in the same order and with the same names as data. Can be NULL.
Let nterms = length(object$term). The tf element can take one of the following forms:
a function of nargs arguments, where nargs <= nterms. If nterms > 1, it is assumed that this function will be applied to the first term of object$term. If all argument names of the function are term names, then those arguments will correspond to those terms; otherwise, they will correspond to the first nargs terms in object$term.
a character string corresponding to one of the built-in transformations (listed below).
a list of length ntfuncs, where ntfuncs <= nterms, containing either the functions or character strings described above. If this list is named with term names, then the transformation functions will be applied to those terms; otherwise, they will be applied to the first ntfuncs terms in object$term.
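The three forms of tf can be sketched as plain R lists. This is an illustration only: the term names x1 and x2 and the particular transformations are assumptions chosen for the example, not taken from the package.

```r
# Hypothetical term names x1 and x2; these are not from the package.

# Form 1: a single function. Its argument name matches a term name,
# so it would be matched to x1.
xt_fun <- list(tf = function(x1) sqrt(x1))

# Form 2: a character string naming a built-in transformation.
xt_chr <- list(tf = "log")

# Form 3: a list of functions and/or strings. Naming the list elements
# with term names ties each transformation to that term.
xt_lst <- list(tf = list(x1 = "log", x2 = function(x2) x2^2))

# Inspect which terms the named list targets.
names(xt_lst$tf)
```

A list such as xt_lst would then be passed as the xt argument of the smooth specification, for example s(x1, x2, bs = "dt", xt = xt_lst).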
The following character strings are recognized as built-in transformations:
"log": log transformation (univariate)
"ecdf": empirical cumulative distribution function (univariate)
"linear01": linearly rescale from 0 to 1 (univariate)
"s-t": first term ("s") minus the second term ("t") (bivariate)
"s/t": first term ("s") divided by the second term ("t") (bivariate)
"QTransform": performs a time-specific ecdf transformation for a bivariate smooth, where time is indicated by the first term, and x by the second term. Primarily for use with refund::af.
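As a rough guide to what the simpler built-in transformations compute, the following base R sketch reproduces them on toy vectors. The expressions are assumptions inferred from the descriptions above, not the package's internal code, and the variable names are hypothetical.

```r
# Illustrative base R equivalents of some built-in transformations.
# These follow the verbal descriptions only; they are not the
# package's own implementation.
x <- c(2, 4, 6, 10)

log_x      <- log(x)                           # "log"
ecdf_x     <- ecdf(x)(x)                       # "ecdf": proportion of values <= each x
linear01_x <- (x - min(x)) / (max(x) - min(x)) # "linear01"

# Bivariate examples with hypothetical first and second terms.
s_val <- c(1, 2, 3)
t_val <- c(0.5, 1, 4)
s_minus_t <- s_val - t_val                     # "s-t"
s_over_t  <- s_val / t_val                     # "s/t"
```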
Some transformations rely on a fixed "pivot point" based on the data used to fit the model, e.g. quantiles (such as the min or max) of this data. When making predictions based on these transformations, the transformation function will need to know what the pivot points are, based on the original (not prediction) data. In order to accomplish this, we allow the user to specify that they want their transformation function to refer to the original data (as opposed to whatever the "current" data is). This is done by appending a zero ("0") to the argument name.
For example, suppose you want to scale the term linearly so that the data used to define the basis ranges from 0 to 1. The wrong way to define this transformation function:
function(x) {(x - min(x))/(max(x) - min(x))}
This function will result in incorrect predictions if the range of data for which predictions are being made is not the same as the range of data that was used to define the basis. The proper way to define this function:
function(x) {(x - min(x0))/(max(x0) - min(x0))}
By referring to x0 instead of x, you are indicating that you want to use the original data instead of the current data. It may seem strange to refer to a variable that is not one of the arguments, but the "dt" constructor explicitly places these variables in the environment of the transformation function to make them available.
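The difference between the two definitions can be seen in a small base R sketch. The environment handling below only emulates the mechanism described above under the assumption that x0 holds the fitting data; it is not the constructor's actual code, and x_fit and x_new are hypothetical data.

```r
# Data used to fit the model, and new data for prediction (hypothetical).
x_fit <- c(0, 5, 10)
x_new <- c(2, 4)

# Rescaling relative to the current data: the pivot points move with x.
tf_wrong <- function(x) (x - min(x)) / (max(x) - min(x))

# Rescaling relative to the original data via x0.
tf_right <- function(x) (x - min(x0)) / (max(x0) - min(x0))

# Emulate the constructor: make x0 visible in the function's environment.
env <- new.env(parent = environment(tf_right))
assign("x0", x_fit, envir = env)
environment(tf_right) <- env

tf_wrong(x_new)  # 0 1: x_new is stretched to its own range
tf_right(x_new)  # 0.2 0.4: x_new is placed on the scale of x_fit
```

With tf_wrong the prediction points land at the ends of the basis range, whereas tf_right keeps them where they belong relative to the data used to define the basis.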
Jonathan Gellar
object should be created with an xt argument. For non-tensor-product smooths, this will be a list with the following elements:
tf (required): a function or character string (or list of functions and/or character strings) defining the coordinate transformations; see the description of the forms of tf.
bs (optional): character string indicating the bs for the basis applied to the transformed coordinates; if empty, the appropriate defaults are used.
basistype (optional): character string indicating the type of bivariate basis used. Options include "s" (the default), "te", "ti", and "t2", which correspond to s, te, ti, and t2.
... (optional): for tensor product smooths, additional arguments to the function specified by basistype that are not available in s() can be included here, e.g. d, np, etc.
For tensor product smooths, we recommend using s() to set up the basis, and specifying the tensor product using xt$basistype as described above. If the basis is set up using te(), then the variables in object$term will be split up, meaning all transformation functions would have to be univariate.
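A minimal sketch of this recommendation follows. The term names s_arg and t_arg and the choice of "ps" as the inner basis are assumptions made for the example; only the xt element names (tf, bs, basistype) come from the description above.

```r
# Hypothetical xt list for a bivariate smooth of terms s_arg and t_arg.
xt_biv <- list(
  tf        = "s-t",  # built-in bivariate transformation
  bs        = "ps",   # inner basis (assumed choice for this example)
  basistype = "te"    # request a te() basis on the transformed terms
)

# If mgcv is installed, s() stores the xt list in the specification
# object; the smooth itself is only constructed later, during fitting.
if (requireNamespace("mgcv", quietly = TRUE)) {
  spec <- mgcv::s(s_arg, t_arg, bs = "dt", xt = xt_biv)
  print(class(spec))
  print(spec$xt$basistype)
}
```

Because both terms are passed to a single s() call, a bivariate transformation such as "s-t" still sees both variables, which would not be the case if the terms were split by te().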