s(x,z,bs="ds",m=c(1,.5)
in a gam
model formula.
In the notation of Duchon (1977) m is given by m[1]
(default value 2), while s is given by m[2]
(default value 0).Duchon's (1977) construction generalizes the usual thin plate spline penalty as follows. The usual TPS penalty is given by the integral of the squared Euclidian norm of a vector of mixed partial derivatives of the function w.r.t. its arguments. Duchon re-expresses this penalty in the Fourier domain, and then weights the squared norm in the integral by the Euclidean norm of the fourier frequencies, raised to the power 2s. s is a user selected constant taking integer values divided by 2. If d is the numberof arguments of the smooth, then it is required that -d/2 < s < d/2. To obtain continuous functions we further require that m + s > d/2. If s=0 then the usual thin plate spline is recovered.
The construction is amenable to exactly the low rank approximation method given in Wood (2003) to thin plate splines, with similar
optimality properties, so this approach to low rank smoothing is used here. For large datasets the same subsampling approach as is used in the
tprs
case is employed here to reduce computational costs.
These smoothers allow the use of lower orders of derivative in the penalty than conventional thin plate splines, while still yielding continuous functions.
## S3 method for class 'ds.smooth.spec':
smooth.construct(object, data, knots)
## S3 method for class 'duchon.spline':
Predict.matrix(object, data)
s(...,bs="ds",...)
.by
variable) required by this term,
with names corresponding to object$term
(and object$by
). The by
variable
is the last element.data
.
Can be NULL
"duchon.spline"
. In addition to the usual elements of a
smooth class documented under smooth.construct
, this object will contain:k=M+k.def
where M
is the null space dimension
(dimension of unpenalized function space) and k.def
is 10 for dimension 1, 30 for dimension 2 and 100 for higher dimensions.
This is essentially arbitrary, and should be checked, but as with all penalized regression smoothers, results are statistically
insensitive to the exact choise, provided it is not so small that it forces oversmoothing (the smoother's
degrees of freedom are controlled primarily by its smoothing parameter).The constructor is not normally called directly, but is rather used internally by gam
.
To use for basis setup it is recommended to use smooth.construct2
.
For these classes the specification object
will contain
information on how to handle large datasets in their xt
field. The default is to randomly
subsample 2000 `knots' from which to produce a reduced rank eigen approximation to the full basis,
if the number of unique predictor variable combinations in excess of 2000. The default can be
modified via the xt
argument to s
. This is supplied as a
list with elements max.knots
and seed
containing a number
to use in place of 2000, and the random number seed to use (either can be
missing). Note that the random sampling will not effect the state of R's RNG.
For these bases knots
has two uses. Firstly, as mentioned already, for large datasets
the calculation of the tp
basis can be time-consuming. The user can retain most of the advantages of the
approach by supplying a reduced set of covariate values from which to obtain the basis -
typically the number of covariate values used will be substantially
smaller than the number of data, and substantially larger than the basis dimension, k
. This approach is
the one taken automatically if the number of unique covariate values (combinations) exceeds max.knots
.
The second possibility is to avoid the eigen-decomposition used to find the spline basis altogether and simply use
the basis implied by the chosen knots: this will happen if the number of knots supplied matches the
basis dimension, k
. For a given basis dimension the second option is
faster, but gives poorer results (and the user must be quite careful in choosing knot locations).
Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114
eg <- gamSim(2,n=200,scale=.05)
attach(eg)
op <- par(mfrow=c(2,2),mar=c(4,4,1,1))
b0 <- gam(y~s(x,z,bs="ds",m=c(2,0),k=50),data=data) ## tps
b <- gam(y~s(x,z,bs="ds",m=c(1,.5),k=50),data=data) ## first deriv penalty
b1 <- gam(y~s(x,z,bs="ds",m=c(2,.5),k=50),data=data) ## modified 2nd deriv
persp(truth$x,truth$z,truth$f,theta=30) ## truth
vis.gam(b0,theta=30)
vis.gam(b,theta=30)
vis.gam(b1,theta=30)
detach(eg)
Run the code above in your browser using DataLab