Fits a smoothing spline with the smoothing parameter selected via one of eight methods: GCV, OCV, GACV, ACV, REML, ML, AIC, or BIC.
ss(x, y = NULL, w = NULL, df, spar = NULL, lambda = NULL,
method = c("GCV", "OCV", "GACV", "ACV", "REML", "ML", "AIC", "BIC"),
m = 2L, periodic = FALSE, all.knots = FALSE, nknots = .nknots.smspl,
knots = NULL, keep.data = TRUE, df.offset = 0, penalty = 1,
control.spar = list(), tol = 1e-6 * IQR(x), bernoulli = TRUE,
xmin = NULL, xmax = NULL, homosced = TRUE, iter.max = 1)
An object of class "ss" with components:
x: the distinct x values in increasing order; see Note.
y: the fitted values corresponding to x.
w: the weights used at the unique values of x.
yin: the y values used at the unique x values.
tol: the tol argument (whose default depends on x).
data: only if keep.data = TRUE: itself a list with components x, y, and w (if applicable). These are the original \((x_i, y_i, w_i), i = 1, \ldots, n\), values, where data$x may have repeated values and hence be longer than the above x component; see Details.
lev: leverages, the diagonal values of the smoother matrix.
cv.crit: cross-validation score.
pen.crit: the penalized criterion, a non-negative number; simply the (weighted) residual sum of squares (RSS).
crit: the criterion value minimized in the underlying df2lambda function. When df is provided, the criterion is \([tr(S_{\lambda}) - df]^2\).
df: equivalent degrees of freedom used.
df.residual: the residual degrees of freedom = nobs - df.
spar: the value of spar computed or given, i.e., \(s = 1 + \log_{256}(\lambda)/3\).
lambda: the value of \(\lambda\) corresponding to spar, i.e., \(\lambda = 256^{3(s-1)}\).
fit: list for use by predict.ss, with components:
  n: number of observations.
  knot: the knot sequence.
  nk: number of coefficients (number of knots plus \(m\)).
  coef: coefficients for the spline basis used.
  min, range: numbers giving the corresponding quantities of x.
  m: spline penalty order (same as input m).
  periodic: is the spline periodic?
  cov.sqrt: square root of the covariance matrix of coef, such that tcrossprod(cov.sqrt) reconstructs the covariance matrix.
  weighted: were weights w used in fitting?
  df.offset: same as input.
  penalty: same as input.
  control.spar: control parameters for smoothing parameter selection.
  bernoulli: were Bernoulli polynomials used in fitting?
call: the matched call.
sigma: estimated error standard deviation.
logLik: log-likelihood (if method is "REML" or "ML").
aic: Akaike's Information Criterion (if method is "AIC").
bic: Bayesian Information Criterion (if method is "BIC").
penalty: smoothness penalty \(\alpha' Q \alpha\), which is the integrated squared \(m\)-th derivative of the estimated function \(f(x)\).
method: smoothing parameter selection method. Will be NULL if df, spar, or lambda is provided.
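For illustration, these components can be accessed directly from a fitted object. A minimal sketch, assuming the simulated data from the Examples below (the object name mod is arbitrary):

set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(100, sd = 0.5)
mod <- ss(x, y, nknots = 10)
mod$df                                            # equivalent degrees of freedom
mod$lambda                                        # selected smoothing parameter
mod$sigma                                         # estimated error standard deviation
all.equal(mod$lambda, 256^(3 * (mod$spar - 1)))   # spar/lambda relation holds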
x: Predictor vector of length \(n\). Can also input a list or a two-column matrix specifying x and y.
y: Response vector of length \(n\). If y is missing or NULL, the responses are assumed to be specified by x, with x the index vector.
w: Weights vector of length \(n\). Defaults to all 1.
df: Equivalent degrees of freedom (trace of the smoother matrix). Must be in \([m, nx]\), where \(nx\) is the number of unique x values; see below.
spar: Smoothing parameter. Typically (but not always) in the range \((0,1]\). If specified, lambda = 256^(3*(spar-1)).
lambda: Computational smoothing parameter. This value is weighted by \(n\) to form the penalty coefficient (see Details). Ignored if spar is provided.
method: Method for selecting the smoothing parameter. Ignored if spar or lambda is provided.
m: Penalty order (integer). The penalty functional is the integrated squared \(m\)-th derivative of the function. Defaults to \(m = 2\), which gives a cubic smoothing spline. Set \(m = 1\) for a linear smoothing spline or \(m = 3\) for a quintic smoothing spline (see the example following these arguments).
periodic: Logical. If TRUE, the estimated function \(f(x)\) is constrained to be periodic, i.e., \(f(a) = f(b)\) where \(a = \min(x)\) and \(b = \max(x)\).
all.knots: If TRUE, all distinct points in x are used as knots. If FALSE (default), a sequence of knots is placed at the quantiles of the unique x values; in this case, the input nknots specifies the number of knots in the sequence. Ignored if the knot values are input using the knots argument.
nknots: Positive integer or function specifying the number of knots. Ignored if either all.knots = TRUE or the knot values are input using the knots argument.
knots: Vector of knot values for the spline. Should be unique and within the range of the x values (to avoid a warning).
keep.data: Logical. If TRUE, the original data are retained as part of the output object.
df.offset: Allows the degrees of freedom to be increased by df.offset in the GCV criterion.
penalty: The coefficient of the penalty for degrees of freedom in the GCV criterion.
control.spar: Optional list with named components controlling the root finding when the smoothing parameter spar is computed (i.e., when spar is missing or NULL); see below. Note that spar is only searched for in the interval \([lower, upper]\).
  lower: lower bound for spar; defaults to -1.5.
  upper: upper bound for spar; defaults to 1.5.
  tol: the absolute precision (tolerance) used by optimize; defaults to 1e-8.
tol: Tolerance for same-ness or uniqueness of the x values. The values are binned into bins of size tol, and values which fall into the same bin are regarded as the same. Must be strictly positive (and finite).
bernoulli: If TRUE, scaled Bernoulli polynomials are used for the basis and penalty functions. If FALSE, produces the "classic" definition of a smoothing spline, where the function estimate is a piecewise polynomial function with pieces of degree \(2m - 1\). See polynomial for details.
xmin: Minimum x value used to transform predictor scores to [0,1]. If NULL, xmin = min(x).
xmax: Maximum x value used to transform predictor scores to [0,1]. If NULL, xmax = max(x).
homosced: Are the error variances homoscedastic? If FALSE, variance weights are iteratively estimated from the data.
iter.max: Maximum number of iterations for the variance weight estimation. Ignored if homosced = TRUE.
Nathaniel E. Helwig <helwig@umn.edu>
The smoothing parameter can be selected using one of eight methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Restricted Maximum Likelihood (REML)
Maximum Likelihood (ML)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)
Inspired by the smooth.spline function in R's stats package.
Neither x nor y is allowed to contain missing or infinite values.

The x vector should contain at least \(2m\) distinct values. 'Distinct' here is controlled by tol: values which are regarded as the same are replaced by the first of their values, and the corresponding y and w are pooled accordingly.
Unless lambda has been specified instead of spar, the computational \(\lambda\) used (as a function of spar) is \(\lambda = 256^{3(s - 1)}\), where \(s = \) spar.
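This conversion is easy to verify directly (a minimal sketch):

spar <- 0.5
lambda <- 256^(3 * (spar - 1))    # spar to lambda: 256^(-1.5), about 0.000244
1 + log(lambda, base = 256) / 3   # back-transform recovers spar = 0.5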
If spar and lambda are missing or NULL, the value of df is used to determine the degree of smoothing. If df is missing as well, the specified method is used to determine \(\lambda\).
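For example, the following calls illustrate this precedence. A minimal sketch, assuming x and y as in the Examples (the specific lambda, spar, and df values are arbitrary):

ss(x, y, lambda = 1e-5)   # lambda given: spar, df, and method are ignored
ss(x, y, spar = 0.5)      # spar given: df and method are ignored
ss(x, y, df = 10)         # df determines the amount of smoothing
ss(x, y, method = "BIC")  # otherwise, the method criterion selects lambda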
Letting \(f_i = f(x_i)\), the function is represented as $$f = X \beta + Z \alpha$$ where the basis functions in \(X\) span the null space (i.e., functions with \(m\)-th derivative of zero), and \(Z\) contains the reproducing kernel function of the contrast space evaluated at all combinations of observed data points and knots, i.e., \(Z[i,j] = R(x_i, k_j)\) where \(R\) is the kernel function and \(k_j\) is the \(j\)-th knot. The vectors \(\beta\) and \(\alpha\) contain unknown basis function coefficients.

Letting \(M = (X, Z)\) and \(\gamma = (\beta', \alpha')'\), the penalized least squares problem has the form $$ (y - M \gamma)' W (y - M \gamma) + n \lambda \alpha' Q \alpha $$ where \(W\) is a diagonal matrix containing the weights, and \(Q\) is the penalty matrix. Note that \(Q[i,j] = R(k_i, k_j)\) contains the reproducing kernel function evaluated at all combinations of knots.

The optimal coefficients are the solution to $$ (M' W M + n \lambda P) \gamma = M' W y $$ where \(P\) is the penalty matrix \(Q\) augmented with zeros corresponding to the \(\beta\) in \(\gamma\).
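These normal equations can be solved with standard linear algebra. Below is a minimal sketch (not the package's internal implementation) with hypothetical inputs M, W, y, P, and penalty coefficient n*lambda:

# gamma = (M'WM + n*lambda*P)^{-1} M'Wy, as in the display above
solve_pls <- function(M, W, y, P, n, lambda) {
  A <- crossprod(M, W %*% M) + n * lambda * P   # M'WM + n*lambda*P
  b <- crossprod(M, W %*% y)                    # M'Wy
  solve(A, b)                                   # optimal coefficients gamma = (beta', alpha')'
}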
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/smooth.spline.html
Berry, L. N., & Helwig, N. E. (2021). Cross-validation, information theory, or maximum likelihood? A comparison of tuning methods for penalized splines. Stats, 4(3), 701-724. tools:::Rd_expr_doi("10.3390/stats4030042")
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. tools:::Rd_expr_doi("10.1007/BF01404567")
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. tools:::Rd_expr_doi("10.1007/978-1-4614-5369-7")
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. tools:::Rd_expr_doi("10.4135/9781526421036885885")
Helwig, N. E. (2021). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics, 30(1), 182-191. tools:::Rd_expr_doi("10.1080/10618600.2020.1806855")
Wahba, G. (1985). A comparison of GCV and GML for choosing the smoothing parameters in the generalized spline smoothing problem. The Annals of Statistics, 4, 1378-1402. tools:::Rd_expr_doi("10.1214/aos/1176349743")
Related Modeling Functions:

sm for fitting smooth models with multiple predictors of mixed types (Gaussian response).
gsm for fitting generalized smooth models with multiple predictors of mixed types (non-Gaussian response).

S3 Methods and Related Functions for "ss" Objects:

boot.ss for bootstrapping ss objects.
coef.ss for extracting coefficients from ss objects.
cooks.distance.ss for calculating Cook's distances from ss objects.
cov.ratio for computing covariance ratios from ss objects.
deviance.ss for extracting the deviance from ss objects.
dfbeta.ss for calculating DFBETA from ss objects.
dfbetas.ss for calculating DFBETAS from ss objects.
diagnostic.plots for plotting regression diagnostics from ss objects.
fitted.ss for extracting fitted values from ss objects.
hatvalues.ss for extracting leverages from ss objects.
model.matrix.ss for constructing the model matrix from ss objects.
plot.ss for plotting predictions from ss objects.
plot.boot.ss for plotting boot.ss objects.
predict.ss for predicting from ss objects.
residuals.ss for extracting residuals from ss objects.
rstandard.ss for computing standardized residuals from ss objects.
rstudent.ss for computing studentized residuals from ss objects.
smooth.influence for calculating basic influence information from ss objects.
smooth.influence.measures for convenient display of influential observations from ss objects.
summary.ss for summarizing ss objects.
vcov.ss for extracting the coefficient covariance matrix from ss objects.
weights.ss for extracting prior weights from ss objects.
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)
# GCV selection (default)
ss.GCV <- ss(x, y, nknots = 10)
ss.GCV
# OCV selection
ss.OCV <- ss(x, y, method = "OCV", nknots = 10)
ss.OCV
# GACV selection
ss.GACV <- ss(x, y, method = "GACV", nknots = 10)
ss.GACV
# ACV selection
ss.ACV <- ss(x, y, method = "ACV", nknots = 10)
ss.ACV
# ML selection
ss.ML <- ss(x, y, method = "ML", nknots = 10)
ss.ML
# REML selection
ss.REML <- ss(x, y, method = "REML", nknots = 10)
ss.REML
# AIC selection
ss.AIC <- ss(x, y, method = "AIC", nknots = 10)
ss.AIC
# BIC selection
ss.BIC <- ss(x, y, method = "BIC", nknots = 10)
ss.BIC
# compare results
mean( ( fx - ss.GCV$y )^2 )
mean( ( fx - ss.OCV$y )^2 )
mean( ( fx - ss.GACV$y )^2 )
mean( ( fx - ss.ACV$y )^2 )
mean( ( fx - ss.ML$y )^2 )
mean( ( fx - ss.REML$y )^2 )
mean( ( fx - ss.AIC$y )^2 )
mean( ( fx - ss.BIC$y )^2 )
# plot results
plot(x, y)
rlist <- list(ss.GCV, ss.OCV, ss.GACV, ss.ACV,
              ss.REML, ss.ML, ss.AIC, ss.BIC)
for (j in seq_along(rlist)) {
  lines(rlist[[j]], lwd = 2, col = j)
}
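# a legend distinguishing the eight fits (a minimal sketch using base graphics;
# colors match the loop above)
legend("topleft", legend = c("GCV", "OCV", "GACV", "ACV",
                             "REML", "ML", "AIC", "BIC"),
       col = 1:8, lwd = 2, bty = "n")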