Fits a smoothing spline with the smoothing parameter selected via one of eight methods: GCV, OCV, GACV, ACV, REML, ML, AIC, or BIC.
ss(x, y = NULL, w = NULL, df, spar = NULL, lambda = NULL,
method = c("GCV", "OCV", "GACV", "ACV", "REML", "ML", "AIC", "BIC"),
m = 2L, periodic = FALSE, all.knots = FALSE, nknots = .nknots.smspl,
knots = NULL, keep.data = TRUE, df.offset = 0, penalty = 1,
control.spar = list(), tol = 1e-6 * IQR(x), bernoulli = TRUE,
xmin = NULL, xmax = NULL, homosced = TRUE, iter.max = 1)
An object of class "ss" with components:
x: the distinct x values in increasing order; see Note.
y: the fitted values corresponding to x.
w: the weights used at the unique values of x.
yin: the y values used at the unique x values.
tol: the tol argument (whose default depends on x).
data: only if keep.data = TRUE: itself a list with components x, y, and w (if applicable). These are the original \((x_i, y_i, w_i), i = 1, \ldots, n\), values, where data$x may have repeated values and hence be longer than the above x component; see Details.
lev: leverages, the diagonal values of the smoother matrix.
cv.crit: cross-validation score.
pen.crit: the penalized criterion, a non-negative number; simply the (weighted) residual sum of squares (RSS).
crit: the criterion value minimized in the underlying df2lambda function. When df is provided, the criterion is \([tr(S_{\lambda}) - df]^2\).
df: equivalent degrees of freedom used.
df.residual: the residual degrees of freedom = nobs - df.
spar: the value of spar computed or given, i.e., \(s = 1 + \log_{256}(\lambda)/3\).
lambda: the value of \(\lambda\) corresponding to spar, i.e., \(\lambda = 256^{3(s-1)}\).
fit: list for use by predict.ss, with components:
  n: number of observations.
  knot: the knot sequence.
  nk: number of coefficients (number of knots plus \(m\)).
  coef: coefficients for the spline basis used.
  min, range: numbers giving the corresponding quantities of x.
  m: spline penalty order (same as input m).
  periodic: is the spline periodic?
  cov.sqrt: square root of the covariance matrix of coef, such that tcrossprod(cov.sqrt) reconstructs the covariance matrix.
  weighted: were weights w used in fitting?
  df.offset: same as input.
  penalty: same as input.
  control.spar: control parameters for smoothing parameter selection.
  bernoulli: were Bernoulli polynomials used in fitting?
call: the matched call.
sigma: estimated error standard deviation.
logLik: log-likelihood (if method is "REML" or "ML").
aic: Akaike's Information Criterion (if method is "AIC").
bic: Bayesian Information Criterion (if method is "BIC").
penalty: smoothness penalty \(\alpha' Q \alpha\), which is the integrated squared \(m\)-th derivative of the estimated function \(f(x)\).
method: smoothing parameter selection method. Will be NULL if df, spar, or lambda is provided.
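For illustration, these components can be accessed directly from a fitted object. A minimal sketch, assuming the simulated data from the Examples below (the object name mod is arbitrary):

set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(100, sd = 0.5)
mod <- ss(x, y, nknots = 10)
mod$df                                            # equivalent degrees of freedom
mod$lambda                                        # selected smoothing parameter
mod$sigma                                         # estimated error standard deviation
all.equal(mod$lambda, 256^(3 * (mod$spar - 1)))   # spar/lambda relation holds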
x: Predictor vector of length \(n\). Can also input a list or a two-column matrix specifying x and y.
y: Response vector of length \(n\). If y is missing or NULL, the responses are assumed to be specified by x, with x the index vector.
w: Weights vector of length \(n\). Defaults to all 1.
df: Equivalent degrees of freedom (trace of the smoother matrix). Must be in \([m, nx]\), where \(nx\) is the number of unique x values; see below.
spar: Smoothing parameter. Typically (but not always) in the range \((0,1]\). If specified, lambda = 256^(3*(spar-1)).
lambda: Computational smoothing parameter. This value is weighted by \(n\) to form the penalty coefficient (see Details). Ignored if spar is provided.
method: Method for selecting the smoothing parameter. Ignored if spar or lambda is provided.
m: Penalty order (integer). The penalty functional is the integrated squared \(m\)-th derivative of the function. Defaults to \(m = 2\), which gives a cubic smoothing spline. Set \(m = 1\) for a linear smoothing spline or \(m = 3\) for a quintic smoothing spline (see the example following these arguments).
periodic: Logical. If TRUE, the estimated function \(f(x)\) is constrained to be periodic, i.e., \(f(a) = f(b)\) where \(a = \min(x)\) and \(b = \max(x)\).
all.knots: If TRUE, all distinct points in x are used as knots. If FALSE (default), a sequence of knots is placed at the quantiles of the unique x values; in this case, the input nknots specifies the number of knots in the sequence. Ignored if the knot values are input using the knots argument.
nknots: Positive integer or function specifying the number of knots. Ignored if either all.knots = TRUE or the knot values are input using the knots argument.
knots: Vector of knot values for the spline. Should be unique and within the range of the x values (to avoid a warning).
keep.data: Logical. If TRUE, the original data are retained as part of the output object.
df.offset: Allows the degrees of freedom to be increased by df.offset in the GCV criterion.
penalty: The coefficient of the penalty for degrees of freedom in the GCV criterion.
control.spar: Optional list with named components controlling the root finding when the smoothing parameter spar is computed (i.e., when spar is missing or NULL); see below. Note that spar is only searched for in the interval \([lower, upper]\).
  lower: lower bound for spar; defaults to -1.5.
  upper: upper bound for spar; defaults to 1.5.
  tol: the absolute precision (tolerance) used by optimize; defaults to 1e-8.
tol: Tolerance for same-ness or uniqueness of the x values. The values are binned into bins of size tol, and values which fall into the same bin are regarded as the same. Must be strictly positive (and finite).
bernoulli: If TRUE, scaled Bernoulli polynomials are used for the basis and penalty functions. If FALSE, produces the "classic" definition of a smoothing spline, where the function estimate is a piecewise polynomial function with pieces of degree \(2m - 1\). See polynomial for details.
xmin: Minimum x value used to transform predictor scores to [0,1]. If NULL, xmin = min(x).
xmax: Maximum x value used to transform predictor scores to [0,1]. If NULL, xmax = max(x).
homosced: Are the error variances homoscedastic? If FALSE, variance weights are iteratively estimated from the data.
iter.max: Maximum number of iterations for the variance weight estimation. Ignored if homosced = TRUE.
Nathaniel E. Helwig <helwig@umn.edu>
The smoothing parameter can be selected using one of eight methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Restricted Maximum Likelihood (REML)
Maximum Likelihood (ML)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)
Inspired by the smooth.spline function in R's stats package.
Neither x nor y is allowed to contain missing or infinite values.

The x vector should contain at least \(2m\) distinct values. 'Distinct' here is controlled by tol: values which are regarded as the same are replaced by the first of their values, and the corresponding y and w are pooled accordingly.
Unless lambda has been specified instead of spar, the computational \(\lambda\) used (as a function of spar) is \(\lambda = 256^{3(s - 1)}\), where \(s = \) spar.
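This conversion is easy to verify directly (a minimal sketch):

spar <- 0.5
lambda <- 256^(3 * (spar - 1))    # spar to lambda: 256^(-1.5), about 0.000244
1 + log(lambda, base = 256) / 3   # back-transform recovers spar = 0.5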
If spar and lambda are missing or NULL, the value of df is used to determine the degree of smoothing. If df is missing as well, the specified method is used to determine \(\lambda\).
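For example, the following calls illustrate this precedence. A minimal sketch, assuming x and y as in the Examples (the specific lambda, spar, and df values are arbitrary):

ss(x, y, lambda = 1e-5)   # lambda given: spar, df, and method are ignored
ss(x, y, spar = 0.5)      # spar given: df and method are ignored
ss(x, y, df = 10)         # df determines the amount of smoothing
ss(x, y, method = "BIC")  # otherwise, the method criterion selects lambda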
Letting \(f_i = f(x_i)\), the function is represented as $$f = X \beta + Z \alpha$$ where the basis functions in \(X\) span the null space (i.e., functions with \(m\)-th derivative of zero), and \(Z\) contains the reproducing kernel function of the contrast space evaluated at all combinations of observed data points and knots, i.e., \(Z[i,j] = R(x_i, k_j)\) where \(R\) is the kernel function and \(k_j\) is the \(j\)-th knot. The vectors \(\beta\) and \(\alpha\) contain unknown basis function coefficients.

Letting \(M = (X, Z)\) and \(\gamma = (\beta', \alpha')'\), the penalized least squares problem has the form $$ (y - M \gamma)' W (y - M \gamma) + n \lambda \alpha' Q \alpha $$ where \(W\) is a diagonal matrix containing the weights, and \(Q\) is the penalty matrix. Note that \(Q[i,j] = R(k_i, k_j)\) contains the reproducing kernel function evaluated at all combinations of knots.

The optimal coefficients are the solution to $$ (M' W M + n \lambda P) \gamma = M' W y $$ where \(P\) is the penalty matrix \(Q\) augmented with zeros corresponding to the \(\beta\) in \(\gamma\).
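These normal equations can be solved with standard linear algebra. Below is a minimal sketch (not the package's internal implementation) with hypothetical inputs M, W, y, P, and penalty coefficient n*lambda:

# gamma = (M'WM + n*lambda*P)^{-1} M'Wy, as in the display above
solve_pls <- function(M, W, y, P, n, lambda) {
  A <- crossprod(M, W %*% M) + n * lambda * P   # M'WM + n*lambda*P
  b <- crossprod(M, W %*% y)                    # M'Wy
  solve(A, b)                                   # optimal coefficients gamma = (beta', alpha')'
}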
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/smooth.spline.html
Berry, L. N., & Helwig, N. E. (2021). Cross-validation, information theory, or maximum likelihood? A comparison of tuning methods for penalized splines. Stats, 4(3), 701-724. tools:::Rd_expr_doi("10.3390/stats4030042")
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. tools:::Rd_expr_doi("10.1007/BF01404567")
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. tools:::Rd_expr_doi("10.1007/978-1-4614-5369-7")
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. tools:::Rd_expr_doi("10.4135/9781526421036885885")
Helwig, N. E. (2021). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics, 30(1), 182-191. tools:::Rd_expr_doi("10.1080/10618600.2020.1806855")
Wahba, G. (1985). A comparison of GCV and GML for choosing the smoothing parameters in the generalized spline smoothing problem. The Annals of Statistics, 4, 1378-1402. tools:::Rd_expr_doi("10.1214/aos/1176349743")
Related Modeling Functions:

sm for fitting smooth models with multiple predictors of mixed types (Gaussian response).
gsm for fitting generalized smooth models with multiple predictors of mixed types (non-Gaussian response).

S3 Methods and Related Functions for "ss" Objects:

boot.ss for bootstrapping ss objects.
coef.ss for extracting coefficients from ss objects.
cooks.distance.ss for calculating Cook's distances from ss objects.
cov.ratio for computing covariance ratios from ss objects.
deviance.ss for extracting the deviance from ss objects.
dfbeta.ss for calculating DFBETA from ss objects.
dfbetas.ss for calculating DFBETAS from ss objects.
diagnostic.plots for plotting regression diagnostics from ss objects.
fitted.ss for extracting fitted values from ss objects.
hatvalues.ss for extracting leverages from ss objects.
model.matrix.ss for constructing the model matrix from ss objects.
plot.ss for plotting predictions from ss objects.
plot.boot.ss for plotting boot.ss objects.
predict.ss for predicting from ss objects.
residuals.ss for extracting residuals from ss objects.
rstandard.ss for computing standardized residuals from ss objects.
rstudent.ss for computing studentized residuals from ss objects.
smooth.influence for calculating basic influence information from ss objects.
smooth.influence.measures for convenient display of influential observations from ss objects.
summary.ss for summarizing ss objects.
vcov.ss for extracting the coefficient covariance matrix from ss objects.
weights.ss for extracting prior weights from ss objects.
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)
# GCV selection (default)
ss.GCV <- ss(x, y, nknots = 10)
ss.GCV
# OCV selection
ss.OCV <- ss(x, y, method = "OCV", nknots = 10)
ss.OCV
# GACV selection
ss.GACV <- ss(x, y, method = "GACV", nknots = 10)
ss.GACV
# ACV selection
ss.ACV <- ss(x, y, method = "ACV", nknots = 10)
ss.ACV
# ML selection
ss.ML <- ss(x, y, method = "ML", nknots = 10)
ss.ML
# REML selection
ss.REML <- ss(x, y, method = "REML", nknots = 10)
ss.REML
# AIC selection
ss.AIC <- ss(x, y, method = "AIC", nknots = 10)
ss.AIC
# BIC selection
ss.BIC <- ss(x, y, method = "BIC", nknots = 10)
ss.BIC
# compare results
mean( ( fx - ss.GCV$y )^2 )
mean( ( fx - ss.OCV$y )^2 )
mean( ( fx - ss.GACV$y )^2 )
mean( ( fx - ss.ACV$y )^2 )
mean( ( fx - ss.ML$y )^2 )
mean( ( fx - ss.REML$y )^2 )
mean( ( fx - ss.AIC$y )^2 )
mean( ( fx - ss.BIC$y )^2 )
# plot results
plot(x, y)
rlist <- list(ss.GCV, ss.OCV, ss.GACV, ss.ACV,
              ss.REML, ss.ML, ss.AIC, ss.BIC)
for (j in seq_along(rlist)) {
  lines(rlist[[j]], lwd = 2, col = j)
}
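# a legend distinguishing the eight fits (a minimal sketch using base graphics;
# colors match the loop above)
legend("topleft", legend = c("GCV", "OCV", "GACV", "ACV",
                             "REML", "ML", "AIC", "BIC"),
       col = 1:8, lwd = 2, bty = "n")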