Given a response variable y, a continuous predictor x, and a design matrix Z of parametrically modeled covariates, this function fits a least-squares regression under the model y = f(x) + Zb + e, where f is a smooth function with a user-specified shape. The shape is selected with the argument type: 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.
conspline(y, x, type, zmat = 0, wt = 0, knots = 0, test = FALSE, c = 1.2, nsim = 10000)
y: A continuous response variable.
x: A continuous predictor variable. The length of x must equal the length of y.
type: An integer from 1 to 8 specifying the shape of the regression function in x: 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.
zmat: An optional design matrix of covariates to be modeled parametrically. The number of rows of zmat must equal the length of y.
wt: An optional weight vector; its entries must be positive and its length must equal the length of y.
knots: Optional user-defined knots for the spline function. The range of the knots must contain the range of x.
test: If test=TRUE, a test for the "significance" of x is performed. For the convex and concave shapes, the null hypothesis is that the relationship between y and x is linear; for any of the other shapes, the null hypothesis is that the expected value of y is constant in x.
c: An optional parameter used in the variance estimation; it must be between 1 and 2, inclusive.
nsim: An optional number of simulated data sets used to form the mixing distribution for the test statistic when test=TRUE.
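As an illustration of these arguments, a minimal call might look like the following sketch. This is not part of the original documentation: it uses made-up data and assumes conspline has been loaded from its package (e.g., ConSpline).
set.seed(1)
n <- 100
x <- runif(n)                               # continuous predictor
z <- cbind(rnorm(n), rbinom(n, 1, 0.5))     # two covariates modeled parametrically
y <- 2 + 3 * x^2 + drop(z %*% c(1, -0.5)) + rnorm(n, sd = 0.3)
fit <- conspline(y, x, type = 5,            # shape 5: increasing and convex in x
                 zmat = z,
                 knots = seq(0, 1, length = 8),  # knot range contains the range of x
                 test = TRUE)               # also test the "significance" of x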
muhat: The fitted values at the design points, i.e., an estimate of E(y).
fhat: The estimated regression function describing the relationship between E(y) and x, evaluated at the x-values; see the model description above.
fslope: The slope of fhat, evaluated at the x-values.
knots: The knots used in the spline function estimation.
pvalx: If test=TRUE, the p-value for the test involving the predictor x. For the convex and concave shapes, the null hypothesis is that the relationship between y and x is linear, versus the alternative that it has the assigned shape; for any of the other shapes, the null hypothesis is that the expected value of y is constant in x, versus the assigned shape.
zcoef: The estimated coefficients for the components of the regression function given by the columns of Z. An "intercept" is included if the column space of Z did not contain the constant vector.
sighat: The estimate of the model variance, calculated as SSR/(n - cD), where SSR is the sum of squared residuals of the fit, n is the length of y, D is the observed degrees of freedom of the fit, and c is the value of the argument c (between 1 and 2).
zhmat: The hat matrix corresponding to the columns of Z, used, for example, to compute p-values for contrasts.
sez: The standard errors for the Z coefficient estimates; these are the square roots of the diagonal values of zhmat, multiplied by the square root of sighat.
pvalz: Approximate p-values for the null hypotheses that the coefficients of the covariates represented by the columns of Z are zero.
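As a sketch (not from the original documentation) of how the last few components fit together, the standard errors and approximate t-statistics for the Z coefficients can be reconstructed from zhmat, sighat, and zcoef roughly as follows, assuming a fitted object named ans:
se <- sqrt(diag(ans$zhmat)) * sqrt(ans$sighat)  # standard errors for the Z coefficients
tstat <- ans$zcoef / se                         # approximate t statistics (lengths assumed to
                                                # match; an added "intercept" may change dimensions)
2 * pnorm(-abs(tstat))                          # rough two-sided p-values from a normal
                                                # approximation; pvalz uses approximate t-distributions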
A cone projection is used to fit the least-squares regression model. The test for the significance of x is exact, while the inference for the covariates represented by the Z columns uses statistics that have approximate t-distributions.
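For intuition about the exact test, here is a rough Monte Carlo check (not part of the original documentation): with a convex shape (type=3) the null hypothesis is a linear mean, so p-values computed from data whose true mean really is linear should be approximately uniform, and the rejection rate at the 0.05 level should be near nominal.
set.seed(2)
pvals <- replicate(100, {
  n <- 50
  x <- runif(n)
  y <- 1 + 2 * x + rnorm(n, sd = 0.5)       # truly linear mean, so the null holds
  conspline(y, x, 3, test = TRUE, nsim = 2000)$pvalx
})
mean(pvals < 0.05)                          # should be close to the nominal 0.05 level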
Meyer, M.C. (2008) Inference using shape-restricted regression splines. Annals of Applied Statistics, 2(3), 1013-1033.
n <- 60
x <- 1:n / n                                  # equally spaced predictor in (0, 1]
z <- sample(0:1, n, replace = TRUE)           # binary covariate, modeled parametrically
mu <- rep(4, n)                               # true mean: flat, then increasing and convex in x
mu[x > 1/2] <- 4 + 5 * (x[x > 1/2] - 1/2)^2
mu <- mu + z/4                                # additive covariate effect
y <- mu + rnorm(n)/4
plot(x, y, col = z + 1)
ans <- conspline(y, x, 5, z, test = TRUE)     # shape 5: increasing and convex
points(x, ans$muhat, pch = 20, col = z + 1)   # fitted values
lines(x, ans$fhat)                            # estimated regression function in x
lines(x, ans$fhat + ans$zcoef, col = 2)       # shifted by the estimated z coefficient
ans$pvalz  ## p-value for the test of significance of the z coefficient
ans$pvalx  ## p-value for the shape-restricted test involving x (see pvalx above)
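A few additional components of the fit can be inspected in the same way (a brief sketch, using the component names described above):
ans$knots    ## knots used for the spline fit
ans$sighat   ## estimate of the model variance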