cosso (version 2.1-2)

SSANOVAwt: Compute adaptive weights by fitting a SS-ANOVA model

Description

A preliminary estimate \(\tilde{\eta}\) is first obtained by fitting a smoothing spline ANOVA model. The inverse \(L_2\)-norm, \(||\tilde{\eta}_j||^{-\gamma}\), is then used as the initial weight for the \(j\)-th functional component.

Usage

SSANOVAwt(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),mscale=rep(1,ncol(x)),
               gamma=1,scale=FALSE,nbasis,basis.id,cpus)

Value

wt

The adaptive weights.

Arguments

x

input matrix; the number of rows is sample size, the number of columns is the data dimension. The range of input variables is scaled to [0,1] for continuous variables.

y

response vector. Quantitative for family="Gaussian" or family="Quantile". For family="Binomial", y should be a vector with two levels. For family="Cox", y should be a two-column matrix (or data frame) with columns named 'time' and 'status'.

tau

the quantile to be estimated, a number strictly between 0 and 1. Argument required when family="Quantile".

family

response type. Abbreviations are allowed.

mscale

scale parameter for the Gram matrix associated with each function component. Default is rep(1,ncol(x))

gamma

power of inverse \(L_2\)-norm. Default is 1.

scale

if TRUE, continuous predictors will be rescaled to [0,1] interval. Default is FALSE.

nbasis

number of "knots" to be selected. Ignored when basis.id is provided.

basis.id

index designating selected "knots". Argument is not valid if family="Quantile".

cpus

number of available processor units. Default is 1. If cpus>=2, parallelize task using "parallel" package. Recommended when either sample size or number of covariates is large. Argument is not valid if family="Gaussian" or family="Binomial".
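
The input requirements above can be illustrated with a short base R sketch. The helper `rescale01` is hypothetical and not part of cosso, and treating columns with at most two distinct values as non-continuous is an assumption made only for this example. The sketch shows a min-max rescaling of continuous predictors to [0,1] and a response laid out in the two-column form described for family="Cox".

```r
# Hypothetical helper (not part of cosso): min-max rescale continuous columns.
# Assumption: a column with at most two distinct values is not continuous
# and is returned unchanged.
rescale01 <- function(x) {
  apply(x, 2, function(col) {
    if (length(unique(col)) <= 2) return(col)
    (col - min(col)) / (max(col) - min(col))
  })
}

set.seed(1)
x_raw <- cbind(rbinom(50, 1, 0.5), runif(50, 5, 10))
x_scaled <- rescale01(x_raw)

# Response layout for family="Cox": columns named 'time' and 'status'
y_cox <- data.frame(time = rexp(50), status = rbinom(50, 1, 0.8))
```

Rescaling by hand in this way is an alternative to setting scale=TRUE; whether the two give identical results depends on how the package identifies continuous predictors, which this sketch does not attempt to reproduce.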

Author

Hao Helen Zhang and Chen-Yen Lin

Details

The initial mean function is estimated via a smoothing spline objective function. In the SS-ANOVA model framework, the regression function is assumed to have an additive form $$\eta(x)=b+\sum_{j=1}^p\eta_j(x^{(j)}),$$ where \(b\) denotes the intercept and \(\eta_j\) denotes the main effect of the \(j\)-th covariate.

For "Gaussian" response, the mean regression function is estimated by minimizing the objective function: $$\sum_i(y_i-\eta(x_i))^2/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2,$$ where the first term is the residual sum of squares (RSS) divided by the number of observations, nobs.

For "Binomial" response, the regression function is estimated by minimizing the objective function: $$-\text{log-likelihood}/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.$$

For "Quantile" regression model, the quantile function is estimated by minimizing the objective function: $$\sum_i\rho(y_i-\eta(x_i))/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2,$$ where \(\rho\) is the check loss function at the quantile level tau.
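
As a minimal sketch of the loss term in the quantile objective, assuming \(\rho\) denotes the standard check (pinball) loss of quantile regression, \(\rho_\tau(r)=r(\tau-I(r<0))\): the function name `check_loss` is hypothetical and is not exported by cosso.

```r
# Hypothetical helper (not part of cosso): check loss
# rho_tau(r) = r * (tau - I(r < 0))
check_loss <- function(r, tau) {
  stopifnot(tau > 0, tau < 1)
  r * (tau - (r < 0))
}

# Negative residuals are weighted by (1 - tau), positive ones by tau
loss <- check_loss(c(-2, 0, 2), tau = 0.25)
```

With tau = 0.25, over-predictions (negative residuals) are penalized three times as heavily as under-predictions of the same size, which is what pulls the fit toward the lower quartile.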

For "Cox" regression model, the log-hazard function is estimated by minimizing the objective function: $$-\text{log-partial-likelihood}/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.$$

The smoothing parameter \(\lambda_0\) is tuned by 5-fold cross-validation if family="Gaussian", "Binomial" or "Quantile", and by approximate cross-validation if family="Cox". The smoothing parameters \(\alpha_j\) are not tuned; they are supplied through the argument mscale.
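
The idea of tuning a single smoothing parameter by 5-fold cross-validation can be sketched in base R. This is an illustration only: a ridge-penalized linear fit stands in for the SS-ANOVA fit, so it is not the estimator used by the package, and the function `cv_lambda` and the grid of candidate values are hypothetical.

```r
# Sketch of 5-fold cross-validation for one smoothing parameter.
# A ridge-penalized linear model is used as a stand-in for SS-ANOVA.
cv_lambda <- function(x, y, lambdas, nfolds = 5) {
  n <- nrow(x)
  fold <- sample(rep(seq_len(nfolds), length.out = n))  # random fold labels
  cv_err <- sapply(lambdas, function(lam) {
    mean(sapply(seq_len(nfolds), function(k) {
      train <- fold != k
      xt <- cbind(1, x[train, , drop = FALSE])
      # Minimizes RSS/nobs + lam * ||slopes||^2; the intercept is unpenalized
      pen <- diag(c(0, rep(lam * sum(train), ncol(x))))
      beta <- solve(crossprod(xt) + pen, crossprod(xt, y[train]))
      pred <- cbind(1, x[!train, , drop = FALSE]) %*% beta
      mean((y[!train] - pred)^2)  # held-out mean squared error
    }))
  })
  list(lambda = lambdas[which.min(cv_err)], cv_err = cv_err)
}

set.seed(1)
x_cv <- matrix(runif(200 * 3), ncol = 3)
y_cv <- 2 * x_cv[, 1] + rnorm(200, sd = 0.5)
lambdas <- 10^seq(-4, 1, length.out = 10)
cv_fit <- cv_lambda(x_cv, y_cv, lambdas)
```

The selected value is the candidate with the smallest average held-out error; the \(\alpha_j\) play no part in this search because they are fixed in advance.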

The adaptive weights are then given by \(||\tilde{\eta}_j||^{-\gamma}\).
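
The weight formula can be sketched as follows, assuming the \(L_2\)-norm of each component is approximated by its empirical norm over the sample points. The helper `adaptive_weights` and the matrix `eta_hat` are hypothetical; this is not the package's internal code.

```r
# Hypothetical helper (not part of cosso): adaptive weights from fitted
# component values. `eta_hat` is an n x p matrix whose j-th column holds
# the j-th fitted component evaluated at the n sample points.
adaptive_weights <- function(eta_hat, gamma = 1) {
  # Empirical L2-norm of each component: sqrt(mean(eta_j(x_i)^2))
  l2_norm <- sqrt(colMeans(eta_hat^2))
  # Inverse norm raised to the power gamma
  l2_norm^(-gamma)
}

set.seed(1)
x_toy <- matrix(runif(200 * 3), ncol = 3)
# Toy "fitted" components: strong, moderate and nearly null effects
eta_hat <- cbind(sin(2 * pi * x_toy[, 1]),
                 0.5 * (x_toy[, 2] - 0.5),
                 0.01 * (x_toy[, 3] - 0.5))
wt_toy <- adaptive_weights(eta_hat, gamma = 1)
```

Components with a small preliminary norm receive a large weight and are therefore penalized more heavily in the subsequent adaptive COSSO fit; a component estimated as exactly zero would receive an infinite weight under this formula.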

References

Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679--705.

Examples

## Adaptive COSSO Model
## Binomial
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),ncol=7))
trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2))
y=rbinom(200,1,trueProb)

Binomial.wt=SSANOVAwt(x,y,family="Bin")
ada.B.Obj=cosso(x,y,wt=Binomial.wt,family="Bin")

if (FALSE) {
## Gaussian
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),ncol=7))
y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1)
Gaussian.wt=SSANOVAwt(x,y,family="Gau")
ada.G.Obj=cosso(x,y,wt=Gaussian.wt,family="Gaussian")
}
