lms.yjn: LMS Quantile Regression with a Yeo-Johnson Transformation to Normality

Description

LMS quantile regression with the Yeo-Johnson transformation to normality.

Usage

lms.yjn(percentiles = c(25, 50, 75), zero = NULL, 
        link.lambda = "identity", link.sigma = "loge",
        elambda=list(), esigma=list(),
        dfmu.init=4, dfsigma.init=2,
        init.lambda = 1, init.sigma = NULL, 
        rule = c(10, 5), yoffset = NULL,
        diagW=FALSE, iters.diagW=6)
lms.yjn2(percentiles=c(25,50,75), zero=NULL,
         link.lambda="identity", link.sigma="loge",
         elambda=list(), esigma=list(),
         dfmu.init=4, dfsigma.init=2,
         init.lambda=1.0, init.sigma=NULL,
         yoffset=NULL, nsimEIM=250)

Arguments

percentiles

A numerical vector of values between 0 and 100, giving the percentiles to be computed. They are returned as the "fitted values".

zero

An integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. The values must be from the set {1,2,3}. The default value, NULL, means they all are functions of the covariates.

link.lambda

Parameter link function applied to the first linear/additive predictor. See Links for more choices.

link.sigma

Parameter link function applied to the third linear/additive predictor. See Links for more choices.

elambda, esigma

List. Extra argument for each of the links. See earg in Links for general information.

dfmu.init

Degrees of freedom for the cubic smoothing spline fit applied to get an initial estimate of mu. See vsmooth.spline.

dfsigma.init

Degrees of freedom for the cubic smoothing spline fit applied to get an initial estimate of sigma. See vsmooth.spline. This argument may be assigned NULL, in which case the initial value is obtained by other means.

init.lambda

Initial value for lambda. If necessary, it is recycled to be a vector of length $n$.

init.sigma

Optional initial value for sigma. If necessary, it is recycled to be a vector of length $n$. The default value, NULL, means an initial value is computed in the @initialize slot of the family function.

rule

Number of abscissae used in the Gaussian integration scheme to work out elements of the weight matrices. The values given are the possible choices, with the first value being the default. The larger the value, the more accurate the approximation.

yoffset

A value to be added to the response y, for the purpose of centering the response before fitting the model to the data. The default value, NULL, means -median(y) is used, so that the response actually used has median zero.

diagW

Logical. This argument is offered because the expected information matrix may not be positive-definite. Using only the diagonal elements of this matrix gives a higher chance of it being positive-definite, but convergence will be very slow.

iters.diagW

Integer. Number of iterations in which the diagonal elements of the expected information matrix are used. Only used if diagW = TRUE.

nsimEIM

See CommonVGAMffArguments for more information.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

Warning

The computations are not simple, so convergence may fail. In that case, try different starting values.

The generic function predict, when applied to an lms.yjn fit, does not add back the yoffset value.
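The centering performed through yoffset can be illustrated with plain arithmetic. The following is only a sketch in base R with made-up numbers; it does not call VGAM, and the names y_centered, loc_centered and loc_original are hypothetical. It shows the default shift of -median(y) and how a location value on the centered scale would be moved back to the original scale, assuming the quantity of interest is a location on the scale of the response.

```r
# Toy response values, made up purely for illustration
y <- c(18.2, 21.7, 24.3, 26.9, 33.5)

# Default described for yoffset = NULL: -median(y) is added to the response
yoffset <- -median(y)
y_centered <- y + yoffset
median(y_centered)  # the centered response has median zero

# A location value on the centered scale (hypothetical number) is mapped
# back to the original scale by subtracting yoffset again
loc_centered <- 1.5
loc_original <- loc_centered - yoffset
loc_original
```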

Details

Given a value of the covariate, this function applies a Yeo-Johnson transformation to the response, chosen to bring it as close to normality as possible. The parameters of the transformation are estimated by maximum likelihood or penalized maximum likelihood. The function lms.yjn2() estimates the expected information matrices using simulation (and is consequently slower), while lms.yjn() uses numerical integration. If one function fails to converge, try the other.
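For readers unfamiliar with the transformation itself, here is a minimal base R sketch of the Yeo-Johnson power transformation as defined in Yeo and Johnson (2000). The helper name yj_transform and its eps tolerance are assumptions made for this illustration; the sketch is not the code used inside lms.yjn(), and it takes a single scalar lambda and a response without missing values.

```r
# Yeo-Johnson transformation psi(y, lambda), written out in base R.
# yj_transform is a hypothetical helper name, not a VGAM function.
yj_transform <- function(y, lambda, eps = sqrt(.Machine$double.eps)) {
  stopifnot(is.numeric(y), is.numeric(lambda), length(lambda) == 1)
  out <- numeric(length(y))
  pos <- y >= 0
  # Non-negative responses: Box-Cox-like power of (y + 1),
  # with the log limit when lambda is (numerically) zero
  if (abs(lambda) > eps) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log1p(y[pos])
  }
  # Negative responses: power (2 - lambda) applied to (1 - y),
  # with the log limit when lambda is (numerically) two
  if (abs(lambda - 2) > eps) {
    out[!pos] <- -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  } else {
    out[!pos] <- -log1p(-y[!pos])
  }
  out
}

y <- c(-2, -0.5, 0, 0.5, 2)
yj_transform(y, lambda = 1)  # lambda = 1 leaves the response unchanged
yj_transform(y, lambda = 0)  # log-type transformation for y >= 0
```

With lambda = 1 the transformation is the identity, which is consistent with init.lambda = 1 being a neutral starting value.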

References

Yeo, I.-K. and Johnson, R. A. (2000) A new family of power transformations to improve normality or symmetry. Biometrika, 87, 954--959.

Yee, T. W. (2004) Quantile regression via vector generalized additive models. Statistics in Medicine, 23, 2295--2315.

Yee, T. W. (2002) An Implementation for Regression Quantile Estimation. Pages 3--14. In: Haerdle, W. and Ronz, B., Proceedings in Computational Statistics COMPSTAT 2002. Heidelberg: Physica-Verlag.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

Examples


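library(VGAM)  # provides vgam(), lms.yjn() and the bminz data
data(bminz)
# lambda and sigma are intercept-only (zero=c(1,3)); mu is smooth in age
fit = vgam(BMI ~ s(age, df=4), family=lms.yjn(zero=c(1,3)),
           data=bminz, trace=TRUE)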
predict(fit)[1:3,]
fitted(fit)[1:3,]
bminz[1:3,]
# Person 1 is near the lower quartile of BMI amongst people his age
cdf(fit)[1:3]

# Quantile plot
par(bty="l", mar=c(5,4,4,3)+0.1, xpd=TRUE)
qtplot(fit, percentiles=c(5,50,90,99), main="Quantiles",
       xlim=c(15,90), las=1, ylab="BMI", lwd=2, lcol=4)

# Density plot
ygrid = seq(15, 43, len=100)  # BMI ranges
par(mfrow=c(1,1), lwd=2)
a = deplot(fit, x0=20, y=ygrid, xlab="BMI", col="black",
    main="Density functions at Age = 20 (black), 42 (red) and 55 (blue)")
a
a = deplot(fit, x0=42, y=ygrid, add=TRUE, llty=2, col="red")
a = deplot(fit, x0=55, y=ygrid, add=TRUE, llty=4, col="blue", Attach=TRUE)
a@post$deplot  # Contains density function values
