The quantile sheets function quantSheets() is based on the work of Sabine Schnabel and Paul Eilers (see the references below). The quantile curves are estimated simultaneously, by smoothing in the direction of y as well as x. This reduces, but does not completely eliminate, the problem of crossing quantiles.
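The crossing problem can be checked numerically on any set of fitted quantile curves. The following is a minimal sketch in base R, assuming the curves are stored in a matrix with one row per x value and one column per centile, ordered from lowest to highest. The helper count_crossings() is hypothetical and is not part of the package.

```r
# Hypothetical helper: count the rows in which quantile curves cross.
# `q` is a numeric matrix, one row per x value and one column per centile,
# with columns ordered from the lowest to the highest centile.
count_crossings <- function(q) {
  stopifnot(is.matrix(q), ncol(q) >= 2)
  # Differences between neighbouring centile curves at each x
  gaps <- q[, -1, drop = FALSE] - q[, -ncol(q), drop = FALSE]
  # A row has a crossing if any higher centile falls below a lower one
  sum(rowSums(gaps < 0) > 0)
}

# Toy curves: in the second row the middle centile lies below the lowest one
q <- rbind(c(1, 2.0, 3),
           c(2, 1.5, 4),
           c(3, 3.0, 5))
n_cross <- count_crossings(q)
```

A result of zero means that the curves are ordered consistently at every x value in the matrix.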
quantSheets(y, x, x.lambda = 1, p.lambda = 1, data = NULL,
cent = 100 * pnorm((-4:4) * 2/3),
control = quantSheets.control(...), print = TRUE, ...)
quantSheets.control(x.inter = 10, p.inter = 10, degree = 3, logit = FALSE,
order = 2, kappa = 0, n.cyc = 100, c.crit = 1e-05, plot = TRUE,
power = NULL, ...)
findPower(y, x, data = NULL, lim.trans = c(0, 1.5), prof = FALSE,
k = 2, c.crit = 0.01, step = 0.1)
z.scoresQS(object, y, x, plot = FALSE, tol = NULL)
y: the y variable
x: the x variable
x.lambda: smoothing parameter in the direction of x
p.lambda: smoothing parameter in the direction of y (probabilities)
data: the data frame
cent: the centile values at which the quantile sheet is evaluated
control: for the parameters controlling the algorithm
print: whether to print the sample percentages
x.inter: number of intervals in the x direction for the B-splines
p.inter: number of intervals in the probabilities (y direction) for the B-splines
degree: the degree of the B-splines
logit: whether to use logit(p) instead of p (probabilities) for the y-axis
order: the order of the penalty
kappa: a ridge parameter, set to zero for no ridge effect
n.cyc: number of cycles of the algorithm
c.crit: convergence criterion of the algorithm
plot: whether to plot the resulting quantile sheets
power: the value of the power transformation of the x axis, if needed
lim.trans: the limits within which findPower() looks for the power transformation parameter
prof: whether to use the profile GAIC or optim() to find the parameter of the power transformation
k: the GAIC penalty
step: the step size for the profile GAIC, used if the argument prof of findPower() is TRUE
object: a fitted quantSheets object
tol: how far beyond the range of the y variable to go when estimating the distribution of y using the flexDist() function
...: for further arguments
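The default value of the cent argument can be reproduced in base R. This short sketch evaluates the expression from the usage section and also shows the logit scale that the logit argument refers to. The variable names are illustrative only.

```r
# Default centiles of the `cent` argument, as written in the usage section
z <- (-4:4) * 2 / 3        # nine equally spaced z-scores, from -8/3 to 8/3
cent <- 100 * pnorm(z)     # the corresponding percentages
print(round(cent, 2))

# The same centiles as probabilities on the logit scale (cf. logit = TRUE)
logit_p <- qlogis(cent / 100)
```

The defaults are therefore symmetric about the median and equally spaced on the z-score scale rather than on the percentage scale.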
Using the function quantSheets(), a quantSheets object is returned, which has the following methods: print(), fitted(), predict() and resid().
Using findPower(), a single value of the power parameter is returned.
Using z.scoresQS(), a vector of z-scores is returned.
The advantage of quantile sheets is that they estimate all the quantiles simultaneously. This almost eliminates the problem of crossing quantiles. The method is very fast and useful as an exploratory tool. The function needs two smoothing parameters, x.lambda and p.lambda. They are not estimated automatically and have to be specified by the user. They can be selected by visual inspection.
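To give a feel for what a smoothing parameter does in one direction, here is a minimal P-spline sketch, assuming an equally spaced B-spline basis with a difference penalty in the spirit of x.inter, degree, order and x.lambda. It uses ordinary penalized least squares as a stand-in, not the asymmetric weighted fit over both x and the probabilities that quantile sheets require. The helper bspline_basis() and all variable names are hypothetical.

```r
library(splines)  # ships with R; provides splineDesign()

# Hypothetical helper: equally spaced B-spline basis over a slightly padded
# range of x, with `inter` intervals and the given `degree`
bspline_basis <- function(x, inter = 10, degree = 3) {
  r <- range(x)
  pad <- 0.01 * diff(r)              # keep all x strictly inside the range
  xl <- r[1] - pad
  xr <- r[2] + pad
  dx <- (xr - xl) / inter
  # Extend the knots `degree` steps beyond each end of the range
  knots <- xl + dx * (-degree:(inter + degree))
  splineDesign(knots, x, ord = degree + 1)
}

set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.1)

B <- bspline_basis(x, inter = 10, degree = 3)   # 10 + 3 = 13 basis functions
D <- diff(diag(ncol(B)), differences = 2)       # second-order differences
lambda <- 1                                     # plays the role of x.lambda
P <- lambda * crossprod(D)                      # penalty matrix

# Penalized least squares: a larger lambda gives a smoother fitted curve
a <- solve(crossprod(B) + P, crossprod(B, y))
fit <- B %*% a
```

Refitting with several values of lambda and plotting fit against x is one way to carry out the visual inspection described above.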
The disadvantages of quantile sheets come from the fact that, like all non-parametric techniques, they do not have a goodness-of-fit measure to check how good the model is, and residual-based diagnostics do not exist, since it is difficult to define residuals in this setup.
In this implementation we do provide residuals by using the flexDist() function from package gamlss.dist. This is based on the idea that, knowing the quantiles of the distribution, we can reconstruct the distribution itself non-parametrically, and this is what flexDist() does.
As a word of caution, such a construct is based on several assumptions and depends on several smoothing parameters. Treat those residuals with caution.
The same caution applies to the function z.scoresQS().
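The idea of recovering a distribution from its quantiles can be illustrated with a deliberately crude sketch: interpolate the cumulative distribution function linearly between known quantiles and convert the result to a z-score. This is a simplification for illustration only and is not the smoothing method used by flexDist() or z.scoresQS(). The helper z_from_quantiles() is hypothetical.

```r
# Hypothetical helper: approximate z-scores from a set of known quantiles
# by linear interpolation of the CDF between them
z_from_quantiles <- function(y, q, p) {
  stopifnot(length(q) == length(p), !is.unsorted(q, strictly = TRUE))
  # Interpolated CDF at y; values outside range(q) give NA
  Fy <- approx(x = q, y = p, xout = y)$y
  qnorm(Fy)
}

p <- pnorm((-4:4) * 2 / 3)   # default centiles expressed as probabilities
q <- 100 + 10 * (-4:4)       # toy quantiles of a normal variable, mean 100, sd 15
z <- z_from_quantiles(c(100, 110, 200), q, p)
```

Observations beyond the outermost quantiles get NA in this sketch, which shows why some assumption about the tails is needed (compare the tol argument).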
Schnabel, S. K. (2011) Expectile smoothing: new perspectives on asymmetric least squares. An application to life expectancy, Utrecht University.
Schnabel, S. K. and Eilers, P. H. C. (2013) Simultaneous estimation of quantile curves using quantile sheets, AStA Advances in Statistical Analysis, 97, 1, pp 77-87, Springer.
Schnabel, S. K. and Eilers, P. H. C. (2013) A location-scale model for non-crossing expectile curves, Stat, 2, 1, pp 171-183.
Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V. and De Bastiani, F. (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also http://www.gamlss.org/).
lms: for equivalent parametric results.
# NOT RUN {
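# Assumes the package providing quantSheets() and the abdom data is attached
data(abdom)
# Fit quantile sheets with the default smoothing parameters
m1 <- quantSheets(y, x, data = abdom)
head(fitted(m1))
# Predicted centiles at new x values, added to the current plot
p1 <- predict(m1, newdata = c(20, 30, 40))
matpoints(c(20, 30, 40), p1)
# z-scores for new (y, x) pairs
z.scoresQS(m1, y = c(150, 300), x = c(20, 30))
# Search for a power transformation of x (not appropriate for these data)
findPower(y, x, data = abdom)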
# }