acepack (version 1.6.1)

avas: Additivity and variance stabilization for regression

Description

Estimate transformations of x and y such that the regression of y on x is approximately linear with constant variance.

Usage

avas(...)

# S3 method for default
avas(x, y, wt = NULL, cat = NULL, mon = NULL, lin = NULL, circ = NULL,
     delrsq = 0.01, yspan = 0, control = NULL, ...)

# S3 method for formula
avas(formula, data = NULL, subset = NULL,
     na.action = getOption("na.action"), ...)

# S3 method for avas
summary(object, ...)

# S3 method for avas
print(x, ..., digits = 4)

# S3 method for avas
plot(x, ..., which = 1:(x$p + 1),
     caption = c(list("Response Y AVAS Transformation"),
                 as.list(paste("Carrier", rownames(x$x), "AVAS Transformation"))),
     xlab = "Original", ylab = "Transformed",
     ask = prod(par("mfcol")) < length(which) && dev.interactive())
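
The signatures above can be exercised with a short sketch like the following. It assumes an acepack version that provides the formula method (such as the 1.6.1 release documented here); the data frame `df`, its column names, and the seed are invented for illustration only.

```r
library(acepack)

set.seed(123)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical data: a positive response driven by two carriers
df <- data.frame(x1 = runif(100, -1, 1), x2 = runif(100, -1, 1))
df$y <- exp(df$x1 + abs(df$x2) + 0.1 * rnorm(100))

# Formula method: response on the left, carriers on the right
fit <- avas(y ~ x1 + x2, data = df)

print(fit)    # print method for the fitted object
summary(fit)  # summary method for the fitted object
```

The default method should accept the same data as a matrix and a vector, for example `avas(as.matrix(df[, c("x1", "x2")]), df$y)`.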

Value

A structure with the following components:

x

the input x matrix.

y

the input y vector.

tx

the transformed x values.

ty

the transformed y values.

rsq

the multiple R-squared value for the transformed values.

l

the transformation-type codes for each variable, as set through cat, mon, lin and circ.

m

not used in this version of avas

yspan

span used for smoothing the variance

iters

iteration number and rsq for that iteration

niters

number of iterations used
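
As a rough illustration of how these components might be inspected after a fit, the sketch below reuses the simulated data from the Examples section. The seed and the object name `fit` are arbitrary choices, not part of the package.

```r
library(acepack)

set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
x <- runif(200, 0, 2 * pi)
y <- exp(sin(x) + rnorm(200) / 2)

fit <- avas(x, y)

fit$rsq      # multiple R-squared for the transformed values
fit$niters   # number of iterations used
fit$iters    # iteration number and rsq for each iteration
fit$yspan    # span used for smoothing the variance

# The transformed values line up with the original observations
head(cbind(y = y, ty = fit$ty))
head(cbind(x = x, tx = fit$tx[, 1]))
```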

Arguments

...

additional arguments. They are ignored by the avas call itself and are included for S3 dispatch consistency. In print they are passed on to cat, and when plotting an avas object they are passed on to plot.

x

matrix containing the independent variables.

y

a vector containing the response variable.

wt

an optional vector of weights.

cat

an optional integer vector specifying which variables assume categorical values. Positive values in cat refer to columns of the x matrix and zero to the response variable. Variables must be numeric, so a character variable should first be transformed with as.numeric() and then specified in cat.

mon

an optional integer vector specifying which variables are to be transformed by monotone transformations. Positive values in mon refer to columns of the x matrix and zero to the response variable.

lin

an optional integer vector specifying which variables are to be transformed by linear transformations. Positive values in lin refer to columns of the x matrix and zero to the response variable.

circ

an integer vector specifying which variables assume circular (periodic) values. Positive values in circ refer to columns of the x matrix and zero to the response variable.

delrsq

numeric(1); Termination threshold for iteration. Stops when R-squared changes by less than delrsq in 3 consecutive iterations (default 0.01).

yspan

optional window size parameter for smoothing the variance. Range is [0, 1]. Default is 0 (cross-validated choice); 0.5 is a reasonable alternative to try.

control

named list; control parameters to set. Documented at set_control.

formula

formula; an object of class "formula": a symbolic description of the model to be smoothed.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which avas is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process. Only used when a formula is specified.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

object

an S3 avas object.

digits

rounding digits for summary/print

which

when plotting an avas object, which of the plots to produce.

caption

a list of captions for a plot.

xlab

the x-axis label when plotting.

ylab

the y-axis label when plotting.

ask

when plotting, whether the user should be prompted for input between plots.
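
The arguments described above can be combined as in the following sketch. The simulated data, the seed, and the particular settings (lin = 2, yspan = 0.5, delrsq = 0.001) are illustrative assumptions rather than recommendations.

```r
library(acepack)

set.seed(42)  # arbitrary seed, used only to make the sketch reproducible
n <- 150
X <- cbind(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
Y <- log(4 + X[, "x1"]^2 + X[, "x2"] + 0.1 * rnorm(n))

# Defaults: unrestricted smooth transformations, cross-validated yspan
fit_default <- avas(X, Y)

# Restrict column 2 of X to a linear transformation, fix the variance
# smoothing span at 0.5, and tighten the termination threshold
fit_tuned <- avas(X, Y, lin = 2, yspan = 0.5, delrsq = 0.001)

# Compare the fits through the returned components
c(default = fit_default$rsq, tuned = fit_tuned$rsq)
c(default = fit_default$niters, tuned = fit_tuned$niters)
```

Indices passed to cat, mon, lin and circ follow the same convention: positive values select columns of x, and zero refers to the response.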

References

Rob Tibshirani (1988), "Estimating transformations for regression via additivity and variance stabilization". Journal of the American Statistical Association 83, 394ff.

Examples

TWOPI <- 8*atan(1)
x <- runif(200,0,TWOPI)
y <- exp(sin(x)+rnorm(200)/2)
a <- avas(x,y)
plot(a) # View response and carrier transformations
plot(a$tx,a$ty) # examine the linearity of the fitted model

# From D. Wang and M. Murphy (2005), Identifying nonlinear relationships
# in regression using the ACE algorithm. Journal of Applied Statistics,
# 32, 243-258, adapted for avas.
X1 <- runif(100)*2-1
X2 <- runif(100)*2-1
X3 <- runif(100)*2-1
X4 <- runif(100)*2-1

# Original equation of Y:
Y <- log(4 + sin(3*X1) + abs(X2) + X3^2 + X4 + .1*rnorm(100))

# Transformed version so that Y, after transformation, is a
# linear function of transforms of the X variables:
# exp(Y) = 4 + sin(3*X1) + abs(X2) + X3^2 + X4

a1 <- avas(cbind(X1,X2,X3,X4),Y)

par(mfrow=c(2,1))

# For each variable, plot its estimated transform against the
# original variable and against the function that generated it,
# showing that the transform is recovered.
plot(X1,a1$tx[,1])
plot(sin(3*X1),a1$tx[,1])

plot(X2,a1$tx[,2])
plot(abs(X2),a1$tx[,2])

plot(X3,a1$tx[,3])
plot(X3^2,a1$tx[,3])

# X4 enters the model untransformed, so these two plots are identical
plot(X4,a1$tx[,4])
plot(X4,a1$tx[,4])

plot(Y,a1$ty)
plot(exp(Y),a1$ty)