npreg (version 1.1.0)

varinf: Variance Inflation Factors

Description

Computes variance inflation factors for terms of a smooth model.

Usage

varinf(object, newdata = NULL)

Value

A named vector containing the variance inflation factor for each effect function (in object$terms).

Arguments

object

an object of class "sm" output by the sm function or an object of class "gsm" output by the gsm function.

newdata

the data used to calculate the variance inflation factors. If NULL, the training data are used.

Author

Nathaniel E. Helwig <helwig@umn.edu>

Details

Let \(\kappa_j^2\) denote the variance inflation factor (VIF) for the \(j\)-th model term, and let \(\eta_j\) denote the corresponding effect function.

Values of \(\kappa_j^2\) close to 1 indicate no multicollinearity issues for the \(j\)-th term. Larger values of \(\kappa_j^2\) indicate that \(\eta_j\) has more collinearity with other terms.

Thresholds of \(\kappa_j^2 > 5\) or \(\kappa_j^2 > 10\) are typically recommended for flagging problematic multicollinearity.

To understand these thresholds, note that $$\kappa_j^2 = \frac{1}{1 - R_j^2}$$ where \(R_j^2\) is the R-squared for the linear model predicting \(\eta_j\) from the remaining model terms. Thresholds of 5 and 10 therefore correspond to \(R_j^2 = 0.8\) and \(R_j^2 = 0.9\), respectively.
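The relation between \(\kappa_j^2\) and \(R_j^2\) can be checked numerically. The following is a minimal base-R sketch for an ordinary linear model, not the computation performed by varinf (which works with the fitted effect functions \(\eta_j\)). The helpers vif_to_r2 and linear_vif are hypothetical names introduced here for illustration only, and the simulated data are an assumption.

```r
# Hypothetical helper: invert kappa^2 = 1 / (1 - R^2) to get R^2 from a VIF
vif_to_r2 <- function(vif) 1 - 1 / vif
vif_to_r2(c(5, 10))   # 0.8 and 0.9

# Hypothetical helper: VIF of each column of a numeric data frame, from the
# R-squared of the linear model predicting that column from the other columns
linear_vif <- function(df) {
  sapply(names(df), function(v) {
    others <- setdiff(names(df), v)
    fit <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Simulated predictors (illustrative): x3 is nearly collinear with x1
set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + 0.05 * rnorm(n)

vif <- linear_vif(data.frame(x1, x2, x3))
vif                    # named vector, one VIF per predictor
names(vif)[vif > 5]    # predictors exceeding the threshold of 5
```

The same thresholding idiom, names(v)[v > 5], applies to the named vector returned by varinf.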

References

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. doi:10.1007/978-1-4614-5369-7

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi:10.4135/9781526421036885885

See Also

See summary.sm for more thorough summaries of smooth models.

See summary.gsm for more thorough summaries of generalized smooth models.

Examples

##########   EXAMPLE 1   ##########
### 4 continuous predictors
### no multicollinearity

# load package (provides sm and varinf)
library(npreg)

# generate data
set.seed(1)
n <- 100
fun <- function(x){
  sin(pi*x[,1]) + sin(2*pi*x[,2]) + sin(3*pi*x[,3]) + sin(4*pi*x[,4])
}
data <- as.data.frame(replicate(4, runif(n)))
colnames(data) <- c("x1v", "x2v", "x3v", "x4v")
fx <- fun(data)
y <- fx + rnorm(n)

# fit model
mod <- sm(y ~ x1v + x2v + x3v + x4v, data = data, tprk = FALSE)

# check vif
varinf(mod)


##########   EXAMPLE 2   ##########
### 4 continuous predictors
### multicollinearity

# generate data
set.seed(1)
n <- 100
fun <- function(x){
  sin(pi*x[,1]) + sin(2*pi*x[,2]) + sin(3*pi*x[,3]) + sin(3*pi*x[,4])
}
data <- as.data.frame(replicate(3, runif(n)))
# 4th predictor: copy of the 3rd, except the first value is taken from the 2nd
data <- cbind(data, c(data[1,2], data[2:n,3]))
colnames(data) <- c("x1v", "x2v", "x3v", "x4v")
fx <- fun(data)
y <- fx + rnorm(n)

# check collinearity
cor(data)
cor(sin(3*pi*data[,3]), sin(3*pi*data[,4]))

# fit model
mod <- sm(y ~ x1v + x2v + x3v + x4v, data = data, tprk = FALSE)

# check vif
varinf(mod)