SLM: Simple linear regression model and multicollinearity

Description

The function analyzes the presence of near worrying multicollinearity in the Simple Linear Model (SLM).

Usage

SLM(X, dummy = FALSE)

Value

If dummy=TRUE:

Prop: Proportion of ones in the dummy variable.
CN: Condition Number of X.

If dummy=FALSE:

CV: Coefficient of variation of the second variable in X.
VIF: Variance Inflation Factor.
CN: Condition Number of X.
ki: Stewart's index of X.
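
As a rough illustration of what the CN value measures, the sketch below computes a condition number in base R using the common Belsley-style definition: scale each column of X to unit length, then take the square root of the ratio of the largest to the smallest eigenvalue of X'X. The helper name condition_number and the toy matrices are hypothetical, and the exact scaling used inside the package is an assumption here, so the numbers need not match the output of SLM.

```r
# Hypothetical helper, not part of the package: Belsley-style condition number.
condition_number <- function(X) {
  X <- as.matrix(X)
  # Scale every column to unit Euclidean length (assumed convention).
  Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")
  # Eigenvalues of the scaled cross-product matrix X'X.
  ev <- eigen(crossprod(Xs), symmetric = TRUE, only.values = TRUE)$values
  sqrt(max(ev) / min(ev))
}

# Toy design matrices: intercept plus one regressor.
X_low  <- cbind(1, c(99, 100, 101, 100, 100))  # regressor almost constant
X_high <- cbind(1, c(10, 40, 5, 50, 25))       # regressor varies a lot

cn_low  <- condition_number(X_low)
cn_high <- condition_number(X_high)
cn_low > cn_high  # the nearly constant regressor gives the larger value
```

A regressor that barely moves is almost proportional to the intercept column, which is why the first matrix yields a much larger condition number than the second.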

Arguments

X: A numeric design matrix that should contain two independent variables (intercept included).
dummy: A logical value that indicates if there are dummy variables in the design matrix X. By default dummy=FALSE.

Author

R. Salmerón (romansg@ugr.es) and C. García (cbgarcia@ugr.es).

Details

The analysis of the presence of near worrying multicollinearity in the SLM has been systematically ignored by some existing statistical software. However, it is possible to find worrying non-essential multicollinearity in the SLM. In this case, the linear relation arises because the second variable of X has very little variability, so it is close to a multiple of the intercept. For this reason, the coefficient of variation is calculated when the variable is quantitative, and the proportion of ones when the variable is a dummy.
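
The two variability measures mentioned above can be sketched in base R as follows. The helper names coef_variation and prop_ones and the toy vectors are hypothetical, and the sketch uses the sample standard deviation returned by sd(); the package may use a different convention (for example, the population standard deviation), so values need not match SLM exactly.

```r
# Hypothetical helpers, not part of the package.

# Coefficient of variation of a quantitative variable
# (sample standard deviation over the absolute mean; assumed convention).
coef_variation <- function(x) sd(x) / abs(mean(x))

# Proportion of ones in a 0/1 dummy variable.
prop_ones <- function(d) mean(d == 1)

x_low  <- c(99, 100, 101, 100, 100)  # very little variability around 100
x_high <- c(10, 40, 5, 50, 25)       # much more variability
d      <- c(1, 1, 1, 1, 0)           # dummy with mostly ones

cv_low  <- coef_variation(x_low)
cv_high <- coef_variation(x_high)
p_ones  <- prop_ones(d)
```

A coefficient of variation close to zero, or a proportion of ones close to zero or one, signals a second column that is nearly constant and therefore nearly collinear with the intercept.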

References

R. Salmerón, C. B. García and J. García (2018). Variance Inflation Factor and Condition Number in multiple linear regression. Journal of Statistical Computation and Simulation, 88 (12), 2365-2384.

L. R. Klein and A. S. Goldberger (1964). An economic model of the United States, 1929-1952. North Holland Publishing Company, Amsterdam.

H. Theil (1971). Principles of Econometrics. John Wiley & Sons, New York.

Examples

# Modified version of Henri Theil's textile consumption data
data(theil)
head(theil)
cte = array(1,length(theil[,2]))
theil.X = cbind(cte,theil[,-(1:2)])
SLM(theil.X, TRUE)

# Klein and Goldberger data on consumption and wage income
data(KG)
head(KG)
cte = array(1,length(KG[,1]))
KG.X = cbind(cte,KG[,-1])
SLM(KG.X)

# random integer regressor with high variability
x1 = array(1,25)
x2 = sample(1:50,25)
x = cbind(x1,x2)
head(x)
SLM(x)

# random normal regressor with very little variability around its mean (100)
x1 = array(1,25)
x2 = rnorm(25,100,1)
x = cbind(x1,x2)
head(x)
SLM(x)

# random dummy (0/1) regressor
x1 = array(1,25)
x2 = sample(cbind(array(1,25),array(0,25)),25)
x = cbind(x1,x2)
head(x)
SLM(x, TRUE)