Note: a newer version (0.6.0) of this package is available.

olsrr

Overview

The olsrr package provides the following tools for building OLS regression models in R:

  • Comprehensive Regression Output
  • Variable Selection Procedures
  • Heteroskedasticity Tests
  • Collinearity Diagnostics
  • Model Fit Assessment
  • Measures of Influence
  • Residual Diagnostics
  • Variable Contribution Assessment

Installation

# Install release version from CRAN
install.packages("olsrr")

# Install development version from GitHub
# install.packages("devtools")
devtools::install_github("rsquaredacademy/olsrr")

Usage

olsrr uses a consistent prefix, ols_, for easy tab completion.

olsrr is designed to help users who are new to the R language. If you know how to write a formula or build a model using lm, you will find olsrr easy to use. Most of the functions take an object of class lm as input, so you only need to build a model using lm and pass it to the functions in olsrr. Below is a quick demo:

Regression

ols_regress(mpg ~ disp + hp + wt + qsec, data = mtcars)
#>                         Model Summary                          
#> --------------------------------------------------------------
#> R                       0.914       RMSE                2.622 
#> R-Squared               0.835       Coef. Var          13.051 
#> Adj. R-Squared          0.811       MSE                 6.875 
#> Pred R-Squared          0.771       MAE                 1.858 
#> --------------------------------------------------------------
#>  RMSE: Root Mean Square Error 
#>  MSE: Mean Square Error 
#>  MAE: Mean Absolute Error 
#> 
#>                                ANOVA                                 
#> --------------------------------------------------------------------
#>                 Sum of                                              
#>                Squares        DF    Mean Square      F         Sig. 
#> --------------------------------------------------------------------
#> Regression     940.412         4        235.103    34.195    0.0000 
#> Residual       185.635        27          6.875                     
#> Total         1126.047        31                                    
#> --------------------------------------------------------------------
#> 
#>                                   Parameter Estimates                                    
#> ----------------------------------------------------------------------------------------
#>       model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
#> ----------------------------------------------------------------------------------------
#> (Intercept)    27.330         8.639                  3.164    0.004     9.604    45.055 
#>        disp     0.003         0.011        0.055     0.248    0.806    -0.019     0.025 
#>          hp    -0.019         0.016       -0.212    -1.196    0.242    -0.051     0.013 
#>          wt    -4.609         1.266       -0.748    -3.641    0.001    -7.206    -2.012 
#>        qsec     0.544         0.466        0.161     1.166    0.254    -0.413     1.501 
#> ----------------------------------------------------------------------------------------
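
As a cross-check, the headline figures in the Model Summary can be reproduced from a plain lm fit using base R only. This is a minimal sketch, not olsrr's own implementation: the names fit, r2, adj_r2, rmse, mse, mae and coef_var are illustrative, and the definitions used for Coef. Var (100 * RMSE / mean of the response) and MAE (mean absolute residual) are assumptions inferred from the printed table.

```r
# Refit the demo model with base R only
fit <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
s   <- summary(fit)

r2     <- s$r.squared        # R-Squared
adj_r2 <- s$adj.r.squared    # Adj. R-Squared
rmse   <- s$sigma            # residual standard error, reported above as RMSE
mse    <- rmse^2             # residual mean square from the ANOVA table

# Assumed definitions, inferred from the table rather than from olsrr's source
coef_var <- 100 * rmse / mean(mtcars$mpg)
mae      <- mean(abs(residuals(fit)))

round(c(R2 = r2, AdjR2 = adj_r2, RMSE = rmse, MSE = mse,
        CoefVar = coef_var, MAE = mae), 3)
```

Comparing these values against the ols_regress output is a quick way to confirm which quantity each label refers to.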

Stepwise Regression

Build a regression model from a set of candidate predictor variables by entering and removing predictors based on p-values, in a stepwise manner, until no variable is left to enter or remove.

Variable Selection

# stepwise regression
model <- lm(y ~ ., data = surgical)
ols_step_both_p(model)
#> 
#>                                 Stepwise Selection Summary                                 
#> ------------------------------------------------------------------------------------------
#>                         Added/                   Adj.                                         
#> Step     Variable      Removed     R-Square    R-Square     C(p)        AIC         RMSE      
#> ------------------------------------------------------------------------------------------
#>    1    liver_test     addition       0.455       0.444    62.5120    771.8753    296.2992    
#>    2     alc_heavy     addition       0.567       0.550    41.3680    761.4394    266.6484    
#>    3    enzyme_test    addition       0.659       0.639    24.3380    750.5089    238.9145    
#>    4      pindex       addition       0.750       0.730     7.5370    735.7146    206.5835    
#>    5        bcs        addition       0.781       0.758     3.1920    730.6204    195.4544    
#> ------------------------------------------------------------------------------------------
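
To make the entry rule concrete, the forward half of a p-value based search can be sketched with base R's add1(). This is a simplified illustration under stated assumptions, not how ols_step_both_p is implemented: it uses mtcars instead of surgical, it only adds variables (the removal check is omitted), and the threshold p_enter = 0.1 and the names candidates, selected and pvals are chosen here for illustration.

```r
# Forward selection by p-value, base R only (entry step of a stepwise search)
candidates <- c("disp", "hp", "wt", "qsec")
p_enter    <- 0.1                          # assumed entry threshold
scope      <- reformulate(candidates)      # ~ disp + hp + wt + qsec
fit        <- lm(mpg ~ 1, data = mtcars)   # start from the intercept-only model
selected   <- character(0)

repeat {
  # Stop when every candidate is already in the model
  if (length(setdiff(candidates, selected)) == 0) break

  # Partial F test for each candidate not yet in the model
  tab   <- add1(fit, scope = scope, test = "F")
  pvals <- setNames(tab[["Pr(>F)"]], rownames(tab))
  pvals <- pvals[!is.na(pvals)]            # drops the "<none>" row

  # Stop when the best remaining candidate fails the entry threshold
  if (min(pvals) > p_enter) break

  term     <- names(pvals)[which.min(pvals)]
  fit      <- update(fit, as.formula(paste(". ~ . +", term)))
  selected <- c(selected, term)
}

selected       # variables in the order they entered
formula(fit)   # final model formula
```

A full stepwise procedure would also re-test the variables already in the model after each addition and drop any whose p-value rises above a removal threshold.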

Stepwise AIC Backward Regression

Build a regression model from a set of candidate predictor variables by removing predictors based on the Akaike information criterion (AIC), in a stepwise manner, until no variable is left to remove.

Variable Selection

# stepwise aic backward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_aic(model)
k
#> 
#> 
#>                         Backward Elimination Summary                         
#> ---------------------------------------------------------------------------
#> Variable        AIC          RSS          Sum Sq        R-Sq      Adj. R-Sq 
#> ---------------------------------------------------------------------------
#> Full Model    736.390    1825905.713    6543614.824    0.78184      0.74305 
#> alc_mod       734.407    1826477.828    6543042.709    0.78177      0.74856 
#> gender        732.494    1829435.617    6540084.920    0.78142      0.75351 
#> age           730.620    1833716.447    6535804.090    0.78091      0.75808 
#> ---------------------------------------------------------------------------
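
For comparison, the same idea can be tried with base R's step(), which also drops one predictor at a time while doing so lowers the AIC. This is only a sketch under assumptions: it uses mtcars rather than surgical, and step() and olsrr may report AIC on different scales, so the numbers are not expected to match the table above. The names full and reduced are illustrative.

```r
# Backward elimination by AIC using stats::step()
full    <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
reduced <- step(full, direction = "backward", trace = 0)

formula(reduced)   # predictors that survived elimination
reduced$anova      # removal path recorded by step()

# AIC of the starting and final models, on the scale used by stats::AIC()
c(full = AIC(full), reduced = AIC(reduced))
```

Because each removal is accepted only when it lowers the criterion, the final model's AIC is never higher than that of the full model.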

Breusch Pagan Test

The Breusch Pagan test is used to test for heteroskedasticity (non-constant error variance). It tests whether the variance of the errors from a regression depends on the values of the independent variables. It is a \(\chi^{2}\) test.

model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
ols_test_breusch_pagan(model)
#> 
#>  Breusch Pagan Test for Heteroskedasticity
#>  -----------------------------------------
#>  Ho: the variance is constant            
#>  Ha: the variance is not constant        
#> 
#>              Data               
#>  -------------------------------
#>  Response : mpg 
#>  Variables: fitted values of mpg 
#> 
#>        Test Summary         
#>  ---------------------------
#>  DF            =    1 
#>  Chi2          =    1.429672 
#>  Prob > Chi2   =    0.231818
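
The mechanics of the test can be sketched by hand in base R. The version below is the classical (non-studentized) score form with the fitted values as the only auxiliary regressor: scale the squared residuals by the error variance estimate, regress them on the fitted values, and take half of the explained sum of squares as the statistic. Treat this as an assumption about the form of the test rather than a description of olsrr's internals; the names e2, sigma2, g, aux, bp_stat and p_value are illustrative.

```r
model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)

e2     <- residuals(model)^2
sigma2 <- mean(e2)          # variance estimate that divides by n
g      <- e2 / sigma2       # scaled squared residuals, mean 1 by construction

# Auxiliary regression of the scaled squared residuals on the fitted values
aux <- lm(g ~ fitted(model))
ess <- sum((fitted(aux) - mean(g))^2)   # explained sum of squares

bp_stat <- ess / 2          # score statistic
bp_df   <- 1                # one regressor in the auxiliary model
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(Chi2 = bp_stat, DF = bp_df, p = p_value)
```

A large p-value means the data give little evidence against the null hypothesis of constant variance.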

Collinearity Diagnostics

model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_coll_diag(model)
#> Tolerance and Variance Inflation Factor
#> ---------------------------------------
#>   Variables Tolerance      VIF
#> 1      disp 0.1252279 7.985439
#> 2        hp 0.1935450 5.166758
#> 3        wt 0.1445726 6.916942
#> 4      qsec 0.3191708 3.133119
#> 
#> 
#> Eigenvalue and Condition Index
#> ------------------------------
#>    Eigenvalue Condition Index   intercept        disp          hp
#> 1 4.721487187        1.000000 0.000123237 0.001132468 0.001413094
#> 2 0.216562203        4.669260 0.002617424 0.036811051 0.027751289
#> 3 0.050416837        9.677242 0.001656551 0.120881424 0.392366164
#> 4 0.010104757       21.616057 0.025805998 0.777260487 0.059594623
#> 5 0.001429017       57.480524 0.969796790 0.063914571 0.518874831
#>             wt         qsec
#> 1 0.0005253393 0.0001277169
#> 2 0.0002096014 0.0046789491
#> 3 0.0377028008 0.0001952599
#> 4 0.7017528428 0.0024577686
#> 5 0.2598094157 0.9925403056
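
The tolerance and VIF columns follow the standard definitions and can be reproduced with base R: regress each predictor on the remaining predictors, take tolerance as 1 - R-squared of that auxiliary fit, and VIF as its reciprocal. This is a minimal sketch for checking the table, with illustrative names (predictors, tolerance, vif); it does not cover the eigenvalue and condition index part of the output.

```r
predictors <- c("disp", "hp", "wt", "qsec")

# Tolerance of predictor p = 1 - R^2 from regressing p on the other predictors
tolerance <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  aux    <- lm(reformulate(others, response = p), data = mtcars)
  1 - summary(aux)$r.squared
})

vif <- 1 / tolerance

data.frame(Variables = predictors, Tolerance = tolerance, VIF = vif,
           row.names = NULL)
```

A low tolerance (high VIF) indicates that a predictor is largely explained by the others, as is the case for disp and wt here.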

Getting Help

If you encounter a bug, please file a minimal reproducible example using reprex on GitHub. For questions and clarifications, use Stack Overflow.

Monthly Downloads: 15,116

Version: 0.5.3

License: MIT + file LICENSE

Last Published: February 10th, 2020

Functions in olsrr (0.5.3)

  • ols_aic: Akaike information criterion
  • hsb: Test Data Set
  • ols_fpe: Final prediction error
  • ols_coll_diag: Collinearity diagnostics
  • ols_correlations: Part and partial correlations
  • ols_msep: MSEP
  • ols_leverage: Leverage
  • ols_apc: Amemiya's prediction criterion
  • fitness: Test Data Set
  • cement: Test Data Set
  • ols_plot_resid_fit: Residual vs fitted plot
  • ols_plot_reg_line: Simple linear regression line
  • ols_plot_comp_plus_resid: Residual plus component plot
  • ols_plot_resid_box: Residual box plot
  • ols_prep_outlier_obs: Cooks' D outlier observations
  • ols_mallows_cp: Mallow's Cp
  • rivers: Test Data Set
  • ols_plot_cooksd_bar: Cooks' D bar plot
  • ols_plot_cooksd_chart: Cooks' D chart
  • ols_prep_avplot_data: Added variable plot data
  • ols_plot_added_variable: Added variable plots
  • ols_plot_response: Response variable profile
  • ols_hsp: Hocking's Sp
  • ols_plot_resid_stand: Standardized residual chart
  • ols_launch_app: Launch shiny app
  • ols_plot_resid_stud: Studentized residual plot
  • ols_plot_dfbetas: DFBETAs panel
  • ols_plot_obs_fit: Observed vs fitted values plot
  • ols_plot_hadi: Hadi plot
  • ols_plot_dffits: DFFITS plot
  • ols_plot_diagnostics: Diagnostics panel
  • ols_plot_resid_regressor: Residual vs regressor plot
  • ols_plot_resid_hist: Residual histogram
  • ols_test_bartlett: Bartlett test
  • ols_prep_rfsplot_fmdata: Residual fit spread plot data
  • ols_pred_rsq: Predicted rsquare
  • ols_prep_regress_y: Regress y on other predictors
  • surgical: Surgical Unit Data Set
  • ols_plot_resid_fit_spread: Residual fit spread plot
  • ols_plot_resid_lev: Studentized residuals vs leverage plot
  • ols_regress: Ordinary least squares regression
  • ols_step_all_possible_betas: All possible regression variable coefficients
  • ols_prep_srchart_data: Standardized residual chart data
  • ols_prep_cdplot_data: Cooks' D plot data
  • ols_prep_cdplot_outliers: Cooks' d outlier data
  • ols_prep_regress_x: Regress predictor on other predictors
  • ols_plot_resid_qq: Residual QQ plot
  • ols_plot_resid_pot: Potential residual plot
  • ols_sbc: Bayesian information criterion
  • ols_test_breusch_pagan: Breusch pagan test
  • stepdata: Test Data Set
  • ols_step_backward_aic: Stepwise AIC backward regression
  • ols_step_backward_p: Stepwise backward regression
  • ols_step_both_aic: Stepwise AIC regression
  • ols_prep_dfbeta_data: DFBETAs plot data
  • ols_step_forward_aic: Stepwise AIC forward regression
  • ols_prep_rvsrplot_data: Residual vs regressor plot data
  • ols_step_forward_p: Stepwise forward regression
  • ols_plot_resid_stud_fit: Deleted studentized residual vs fitted values plot
  • ols_press: PRESS
  • ols_step_both_p: Stepwise regression
  • rvsr_plot_shiny: Residual vs regressors plot for shiny app
  • ols_prep_rstudlev_data: Studentized residual vs leverage plot data
  • olsrr: olsrr package
  • ols_pure_error_anova: Lack of fit F test
  • ols_sbic: Sawa's bayesian information criterion
  • ols_step_best_subset: Best subsets regression
  • ols_prep_srplot_data: Studentized residual plot data
  • ols_prep_dfbeta_outliers: DFBETAs plot outliers
  • ols_test_score: Score test
  • ols_step_all_possible: All possible regression
  • ols_test_outlier: Bonferroni Outlier Test
  • ols_test_correlation: Correlation test for normality
  • ols_prep_dsrvf_data: Deleted studentized residual plot data
  • ols_test_f: F test
  • ols_test_normality: Test for normality
  • ols_hadi: Hadi's influence measure
  • auto: Test Data Set