MXM (version 0.8.7)

bic.glm.fsreg: Variable selection in generalised linear regression models with forward selection

Description

Variable selection in generalised linear regression models with forward selection

Usage

bic.glm.fsreg( target, dataset, robust = FALSE, tol = 0, ncores = 1 )

Arguments

target
The class variable. It can be a vector with binary data (binomial regression) or counts (Poisson regression). If neither is identified, linear regression is used.
dataset
The dataset; provide either a data frame or a matrix (columns = variables, rows = samples). The variables can be continuous and/or categorical.
robust
A boolean indicating whether (TRUE) or not (FALSE) to use a robust version of the statistical test, if one is available. The robust version takes more time than the non-robust one, but it is suggested when outliers are present. The default value is FALSE and is curren
tol
The threshold on the difference between two successive values of BIC. The Usage signature above gives a default of 0, while the example below uses 2. With tol = 2, for example, if the BIC difference between two successive models is less than 2, the process stops and the last variable, even though significant, does not enter the model.
ncores
How many cores to use. This plays an important role if you have tens of thousands of variables, or really large sample sizes together with tens of thousands of variables and a regression-based test that requires numerical optimisation. In other cases it will not m

Value

The output of the algorithm is an S3 object including:

  • mat: A matrix with the variables and their latest test statistics and p-values.
  • info: A matrix with the selected variables, their p-values and test statistics. Each row corresponds to a model which contains the variables up to that line. The BIC in the last column is the BIC of that model.
  • models: The regression models, one at each step.
  • final: The final regression model.
  • runtime: The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time.
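
The three-element layout described for runtime matches what base R's system.time() reports. The following is a minimal sketch of that layout only; the timed expression is arbitrary and unrelated to MXM, and it is an assumption that the package's runtime vector follows the same ordering.

```r
# Time an arbitrary computation; system.time() returns a "proc_time" object.
rt <- system.time(for (i in 1:1e5) sqrt(i))

# The first three elements are the user, system and elapsed times (in seconds).
rt[1:3]
names(rt)[1:3]
```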

Details

Forward selection via the BIC is implemented. A variable that reduces the BIC is included in the model, and the process continues until no remaining variable can reduce the BIC by at least the user-specified value (tol).
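
The stopping rule described here can be sketched in base R. This is a minimal illustration assuming a continuous target and ordinary linear regression via lm(); forward_bic is a hypothetical helper, not a function from MXM, and it does not claim to reproduce the package's internals. It relies on stats::BIC(), which computes -2 log L + k log n for a model with log-likelihood log L, k estimated parameters and n samples.

```r
# Hypothetical helper (not part of MXM): forward selection by BIC using lm().
forward_bic <- function(target, dataset, tol = 2) {
  dataset <- as.data.frame(dataset)
  selected <- integer(0)
  candidates <- seq_len(ncol(dataset))

  # Start from the intercept-only model.
  current_bic <- BIC(lm(target ~ 1))

  while (length(candidates) > 0) {
    # BIC of the current model extended by each remaining candidate in turn.
    bics <- vapply(candidates, function(j) {
      BIC(lm(target ~ ., data = dataset[, c(selected, j), drop = FALSE]))
    }, numeric(1))

    best <- which.min(bics)

    # Stop when the best achievable BIC reduction is smaller than tol.
    if (current_bic - bics[best] < tol) break

    selected <- c(selected, candidates[best])
    current_bic <- bics[best]
    candidates <- candidates[-best]
  }

  list(selected = selected, bic = current_bic)
}

# Small simulated check: the target depends on columns 2 and 5 only.
set.seed(123)
x <- matrix(rnorm(200 * 10), ncol = 10)
y <- 3 * x[, 2] + 2 * x[, 5] + rnorm(200)
fit <- forward_bic(y, x, tol = 2)
fit$selected
```

A larger tol makes the procedure more conservative, since each new variable must lower the BIC by a wider margin before it is admitted.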

See Also

fs.reg, lm.fsreg, bic.fsreg, CondIndTests, MMPC, SES

Examples

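library(MXM)  # provides bic.glm.fsreg
set.seed(123)
# require(gRbase)  # for faster computations in the internal functions
require(hash)

# simulate a dataset with continuous data: 1000 samples, 50 variables
dataset <- matrix( runif(1000 * 50, 1, 100), ncol = 50 )

# define a simulated continuous target that depends on variables 10, 20 and 30
target <- 3 * dataset[, 10] + 2 * dataset[, 20] + 3 * dataset[, 30] + rnorm(1000, 0, 5)

# forward selection; stop when the BIC reduction between successive models is below 2
a <- bic.glm.fsreg(target, dataset, robust = FALSE, tol = 2, ncores = 1 )

# selected variables with their statistics and BIC, as described under Value
a$info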