FRESA.CAD (version 2.0.2)

updateNeRIModel: Update the NeRI-based model using new data or new threshold values

Description

This function takes the frequency-ranked set of variables and generates a new model whose terms meet the net residual improvement (NeRI) threshold criteria.
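
To make the threshold criterion more concrete, the following base R sketch computes a net residual improvement for a single candidate term in a linear model. It assumes the NeRI is the fraction of observations whose absolute residual shrinks, minus the fraction whose absolute residual grows, when the term is added, and it uses a one-sided sign (binomial) test for the p-value. The simulated data and the helper name `neriSketch` are hypothetical; this is an illustration of the idea, not FRESA.CAD's internal implementation.

```r
# Illustrative sketch only: not the FRESA.CAD implementation.
set.seed(42)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 + 1.5 * x2 + rnorm(n)
simData <- data.frame(y = y, x1 = x1, x2 = x2)

# Hypothetical helper: compare absolute residuals of a base and an extended model
neriSketch <- function(baseModel, extendedModel) {
  absBase <- abs(residuals(baseModel))
  absExt <- abs(residuals(extendedModel))
  improved <- sum(absExt < absBase)
  worsened <- sum(absExt > absBase)
  # Assumed definition: net fraction of observations with smaller residuals
  neri <- (improved - worsened) / length(absBase)
  # One-sided sign test: are improvements more frequent than chance?
  pValue <- binom.test(improved, improved + worsened, p = 0.5,
                       alternative = "greater")$p.value
  list(NeRI = neri, p.value = pValue)
}

baseModel <- lm(y ~ x1, data = simData)
extendedModel <- lm(y ~ x1 + x2, data = simData)
result <- neriSketch(baseModel, extendedModel)
print(result)
```

Under this reading, a term would be kept only if its p-value falls below the pvalue threshold passed to the function.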

Usage

updateNeRIModel(Outcome, 
	                covariates = "1", 
	                pvalue = c(0.05, 0.02),
	                VarFrequencyTable, 
	                variableList, 
	                data, 
	                type = c("LM", "LOGIT", "COX"),
	                testType = c("Binomial", "Wilcox", "tStudent"), 
	                lastTopVariable = 0, 
	                timeOutcome = "Time",
	                interaction = 1,
	                maxTrainModelSize = 0)

Arguments

Outcome
The name of the column in data that stores the variable to be predicted by the model
covariates
A string of the type "1 + var1 + var2" that defines which variables will always be included in the models (as covariates)
pvalue
The maximum p-value, associated with the NeRI, allowed for a term in the model
VarFrequencyTable
An array with the ranked frequencies of the features (e.g. the ranked.var value returned by the NeRIBasedFRESA.Model function)
variableList
A data frame with two columns. The first one must have the names of the candidate variables and the other one the description of such variables
data
A data frame where all variables are stored in different columns
type
Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX")
testType
Type of test to be evaluated by the improvedResiduals function: Binomial test ("Binomial"), Wilcoxon rank-sum test ("Wilcox"), Student's t-test ("tStudent"), or F-test ("Ftest")
lastTopVariable
The maximum number of variables to be tested
timeOutcome
The name of the column in data that stores the time to event (needed only for a Cox proportional hazards regression model fitting)
interaction
Set to 1 for first-order models, or to 2 for second-order models
maxTrainModelSize
Maximum number of terms that can be included in the model
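
As a small illustration of the argument formats described above, this base R sketch builds a two-column variableList data frame and a covariates string. The variable names, descriptions, and the column header `Name` are assumptions chosen for the example; the documentation only states that the first column holds the candidate names and the second their descriptions.

```r
# Hypothetical candidate variables and descriptions (for illustration only)
candidateNames <- c("age", "g2", "grade")
candidateDescriptions <- c("Age in years",
                           "Percent of cells in G2 phase",
                           "Tumor grade")

# First column: variable names; second column: their descriptions
variableList <- data.frame(Name = candidateNames,
                           Description = candidateDescriptions,
                           stringsAsFactors = FALSE)

# Covariates string of the form "1 + var1 + var2"
alwaysIncluded <- c("eet")
covariates <- paste(c("1", alwaysIncluded), collapse = " + ")

print(variableList)
print(covariates)
```

Building the covariates string with `paste` keeps the intercept term "1" first, so an empty set of forced variables still yields the default value "1".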

Value

  • final.model: An object of class lm, glm, or coxph containing the final model
  • var.names: A vector with the names of the features that were included in the final model
  • formula: An object of class formula with the formula used to fit the final model
  • z.NeRI: A vector in which each element represents the z-score of the NeRI, associated with the testType, for each feature found in the final model
  • loops: The number of loops it took for the model to stabilize

See Also

updateModel

Examples

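	# Load FRESA.CAD, which provides the modeling functions and the cancerVarNames data used below
	library(FRESA.CAD)
	# Start the graphics device driver to save all plots in a pdf format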
	pdf(file = "Example.pdf")
	# Get the stage C prostate cancer data from the rpart package
	library(rpart)
	data(stagec)
	# Split the stages into several columns
	dataCancer <- cbind(stagec[,c(1:3,5:6)],
	                    gleason4 = 1*(stagec[,7] == 4),
	                    gleason5 = 1*(stagec[,7] == 5),
	                    gleason6 = 1*(stagec[,7] == 6),
	                    gleason7 = 1*(stagec[,7] == 7),
	                    gleason8 = 1*(stagec[,7] == 8),
	                    gleason910 = 1*(stagec[,7] >= 9),
	                    eet = 1*(stagec[,4] == 2),
	                    diploid = 1*(stagec[,8] == "diploid"),
	                    tetraploid = 1*(stagec[,8] == "tetraploid"),
	                    notAneuploid = 1-1*(stagec[,8] == "aneuploid"))
	# Remove the incomplete cases
	dataCancer <- dataCancer[complete.cases(dataCancer),]
	# Load a pre-established data frame with the names and descriptions of all variables
	
	data(cancerVarNames)
	
	# Rank the variables:
	# - Analyzing the raw data
	# - Using a Cox proportional hazards fitting
	# - According to the NeRI
	rankedDataCancer <- univariateRankVariables(variableList = cancerVarNames,
	                                            formula = "Surv(pgtime, pgstat) ~ 1",
	                                            Outcome = "pgstat",
	                                            data = dataCancer,
	                                            categorizationType = "Raw",
	                                            type = "COX",
	                                            rankingTest = "NeRI",
	                                            description = "Description")
	# Get a Cox proportional hazards model using:
	# - 10 bootstrap loops
	# - The ranked variables
	# - The Wilcoxon rank-sum test as the feature inclusion criterion
	cancerModel <- NeRIBasedFRESA.Model(loops = 10,
	                                    Outcome = "pgstat",
	                                    variableList = rankedDataCancer,
	                                    data = dataCancer,
	                                    type = "COX",
	                                    testType = "Wilcox",
	                                    timeOutcome = "pgtime")
	# Update the model, allowing second order terms (interaction = 2)
	
	uCancerModel <- updateNeRIModel(Outcome = "pgstat",
	        VarFrequencyTable = cancerModel$ranked.var,
	        variableList = cancerVarNames,
	        data = dataCancer,
	        type = "COX",
	        testType = "Wilcox",
	        timeOutcome = "pgtime",
	        interaction = 2)
	# Shut down the graphics device driver
	dev.off()