FRESA.CAD (version 2.0.2)

getVarReclassification: Analysis of the effect of each term of a binary classification model by analyzing its reclassification performance

Description

This function analyzes the effect of each model term by comparing the binary classification performance of the full model with that of the model without the term. The model is fitted on the training data set, and probabilities are then predicted for both the training and test data sets. Reclassification improvement is evaluated using the improveProb function (Hmisc package). Additionally, the integrated discrimination improvement (IDI) and the net reclassification improvement (NRI) of each model term are reported.
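
The full-versus-reduced comparison described above can be sketched for a single term with plain glm and Hmisc::improveProb. This is only an illustration under stated assumptions, not the internal code of getVarReclassification: the simulated data frame toy, its columns x1 and x2, and all object names are hypothetical.

```r
library(Hmisc)  # provides improveProb()

# Simulated data (hypothetical): binary outcome "Class" and two predictors
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$Class <- rbinom(n, 1, plogis(0.8 * toy$x1 + 0.5 * toy$x2))

# Full model, and the same model with the term x2 removed
fullModel <- glm(Class ~ x1 + x2, data = toy, family = binomial)
reducedModel <- update(fullModel, . ~ . - x2)

# Predicted probabilities from both models
pFull <- predict(fullModel, type = "response")
pReduced <- predict(reducedModel, type = "response")

# Reclassification gained by going from the reduced to the full model
rec <- improveProb(x1 = pReduced, x2 = pFull, y = toy$Class)

# IDI, NRI and their z-scores for the term x2
print(c(IDI = rec$idi, z.IDI = rec$z.idi, NRI = rec$nri, z.NRI = rec$z.nri))
```

Repeating this for every term, and again with probabilities predicted on an independent data set, would give one value per term, which is the shape of the vectors the function reports.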

Usage

getVarReclassification(object,
	                       data,
	                       Outcome = "Class", 
	                       type = c("LOGIT", "LM", "COX"),
	                       testData = NULL)

Arguments

object
An object of class lm, glm, or coxph containing the model to be analyzed
data
A data frame where all variables are stored in different columns
Outcome
The name of the column in data that stores the variable to be predicted by the model
type
Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX")
testData
A data frame similar to data, but with a data set to be independently tested. If NULL, data will be used.

Value

  • z.IDIs: A vector in which each element is the z-score of the IDI obtained by comparing the full model with the model without one term
  • z.NRIs: A vector in which each element is the z-score of the NRI obtained by comparing the full model with the model without one term
  • IDIs: A vector in which each element is the IDI obtained by comparing the full model with the model without one term
  • NRIs: A vector in which each element is the NRI obtained by comparing the full model with the model without one term
  • testData.z.IDIs: A vector similar to z.IDIs, where values were estimated in testData
  • testData.z.NRIs: A vector similar to z.NRIs, where values were estimated in testData
  • testData.IDIs: A vector similar to IDIs, where values were estimated in testData
  • testData.NRIs: A vector similar to NRIs, where values were estimated in testData
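
One way to read these vectors side by side is to collect the train and test z-scores into a single table. The helper below, summarizeReclassification, is hypothetical and not part of FRESA.CAD. It assumes the returned list carries the element names listed above and that all vectors have the same length. The placeholder input uses made-up numbers purely to show the shape of the result.

```r
# Hypothetical helper (not part of FRESA.CAD): tabulate train/test z-scores
summarizeReclassification <- function(rec) {
  termNames <- names(rec$z.IDIs)
  # Fall back to generic labels if the vectors are unnamed
  if (is.null(termNames)) {
    termNames <- paste0("term", seq_along(rec$z.IDIs))
  }
  out <- data.frame(term = termNames,
                    train.z.IDI = as.numeric(rec$z.IDIs),
                    test.z.IDI = as.numeric(rec$testData.z.IDIs),
                    train.z.NRI = as.numeric(rec$z.NRIs),
                    test.z.NRI = as.numeric(rec$testData.z.NRIs),
                    stringsAsFactors = FALSE)
  # Order terms by their z-score of the IDI in the test data, largest first
  out[order(-out$test.z.IDI), ]
}

# Placeholder input with made-up numbers, not real results
placeholder <- list(z.IDIs = c(termA = 2.1, termB = 3.4),
                    z.NRIs = c(termA = 1.8, termB = 2.9),
                    testData.z.IDIs = c(termA = 1.2, termB = 2.5),
                    testData.z.NRIs = c(termA = 0.9, termB = 2.2))
recSummary <- summarizeReclassification(placeholder)
print(recSummary)
```

With a real result, the call would be summarizeReclassification(cancerModelRec), assuming the object has the structure described in this section.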

References

Pencina, M. J., D'Agostino, R. B., & Vasan, R. S. (2008). Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Statistics in Medicine, 27(2), 157-172.

See Also

getVarNeRI

Examples

# Start the graphics device driver to save all plots in a pdf format
	pdf(file = "Example.pdf")
	# Get the stage C prostate cancer data from the rpart package
	library(rpart)
	data(stagec)
	# Split the stages into several columns
	dataCancer <- cbind(stagec[,c(1:3,5:6)],
	                    gleason4 = 1*(stagec[,7] == 4),
	                    gleason5 = 1*(stagec[,7] == 5),
	                    gleason6 = 1*(stagec[,7] == 6),
	                    gleason7 = 1*(stagec[,7] == 7),
	                    gleason8 = 1*(stagec[,7] == 8),
	                    gleason910 = 1*(stagec[,7] >= 9),
	                    eet = 1*(stagec[,4] == 2),
	                    diploid = 1*(stagec[,8] == "diploid"),
	                    tetraploid = 1*(stagec[,8] == "tetraploid"),
	                    notAneuploid = 1-1*(stagec[,8] == "aneuploid"))
	# Remove the incomplete cases
	dataCancer <- dataCancer[complete.cases(dataCancer),]
	# Load a pre-established data frame with the names and descriptions of all variables
	data(cancerVarNames)
	# Split the data set into train and test samples
	# (floor() keeps the split point an integer when the number of rows is odd)
	trainDataCancer <- dataCancer[1:floor(nrow(dataCancer)/2),]
	testDataCancer <- dataCancer[(floor(nrow(dataCancer)/2)+1):nrow(dataCancer),]
	# Get a Cox proportional hazards model using:
	# - 10 bootstrap loops
	# - Train data
	# - Age as a covariate
	# - zIDI as the feature inclusion criterion
	cancerModel <- ReclassificationFRESA.Model(loops = 10,
	                                           covariates = "1 + age",
	                                           Outcome = "pgstat",
	                                           variableList = cancerVarNames,
	                                           data = trainDataCancer,
	                                           type = "COX",
	                                           timeOutcome = "pgtime",
	                                           selectionType = "zIDI")
	# Get the IDI and NRI of each model term in the train data 
	# set and in the independent data set
	cancerModelRec <- getVarReclassification(object = cancerModel$final.model,
	                                         data = trainDataCancer,
	                                         Outcome = "pgstat",
	                                         type = "COX",
	                                         testData = testDataCancer)
	# Shut down the graphics device driver
	dev.off()
