Runs the Coarse Approximation Linear Function (CALF) on a random subset of the data provided. When the target is binary, the same proportion is sampled from the cases and from the controls.
calf_subset(
  data,
  nMarkers,
  proportion = 0.8,
  targetVector,
  times = 1,
  optimize = "pval",
  verbose = FALSE
)
data Matrix or data frame. If targetVector = "binary", the first column must contain the dummy-coded case/control variable. If targetVector = "nonbinary", the first column must contain the real-valued selection variable. All other columns contain the relevant markers.
nMarkers Maximum number of markers to include in the creation of the sum.
proportion Numeric. A value between 0 and 1 giving the proportion of cases and controls to use in the analysis (if targetVector = "binary"). If targetVector = "nonbinary", this is the proportion of the full sample. Used to evaluate the robustness of the solution. Defaults to 0.8.
targetVector Indicate "binary" for a target vector with two options (e.g., case/control). Indicate "nonbinary" for a target vector of real numbers.
times Numeric. The number of replications to run with randomization. Defaults to 1.
optimize Criterion to optimize if targetVector = "binary". Indicate "pval" to optimize the p-value of the t-test distinguishing case and control. Indicate "auc" to optimize the AUC. Defaults to "pval".
verbose Logical. Indicate TRUE to print activity at each iteration to the console. Defaults to FALSE.
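To make the proportion argument concrete, here is a minimal base R sketch of drawing a subset that keeps the same proportion of cases and controls. The helper `stratified_subset`, the toy data, the 0/1 coding of the first column, and the use of `round()` for the group sizes are all assumptions made for illustration; this is not the package's own implementation.

```r
# Hypothetical helper, not part of CALF: keep `proportion` of the cases and
# `proportion` of the controls. Assumes `data` is a data frame whose first
# column is coded 1 for case and 0 for control.
stratified_subset <- function(data, proportion = 0.8) {
  target <- data[[1]]
  case_rows <- which(target == 1)
  control_rows <- which(target == 0)
  # Sample positions rather than values so small groups are handled safely.
  keep_case <- case_rows[sample.int(length(case_rows),
                                    round(proportion * length(case_rows)))]
  keep_control <- control_rows[sample.int(length(control_rows),
                                          round(proportion * length(control_rows)))]
  data[sort(c(keep_case, keep_control)), , drop = FALSE]
}

set.seed(1)
toy <- data.frame(
  status = rep(c(1, 0), times = c(20, 30)),  # 20 cases, 30 controls
  marker1 = rnorm(50),
  marker2 = rnorm(50)
)
sub <- stratified_subset(toy, proportion = 0.8)
table(sub$status)  # 16 cases and 24 controls
```

Repeating such a draw several times, as the times argument does, gives a rough sense of how stable a solution is across subsets.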
A data frame containing the chosen markers and their assigned weights (-1 or 1).
The optimal AUC, pval, or correlation for the classification. If multiple replications are requested, a data.frame containing all optimized values across all replications is returned.
aucHist A histogram of the AUCs across replications, if applicable.
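The relationship between the marker weights and the optimized values can be sketched in base R as follows. The column names `Marker` and `Weight`, the simulated markers, and the choice of a Welch t-test and a rank-based AUC estimate are assumptions for illustration; the objects returned by the package may be named and computed differently.

```r
set.seed(42)
n_case <- 15
n_control <- 15
status <- rep(c(1, 0), times = c(n_case, n_control))
markers <- data.frame(
  m1 = rnorm(30, mean = ifelse(status == 1, 1, 0)),
  m2 = rnorm(30, mean = ifelse(status == 1, -1, 0)),
  m3 = rnorm(30)
)

# Hypothetical selection table: chosen markers with weights of -1 or 1.
selection <- data.frame(
  Marker = c("m1", "m2"),
  Weight = c(1, -1),
  stringsAsFactors = FALSE
)

# Weighted sum of the chosen markers for each subject.
score <- as.numeric(as.matrix(markers[, selection$Marker]) %*% selection$Weight)

# "pval": p-value of a t-test comparing the score between cases and controls.
pval <- t.test(score[status == 1], score[status == 0])$p.value

# "auc": rank-based (Mann-Whitney) estimate of the area under the ROC curve.
r <- rank(score)
auc <- (sum(r[status == 1]) - n_case * (n_case + 1) / 2) / (n_case * n_control)
```

A smaller `pval` or a larger `auc` indicates that the weighted sum separates cases from controls more clearly.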
# NOT RUN {
calf_subset(data = CaseControl, nMarkers = 6, targetVector = "binary", times = 5)
# }
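For readers without the CaseControl data at hand, the following sketch builds a small synthetic data frame in the layout the data argument describes and passes it to calf_subset with optimize = "auc". The toy data, the 0/1 coding of the first column, and the marker names are assumptions; the call is guarded so that it only runs when the CALF package is installed.

```r
set.seed(123)
# Assumed layout: first column is the 0/1 case/control code,
# all remaining columns are markers.
toy <- data.frame(
  status = rep(c(1, 0), each = 25),
  markerA = rnorm(50),
  markerB = rnorm(50),
  markerC = rnorm(50)
)
toy$markerA <- toy$markerA + toy$status  # give one marker some signal

if (requireNamespace("CALF", quietly = TRUE)) {
  # Three replications, each on a random 80% of cases and of controls.
  result <- CALF::calf_subset(
    data = toy,
    nMarkers = 2,
    proportion = 0.8,
    targetVector = "binary",
    times = 3,
    optimize = "auc"
  )
}
```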