
CALF (version 1.0.17)

cv.calf: cv.calf

Description

Performs cross-validation by running CALF repeatedly on randomized subsets of the input data.

Usage

cv.calf(
  data,
  limit,
  proportion = 0.8,
  times,
  targetVector,
  optimize = "pval",
  outputPath = NULL
)

Arguments

data

Matrix or data frame. If targetVector = "binary", the first column must contain the dummy-coded case/control variable. If targetVector = "nonbinary", the first column must contain the real-valued selection variable. All other columns contain the relevant markers.

limit

Maximum number of markers to include in creation of sum.

proportion

Numeric. A value between 0 and 1 indicating the proportion of cases and controls to use in analysis (if targetVector = "binary") or proportion of the full sample (if targetVector = "nonbinary"). Defaults to 0.8.
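For the binary case, one plausible reading of "proportion of cases and controls" is a stratified subsample, where the same fraction is drawn from each group. The base R sketch below illustrates that idea on made-up toy data. The helper `stratified_subsample` is hypothetical and is not part of CALF, and the package's internal sampling may differ in detail.

```r
# Toy data (made-up values): first column is the 0/1 case/control indicator.
set.seed(1)
toy <- data.frame(
  ctrl_case = rep(c(0, 1), times = c(10, 20)),
  marker1 = rnorm(30),
  marker2 = rnorm(30)
)

# Hypothetical helper, not a CALF function: draw `proportion` of the rows
# from each level of the first column separately.
stratified_subsample <- function(data, proportion = 0.8) {
  groups <- split(seq_len(nrow(data)), data[[1]])
  keep <- unlist(lapply(groups, function(idx) {
    idx[sample.int(length(idx), size = round(proportion * length(idx)))]
  }), use.names = FALSE)
  data[sort(keep), , drop = FALSE]
}

sub <- stratified_subsample(toy, proportion = 0.8)
table(sub$ctrl_case)
```

With proportion = 0.8, the subsample keeps 8 of the 10 controls and 16 of the 20 cases, so the case/control ratio of the full data is preserved.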

times

Numeric. Indicates the number of replications to run with randomization.

targetVector

Indicate "binary" for target vector with two options (e.g., case/control). Indicate "nonbinary" for target vector with real numbers.

optimize

Criterion to optimize if targetVector = "binary". Indicate "pval" to optimize the p-value of the t-test distinguishing cases from controls. Indicate "auc" to optimize the AUC. Defaults to "pval".
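To make the two criteria concrete, the sketch below computes both for a single made-up score in base R. It is an illustration only: it uses R's default Welch t-test and the rank-based (Mann-Whitney) form of the AUC, which are assumptions and not necessarily the exact computations CALF performs. The variables `group` and `score` are invented for the example.

```r
# Made-up data: 15 controls (0) and 15 cases (1), with cases shifted upward.
set.seed(42)
group <- rep(c(0, 1), each = 15)
score <- rnorm(30, mean = ifelse(group == 1, 1, 0))

# "pval" criterion: p-value of a two-sample t-test (Welch, R's default).
p_value <- t.test(score[group == 1], score[group == 0])$p.value

# "auc" criterion: probability that a random case scores above a random
# control, computed from the ranks of the case scores.
r <- rank(score)
n1 <- sum(group == 1)
n0 <- sum(group == 0)
auc <- (sum(r[group == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

p_value
auc
```

A smaller p-value and a larger AUC both indicate better separation of cases from controls, but they need not rank candidate marker sets identically.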

outputPath

The path where output files are to be written. The default is NULL, meaning no files are written. When targetVector is "binary", the file binary.csv is written to the provided path, showing the results. When targetVector is "nonbinary", the file nonbinary.csv is written instead. The kept and unkept variables from the last iteration are written to the same path, in files prefixed with the targetVector type ("binary" or "nonbinary"), followed by Kept or Unkept, and suffixed with .csv. Two further files, with List in their filenames and a .txt suffix, contain the results from each run.

Value

A data frame containing "times" rows, where each row represents one run of CALF on a randomized "proportion" of "data". Columns start with the number of markers selected for the run, followed by the AUC or pval, and then all markers from "data". An entry in a marker column signifies that the marker was chosen in that run (row) and gives its assigned coarse weight (-1, 0, or 1).
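A result of this shape can be summarized by how often each marker is chosen across runs. The sketch below works on a small mock data frame built by hand to mimic the layout described above. The column names, the values, and the use of NA for unchosen markers are all assumptions made for illustration, not actual CALF output.

```r
# Mock result (made-up): 3 runs, selected count, p-value, then marker columns.
res <- data.frame(
  selected = c(2, 2, 1),
  pval = c(0.01, 0.03, 0.02),
  markerA = c(1, 1, NA),
  markerB = c(-1, NA, NA),
  markerC = c(NA, -1, 1)
)

# Drop the first two summary columns to keep only the marker weights.
markers <- res[, -(1:2), drop = FALSE]

# Treat a marker as chosen when its entry is present and nonzero, which
# covers both NA and 0 as possible "not chosen" encodings.
chosen <- !is.na(markers) & markers != 0

# Fraction of runs in which each marker was chosen.
selection_freq <- colMeans(chosen)
selection_freq
```

Markers that are chosen in most runs, with a consistent sign, are the ones whose selection is least sensitive to the random subsample.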

Examples

# NOT RUN {
cv.calf(data = CaseControl, limit = 5, times = 100, targetVector = 'binary')
# }
