tdmTuneIt: Tuning and unbiased evaluation (single tuning).

Description

For the first configuration name .conf in tdm$runList, call the first tuning algorithm in tdm$tuneMethod (via function tdmDispatchTuner). After tuning, run tdm$unbiasedFunc (usually unbiasedRun) with the best parameters found. This experiment is repeated tdm$nExperim times.
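The tune-then-evaluate pattern described above can be illustrated with a minimal base-R sketch. It does not use TDMR: the toy data generator makeData, the threshold classifier, the candidate grid and the column names of finals are all assumptions made for illustration. Only the overall flow (tune on training-validation data, then evaluate the best setting once on unseen data, repeated nExperim times) mirrors the description.

```r
set.seed(1)
nExperim <- 3                      # plays the role of tdm$nExperim
grid <- seq(0.1, 0.9, by = 0.1)    # candidate values of the single tuned parameter

# Toy data (assumption): label is 1 when x exceeds 0.5, with 10% label noise
makeData <- function(n) {
  x <- runif(n)
  flip <- runif(n) < 0.1
  data.frame(x = x, y = as.integer(xor(x > 0.5, flip)))
}

# Accuracy of the rule "predict 1 if x > thr" on data frame d
accuracy <- function(d, thr) mean(as.integer(d$x > thr) == d$y)

finals <- do.call(rbind, lapply(seq_len(nExperim), function(nExp) {
  trainVal <- makeData(200)        # data seen during tuning
  test     <- makeData(200)        # data reserved for the unbiased evaluation
  tuneAcc  <- sapply(grid, function(thr) accuracy(trainVal, thr))
  best     <- grid[which.max(tuneAcc)]
  data.frame(nExp = nExp, bestThr = best,
             acc.bst = max(tuneAcc),          # best result during tuning
             acc.TST = accuracy(test, best))  # result on unseen test data
}))
print(finals)
```

The value in acc.bst is selected as a maximum over the grid and is therefore optimistically biased, which is why the separate evaluation on test data is needed.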

Usage

tdmTuneIt(envT, dataObj)

Arguments

envT

an environment containing on input at least the element tdm (a list with general settings for TDMR, see tdmDefaultsFill), which has at least the elements

tdm$runList: list of configuration names .conf
tdm$tuneMethod: the tuner

dataObj

object of class TDMdata (constructed here with the help of tdmReadAndSplit).

Value

environment envT, containing the results

res

data frame with results from last tuning (one line for each call of tdmStart*)

bst

data frame with the best-so-far results from last tuning (one line collected after each (SPO) step)

resGrid

list with data frames res from all tuning runs. Use envT$getRes(envT,confFile,nExp,theTuner) to retrieve a specific res.

bstGrid

list with data frames bst from all tuning runs. Use envT$getBst(envT,confFile,nExp,theTuner) to retrieve a specific bst.

theFinals

data frame with one line for each triple (confFile,nExp,tuner). Each line contains summary information about the tuning run in the form: confFile tuner nExp [params] NRUN NEVAL RGain.bst RGain.* sdR.* where [params] is written depending on tdm$withParams. NRUN is the number of unbiased evaluation runs. NEVAL is the number of function evaluations (model builds) during tuning. RGain denotes the relative gain on a certain data set: the actual gain achieved with the model divided by the maximum gain possible for the current cost matrix and the current data set. This holds for classification tasks; in the case of regression, each RGain.* is replaced by RMAE.*, the relative mean absolute error. Each 'sdR.' denotes the standard deviation of the preceding RGain or RMAE. RGain.bst is the best result during tuning obtained on the training-validation data. RGain.avg is the average result during tuning. The following pairs RGain.* sdR.* are the results of one or several unbiased evaluations on the test data, where '*' takes as many values as there are elements in tdm$umode (the possible values are explained in unbiasedRun).

result

object of class TDMclassifier or TDMregressor. This is a list with results from tdm$mainFunc as called in the last unbiased evaluation using the best parameters found during tuning. Use print(envT$result) to get more info on such an object of class TDMclassifier.

tunerVal

an object with the return value from the last tuning process. For every tuner, this is the list spotConfig, containing the SPOT settings plus the TDMR settings in elements opts and tdm. Every tuner extends this list by tunerVal$alg.currentResult and tunerVal$alg.currentBest, see tdmDispatchTuner. In addition, each tuning method might add specific elements to the list, see the description of each tuner.

Environment envT contains further elements, but they are only relevant for the internal operation of tdmBigLoop and its subfunctions.
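To make the RGain definition used in theFinals concrete, the following base-R sketch computes the ratio of achieved gain to maximum possible gain from a confusion matrix and a gain matrix. The helper relGain, the toy labels and the identity gain matrix are illustrative assumptions and not part of TDMR. TDMR's own computation may differ in detail, for example in scaling.

```r
# Hypothetical helper (not a TDMR function): relative gain of a prediction.
# gainMat[i, j] is the gain for predicting class j when the true class is i.
relGain <- function(actual, predicted, gainMat) {
  lev  <- rownames(gainMat)
  conf <- unclass(table(factor(actual, levels = lev),
                        factor(predicted, levels = lev)))  # rows: true class
  gain    <- sum(conf * gainMat)                  # gain actually achieved
  maxGain <- sum(rowSums(conf) * apply(gainMat, 1, max))  # best gain per true class
  gain / maxGain
}

# Toy example (assumption): two classes, gain 1 for each correct prediction
gainMat <- diag(2)
dimnames(gainMat) <- list(c("M", "R"), c("M", "R"))
actual    <- c("M", "M", "R", "R")
predicted <- c("M", "R", "R", "R")
rg <- relGain(actual, predicted, gainMat)
print(rg)   # 0.75: three of four cases gain 1, the maximum is 4
```

With an identity gain matrix the ratio reduces to plain accuracy. A non-trivial gain matrix weights the different kinds of misclassification differently.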

Details

tdmTuneIt differs from tdmBigLoop in that it processes only one configuration .conf and that it has dataObj as a mandatory calling parameter. This simplifies the data flow and is thus less error-prone.

tdm refers to envT$tdm.

See Details in tdmBigLoop for the list of available tuners.

Examples

# NOT RUN {
#*# This demo shows a complete tuned data mining process (level 3 of TDMR) where 
#*# the data mining task is the classification task SONAR (from UCI repository, 
#*# http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar,+Mines+vs.+Rocks%29).
#*# The data mining process is in main_sonar.r, which calls tdmClassifyLoop and tdmClassify
#*# with Random Forest as the prediction model. 
#*# The three parameters to be tuned are CUTOFF1, CLASSWT2 and XPERC, as specified 
#*# in file sonar_04.roi. The tuner used here is LHD.  
#*# Tuning runs are rather short, to make the example run quickly. 
#*# Do not expect good numeric results. 
#*# See demo/demo03sonar_B.r for a somewhat longer tuning run, with two tuners SPOT and LHD.

  ## path is the dir with data and main_*.r file:
  path <- paste(find.package("TDMR"), "demo02sonar",sep="/");
  #path <- paste("../../inst", "demo02sonar",sep="/");

  ## control settings for TDMR
  tdm <- list( mainFunc="main_sonar"
             , umode="CV"              # { "CV" | "RSUB" | "TST" | "SP_T" }
             , tuneMethod = c("lhd")
             , filenameEnvT="exBigLoop.RData"   # file to save environment envT
             , nrun=1, nfold=2         # repeats and CV-folds for the unbiased runs
             , nExperim=1
             , optsVerbosity = 0       # the verbosity for the unbiased runs
             );
  source(paste(path,"main_sonar.r",sep="/"));    # main_sonar, readTrnSonar
# }
# NOT RUN {
  #*# This demo is for example and help (more meaningful, a bit higher budget)
  source(paste(path,"control_sonar.r",sep="/"));       # controlDM, controlSC
# }
# NOT RUN {
  ctrlSC <- controlSC();
  ctrlSC$opts <- controlDM();

  #
  # perform a complete tuning + unbiased eval
  # 
  envT <- tdmEnvTMakeNew(tdm,sCList=list(ctrlSC)); # construct envT from settings in tdm and ctrlSC
  dataObj <- tdmReadTaskData(envT,envT$tdm);
  envT <- tdmTuneIt(envT,dataObj=dataObj);       # start the tuning loop 
# }
