consensusCalculation: Calculation of a (single) consensus with optional data calibration.

Description

This function calculates a single consensus from given individual data, optionally first calibrating the individual data to make them comparable.

Usage

consensusCalculation(
  individualData,
  consensusOptions,
  useBlocks = NULL,
  randomSeed = NULL,
  saveCalibratedIndividualData = FALSE,
  calibratedIndividualDataFilePattern = "calibratedIndividualData-%a-Set%s-Block%b.RData",
  # Return options: the data can be either saved or returned but not both.
  saveConsensusData = NULL,
  consensusDataFileNames = "consensusData-%a-Block%b.RData",
  getCalibrationSamples = FALSE,
  # Internal handling of data
  useDiskCache = NULL, chunkSize = NULL,
  cacheDir = ".",
  cacheBase = ".blockConsModsCache",
  # Behaviour
  collectGarbage = FALSE,
  verbose = 1, indent = 0)

Arguments

individualData

Individual data from which the consensus is to be calculated. It can be either a list or a multiData structure. Each element in individualData can in turn be either a numeric object (vector, matrix or array) or a BlockwiseData structure.

consensusOptions

A list of class ConsensusOptions that contains options for the consensus calculation. A suitable list can be obtained by calling function newConsensusOptions.

useBlocks

When individualData contains BlockwiseData, this argument can be an integer vector with indices of blocks for which the calculation should be performed.

randomSeed

If non-NULL, the function will save the current state of the random generator, set the given seed, and restore the random seed to its original state upon exit. If NULL, the seed is not set nor is it restored on exit.
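
The save, set and restore behaviour described for randomSeed can be sketched in base R as follows. This is only an illustration of the general pattern, not the package's internal code; the helper name withTemporarySeed is hypothetical.

```r
# Hypothetical helper (not part of the package): evaluate `expr` under a
# temporary seed and restore the previous RNG state on exit.
withTemporarySeed <- function(seed, expr) {
  if (!is.null(seed)) {
    # .Random.seed lives in the global environment and may not exist yet.
    hadSeed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
    if (hadSeed) savedSeed <- get(".Random.seed", envir = globalenv())
    on.exit({
      if (hadSeed) {
        assign(".Random.seed", savedSeed, envir = globalenv())
      } else {
        rm(".Random.seed", envir = globalenv())
      }
    }, add = TRUE)
    set.seed(seed)
  }
  # `expr` is a promise, so it is evaluated here, after the seed is set.
  expr
}

set.seed(1)
before <- get(".Random.seed", envir = globalenv())
a <- withTemporarySeed(42, runif(3))
b <- withTemporarySeed(42, runif(3))
after <- get(".Random.seed", envir = globalenv())
```

Here a and b are identical because both are drawn under the same seed, and the outer generator state (before versus after) is unchanged. With seed = NULL the helper neither sets nor restores anything, mirroring the documented behaviour.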

saveCalibratedIndividualData

Logical: should calibrated individual data be saved?

calibratedIndividualDataFilePattern

Pattern from which file names for saving calibrated individual data are determined. The conversions %a, %s and %b will be replaced by analysis name, set number and block number, respectively.
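
As an illustration of how such a pattern expands, the following base R sketch substitutes the three conversions by literal text replacement. The helper expandFilePattern and the analysis name "myAnalysis" are hypothetical; the package may perform the substitution differently.

```r
# Hypothetical helper (not part of the package). fixed = TRUE makes gsub treat
# "%a", "%s" and "%b" as literal strings rather than regular expressions.
expandFilePattern <- function(pattern, analysisName, setNumber, blockNumber) {
  out <- gsub("%a", analysisName, pattern, fixed = TRUE)
  out <- gsub("%s", as.character(setNumber), out, fixed = TRUE)
  gsub("%b", as.character(blockNumber), out, fixed = TRUE)
}

fileName <- expandFilePattern(
  "calibratedIndividualData-%a-Set%s-Block%b.RData",
  analysisName = "myAnalysis", setNumber = 2, blockNumber = 1)
# fileName is "calibratedIndividualData-myAnalysis-Set2-Block1.RData"
```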

saveConsensusData

Logical: should the final consensus be saved (TRUE) or returned in the return value (FALSE)? If NULL, data will be saved only if input data were blockwise data saved on disk rather than held in memory.

consensusDataFileNames

Pattern from which file names for saving the final consensus are determined. The conversions %a and %b will be replaced by analysis name and block number, respectively.

getCalibrationSamples

When the calibration method in consensusOptions is "single quantile", this logical argument determines whether the calibration samples should be returned within the return value.

useDiskCache

Logical: should disk cache be used for consensus calculations? The disk cache can be used to store chunks of calibrated data that are small enough to fit one chunk from each set into memory (blocks may be small enough to fit one block of one set into memory, but not small enough to fit one block from all sets in a consensus calculation into memory at the same time). Using disk cache is slower but lessens the memory footprint of the calculation. As a general guide, if individual data are split into blocks, we recommend setting this argument to TRUE. If this argument is NULL, the function will decide whether to use disk cache based on the number of sets and block sizes.
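
The idea behind chunked disk caching can be sketched in base R as below. This is a simplified illustration under stated assumptions, not the function's implementation: the file naming, the use of saveRDS/readRDS, the tiny chunk size and the element-wise minimum as the consensus are all choices made for this example.

```r
# Two small "sets"; in practice each would be too large to hold alongside the others.
sets <- list(c(0.9, 0.1, 0.5, 0.7, 0.3, 0.8, 0.2, 0.6, 0.4, 1.0),
             c(0.2, 0.4, 0.6, 0.8, 1.0, 0.1, 0.3, 0.5, 0.7, 0.9))
n <- length(sets[[1]])
chunkSize <- 4

# Split element indices into consecutive chunks of at most chunkSize elements.
chunkIndex <- split(seq_len(n), ceiling(seq_len(n) / chunkSize))

# Write every chunk of every set to a temporary cache directory.
cacheDir <- tempfile("consensusCache")
dir.create(cacheDir)
cacheFile <- function(set, chunk) {
  file.path(cacheDir, sprintf("set%d-chunk%d.rds", set, chunk))
}
for (s in seq_along(sets)) {
  for (ch in seq_along(chunkIndex)) {
    saveRDS(sets[[s]][chunkIndex[[ch]]], cacheFile(s, ch))
  }
}

# Load one chunk from each set at a time and compute the consensus for it.
consensus <- numeric(n)
for (ch in seq_along(chunkIndex)) {
  chunkData <- lapply(seq_along(sets), function(s) readRDS(cacheFile(s, ch)))
  consensus[chunkIndex[[ch]]] <- do.call(pmin, chunkData)
}

# Remove the cache files once the calculation has finished.
unlink(cacheDir, recursive = TRUE)
```

Only one chunk per set is held in memory at any time, which is the trade of speed for memory footprint that the argument description refers to.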

chunkSize

Integer giving the chunk size. If left NULL, a suitable size will be chosen automatically.

cacheDir

Directory in which to save cache files. The files are deleted on normal exit but persist if the function terminates abnormally.

cacheBase

Base for the file names of cache files.

collectGarbage

Logical: should garbage collection be forced after each major calculation?

verbose

Integer level of verbosity of diagnostic messages. Zero means silent, higher values make the output progressively more and more verbose.

indent

Indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces.

Value

A list with the following components:

consensusData

A BlockwiseData list containing the consensus.

nSets

Number of input data sets.

saveCalibratedIndividualData

Copy of the input saveCalibratedIndividualData.

calibratedIndividualData

If input saveCalibratedIndividualData is TRUE, a list in which each component is a BlockwiseData structure containing the calibrated individual data for the corresponding input individual data set.

calibrationSamples

If consensusOptions$calibration is "single quantile" and getCalibrationSamples is TRUE, a list in which each component contains the calibration samples for the corresponding input individual data set.

originCount

A vector of length nSets that contains, for each set, the number of (calibrated) elements that were less than or equal to the consensus for that element.
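
To make the meaning of originCount concrete, here is a small base R sketch. It assumes the consensus is the element-wise minimum (a consensus quantile of 0) and uses made-up toy data; it illustrates the definition above and is not the package's code.

```r
# Two toy data sets with three elements each.
sets <- list(c(0.2, 0.8, 0.5),
             c(0.4, 0.6, 0.5))

# Element-wise minimum as the consensus (assumption: consensus quantile of 0).
consensus <- do.call(pmin, sets)

# For each set, count the elements that are less than or equal to the consensus.
originCount <- sapply(sets, function(x) sum(x <= consensus))
```

Note that the third element is tied across the two sets, so under this definition it is counted for both of them.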

Details

The consensus is defined as the element-wise (also known as "parallel") quantile of the individual data at the probability given by the consensusQuantile element of consensusOptions. Depending on the value of the component calibration of consensusOptions, the individual data are first calibrated. For consensusOptions$calibration="full quantile", the individual data are quantile normalized using normalize.quantiles (from package preprocessCore). For consensusOptions$calibration="single quantile", the individual data are raised to a power such that the quantiles at probability consensusOptions$calibrationQuantile are the same. For consensusOptions$calibration="none", the individual data are not calibrated.
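
The two steps described above, single quantile calibration followed by an element-wise quantile, can be sketched in base R as follows. This is a minimal illustration and not the package's implementation. The helper names are hypothetical, the data are assumed to lie strictly between 0 and 1, and the first set is assumed to serve as the calibration reference.

```r
# Hypothetical helper: raise each set to a power so that its quantile at
# `calibrationQuantile` matches that of the first set (assumed reference).
# Solving q_i^p = q_1 for p gives p = log(q_1) / log(q_i).
calibrateSingleQuantile <- function(dataList, calibrationQuantile = 0.95) {
  q <- sapply(dataList, quantile, probs = calibrationQuantile, names = FALSE)
  powers <- log(q[1]) / log(q)
  Map(function(x, p) x^p, dataList, powers)
}

# Hypothetical helper: element-wise ("parallel") quantile across a list of
# equally sized numeric objects.
parallelQuantile <- function(dataList, prob) {
  mat <- do.call(cbind, lapply(dataList, as.vector))
  out <- apply(mat, 1, quantile, probs = prob, names = FALSE)
  dim(out) <- dim(dataList[[1]])  # restore matrix/array shape, if any
  out
}

# Toy data: the second set is systematically lower than the first.
a <- seq(0.01, 0.99, length.out = 99)
b <- a^2
calibrated <- calibrateSingleQuantile(list(a, b), calibrationQuantile = 0.95)

# After calibration the 0.95 quantiles agree approximately (quantile
# interpolation prevents exact equality in general).
qAfter <- sapply(calibrated, quantile, probs = 0.95, names = FALSE)

# Consensus at probability 0 is the element-wise minimum.
consensus <- parallelQuantile(calibrated, prob = 0)
```

Setting prob to 0 gives the element-wise minimum and prob to 0.5 the element-wise median. Full quantile calibration would replace calibrateSingleQuantile with a quantile normalization step and is not shown here.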

References