Takes the dataset and metafile output of nhppSimConstWindowGen and of SegSeq, then evaluates change-point detection performance in terms of precision and recall. The dataset must be in the format produced by nhppSimConstWindowGen for this function to work.
nhppSimConstWindowAnalysis(filePrefix, chromosomeN,
distMetric=c(20,50,100,150,200,300,500,1000),
cptLen=c(3,5,8,12,15,20,30,50,100),
nPair=2, nRepeat=10, statistic="normal", grid.size="auto", takeN=5,
maxNCut=60, minStat=5, verbose=FALSE, timing=TRUE, hasRun=FALSE,
width=12, height=6)
The first part of the filename for data and metafile generated by nhppSimConstWindowGen
The chromosome number that the dataset emulates
A set of criteria for deciding whether a called change point is true. A call is deemed true if, after a minimum-cost bipartite matching, the actual signal change point matched to it lies within x reads. A larger value is a looser criterion.
The second part of the filename for data and metafile generated by nhppSimConstWindowGen, indicating the length of the true signal. Constant width of the signal (CN gain or loss) region to simulate; can be a vector of different values to test
A part of the filename for data and metafile generated by nhppSimConstWindowGen, indicating the number of normal/tumor pairs. Number of tumor samples to generate for each choice of signal width; also the number of normal samples to generate
A part of the filename for data and metafile generated by nhppSimConstWindowGen. Number of times to repeat the simulation data generation
The type of statistic to use for the analysis
Argument to ScanCBS
Argument to ScanCBS
Argument to ScanCBS
Argument to ScanCBS
If TRUE, prints run information as the algorithm proceeds
If TRUE, times the ScanCBS algorithm
Width of the graph output file
Height of the graph output file
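The matching criterion described for the distance thresholds can be illustrated with a minimal base-R sketch. It is not the package's implementation: matchChangePoints is a hypothetical helper, and the positions are made-up read indices. It relies on the fact that, for points on a line with cost |a - b|, an order-preserving matching is optimal, so a small dynamic program gives the exact minimum-cost matching.

```r
# Hypothetical helper (not part of the package): exact minimum-cost matching
# between true and called change points given as 1-D read indices.
matchChangePoints <- function(truth, called) {
  truth <- sort(truth)
  called <- sort(called)
  swap <- length(truth) > length(called)
  a <- if (swap) called else truth   # shorter set: every element gets matched
  b <- if (swap) truth else called   # longer set: some elements may be skipped
  n <- length(a)
  m <- length(b)
  if (n == 0) {
    return(data.frame(truth = numeric(0), called = numeric(0), dist = numeric(0)))
  }
  # D[i + 1, j + 1]: minimum cost of matching a[1..i] into b[1..j]
  D <- matrix(Inf, n + 1, m + 1)
  D[1, ] <- 0
  for (i in seq_len(n)) {
    for (j in i:m) {
      D[i + 1, j + 1] <- min(D[i + 1, j], D[i, j] + abs(a[i] - b[j]))
    }
  }
  # Backtrack to recover the matched pairs
  pairs <- matrix(NA_real_, n, 2)
  i <- n
  j <- m
  while (i > 0) {
    if (D[i + 1, j + 1] == D[i + 1, j]) {
      j <- j - 1                     # b[j] is left unmatched
    } else {
      pairs[i, ] <- c(a[i], b[j])    # a[i] is matched to b[j]
      i <- i - 1
      j <- j - 1
    }
  }
  out <- if (swap) {
    data.frame(truth = pairs[, 2], called = pairs[, 1])
  } else {
    data.frame(truth = pairs[, 1], called = pairs[, 2])
  }
  out$dist <- abs(out$truth - out$called)
  out
}

# Made-up example: four true change points, five calls
matched <- matchChangePoints(truth = c(100, 200, 500, 600),
                             called = c(105, 230, 480, 900, 610))
# Number of calls deemed true under two distance thresholds
nTrueCalls <- sapply(c(20, 50), function(x) sum(matched$dist <= x))
```

Under the looser threshold more matched calls count as true, which is why a larger threshold is the more lenient criterion.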
Output structure returned by ScanCBS
The distance among reads after minimum-cost bipartite graph matching for our algorithm
The distance among reads after minimum-cost bipartite graph matching for SegSeq
The recall rates of the two algorithms
The precision rates of the two algorithms
The F-measures of the two algorithms
The mean distance between true signal boundaries
The number of true change points
Number of change points called by the two algorithms
Mean computational time of ScanCBS for each signal length
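As a rough illustration of how the recall, precision and F-measure values relate to the counts of true and called change points, the sketch below uses the standard definitions. The helper prfMeasure and the example numbers are hypothetical, and it is an assumption that the F-measure is the harmonic mean of precision and recall; the package may compute it differently.

```r
# Hypothetical helper (not part of the package): precision, recall and
# F-measure from the number of calls deemed true, the number of calls made,
# and the number of true change points.
prfMeasure <- function(nTruePositive, nCalled, nTrue) {
  precision <- if (nCalled > 0) nTruePositive / nCalled else NA_real_
  recall <- if (nTrue > 0) nTruePositive / nTrue else NA_real_
  # Assumed definition: harmonic mean of precision and recall
  fMeasure <- if (isTRUE(precision + recall > 0)) {
    2 * precision * recall / (precision + recall)
  } else {
    NA_real_
  }
  c(precision = precision, recall = recall, fMeasure = fMeasure)
}

# Made-up example: 3 of 5 calls fall within the distance threshold of a
# matched true change point, out of 4 true change points
res <- prfMeasure(nTruePositive = 3, nCalled = 5, nTrue = 4)
```

Evaluating this once per distance threshold and per signal length gives one precision, recall and F-measure value for each combination.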
This function is used in conjunction with nhppSimConstWindowGen. It reads in the data and metafile output of that function and compares the performance of our algorithm with that of SegSeq. SegSeq must already have been run on the generated simulation datasets before this function is called.