repRankAggreg
repeats rank aggregation of ordered validation measure lists
obtained from an object of class "optCluster". The
function returns an object of class "optCluster".
repRankAggreg(optObj, rankMethod = "same", distance = "same",
importance = NULL, rankVerbose = FALSE, ... )
distance
- A character string providing the type of distance to be used for measuring the similarity between ordered lists
in rank aggregation. By default, the "same" distance as the input "optCluster" object is used.
The weighted Spearman footrule distance ("Spearman") or the weighted Kendall's tau distance ("Kendall")
can also be directly specified. Only one distance may be selected.
importance
- Vector of weights indicating the importance of each validation measure list. The default of NULL gives equal weight to each validation measure. See Weighted Rank Aggregation in the 'Details' section for more information.
rankVerbose
- If TRUE, current rank aggregation results are displayed at each iteration.
...
- Additional arguments that can be passed to the internal function RankAggreg:
maxIter
- The maximum number of iterations allowed. Default = 1000
k
- Size of top-k list in aggregation.
convIN
- Stopping criteria for CE and GA algorithms. The algorithm converges once the "best" solution does not
change after convIN iterations. Default: 7 for CE and 30 for GA.
N
- Number of samples generated by MCMC in the CE algorithm. Default = 10*k^2
rho
- For CE algorithm, (rho*N) is the quantile of candidate list sorted by function values.
weight
- For CE algorithm, the learning factor used in the probability update feature. Default = 0.25
popSize
- For GA algorithm, the population size in each generation. Default = 100
CP
- For GA algorithm, the crossover probability. Default = 0.4
MP
- For GA algorithm, the mutation probability. Default = 0.01
repRankAggreg
returns an object of class "optCluster". The class description
is provided in the help file.
This function tests the consistency of the rank aggregation results by repeating rank aggregation with the same
rank aggregation method, distance measure, clustering algorithm lists, and validation score lists used to create
the input object of class "optCluster". A different rank aggregation algorithm or
type of distance measure can also be evaluated using this function, but doing so may affect the final results.
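To give a feel for how the two distance types compare ordered lists, the sketch below computes the plain, unweighted Spearman footrule and Kendall's tau distances in base R. This is a simplification for illustration only: the distances used in rank aggregation are weighted by the validation scores, and the helper names and example lists here are hypothetical rather than part of optCluster or RankAggreg.

```r
# Unweighted Spearman footrule: sum of absolute differences in item positions.
spearman_footrule <- function(a, b) {
  stopifnot(setequal(a, b), !anyDuplicated(a), !anyDuplicated(b))
  pos_b <- match(a, b)  # position in b of each item of a
  sum(abs(seq_along(a) - pos_b))
}

# Unweighted Kendall's tau distance: number of item pairs ordered differently.
kendall_tau <- function(a, b) {
  stopifnot(setequal(a, b), !anyDuplicated(a), !anyDuplicated(b))
  pos_b <- match(a, b)
  n <- length(pos_b)
  if (n < 2) return(0)
  pairs <- utils::combn(n, 2)  # all index pairs (i, j) with i < j
  sum(pos_b[pairs[1, ]] > pos_b[pairs[2, ]])
}

# Hypothetical ordered lists of clustering algorithms (best first)
list1 <- c("hierarchical", "kmeans", "pam", "diana")
list2 <- c("kmeans", "hierarchical", "pam", "diana")
list3 <- c("diana", "pam", "kmeans", "hierarchical")

# One adjacent swap gives small distances; a full reversal gives the largest.
print(c(footrule = spearman_footrule(list1, list2),
        kendall  = kendall_tau(list1, list2)))
print(c(footrule = spearman_footrule(list1, list3),
        kendall  = kendall_tau(list1, list3)))
```

Both distances agree on which pair of lists is closer, but they scale differently, which is one reason switching the distance can change the aggregated ranking.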
Weighted Rank Aggregation: A vector of weights, one for each validation measure list,
can be supplied using the importance argument. The default value of equal weights (NULL) is
represented by rep(1, length(x)), where x is the character vector of validation measure names. This
means each validation measure list has a weight of 1/length(x).
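The equal-weights statement can be checked with a few lines of base R. The sketch assumes that weights are rescaled to sum to one, which is what the 1/length(x) figure above implies; the measure names in x are used only for illustration.

```r
# Validation measure names, used here only for illustration
x <- c("APN", "AD", "ADM", "FOM", "Connectivity", "Dunn", "Silhouette")

# Default (importance = NULL): one unit of weight per validation measure list
equal_raw <- rep(1, length(x))

# Assumed rescaling so the weights sum to one
equal_w <- equal_raw / sum(equal_raw)
names(equal_w) <- x

print(equal_w)                          # every entry equals 1/length(x)
print(all.equal(unname(equal_w), rep(1 / length(x), length(x))))
```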
To manually change the weights, the order of the validation measures selected needs to be known.
The order of validation measures used in optCluster is provided below:
When selected, stability measures will ALWAYS be listed first and in the following order: "APN", "AD", "ADM", "FOM".
When selected, internal measures will only precede biological measures. The order of these measures is: "Connectivity", "Dunn", "Silhouette".
When selected, biological measures will always be listed last and in the following order: "BHI", "BSI".
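Because the importance vector is positional, it helps to build it by name and let the ordering rules above decide the positions. The following base R sketch does that. The helpers measure_order and make_importance are hypothetical, not part of optCluster, and the sketch assumes every measure of a selected validation type is used.

```r
# Order described above: stability first, then internal, then biological
measure_groups <- list(
  stability  = c("APN", "AD", "ADM", "FOM"),
  internal   = c("Connectivity", "Dunn", "Silhouette"),
  biological = c("BHI", "BSI")
)

# Hypothetical helper: ordered measure names for the selected validation types
measure_order <- function(validation) {
  stopifnot(all(validation %in% names(measure_groups)))
  selected <- names(measure_groups)[names(measure_groups) %in% validation]
  unlist(measure_groups[selected], use.names = FALSE)
}

# Hypothetical helper: importance vector aligned with that order;
# measures not named in `weights` keep the default weight
make_importance <- function(validation, weights, default = 1) {
  measures <- measure_order(validation)
  stopifnot(all(names(weights) %in% measures))
  w <- setNames(rep(default, length(measures)), measures)
  w[names(weights)] <- weights
  w
}

# Give Dunn and Silhouette twice the weight of the other measures
w <- make_importance(c("internal", "stability"), c(Dunn = 2, Silhouette = 2))
print(w)

# With an existing "optCluster" object norm1, the weights could be passed as:
# repW <- repRankAggreg(norm1, importance = unname(w))
```

Building the vector by name avoids silently assigning a weight to the wrong measure when the set of validation types changes.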
Sekula, M., Datta, S., and Datta, S. (2017). optCluster: An R package for determining the optimal clustering algorithm. Bioinformation, 13(3), 101. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5450252
Pihur, V., Datta, S. and Datta, S. (2007). Weighted rank aggregation of cluster validation measures: A Monte Carlo cross-entropy approach. Bioinformatics 23(13): 1607-1615.
Pihur, V., Datta, S. and Datta, S. (2009). RankAggreg, an R package for weighted rank aggregation. BMC Bioinformatics, 10:62, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-62.
For a description of the RankAggreg function, including all available arguments that can be
passed to it, see RankAggreg in the RankAggreg package.
# NOT RUN {
## These examples may take a few minutes to compute
# }
# NOT RUN {
## Load the Package and Obtain Dataset
library(optCluster)
data(arabid)
## Normalize Data with Respect to Library Size
obj <- t(t(arabid)/colSums(arabid))
## Analysis of Normalized Data using Internal and Stability Validation Measures
norm1 <- optCluster(obj, 2:4, clMethods = "all")
print(norm1)
repCE <- repRankAggreg(norm1)
print(repCE)
repGA <- repRankAggreg(norm1, rankMethod = "GA")
print(repGA)
# }