seriation (version 1.5.6)

seriate_best: Best Seriation

Description

Often the best seriation method for a particular dataset is not known, and heuristics may produce unstable results. seriate_best() and seriate_rep() automatically try different seriation methods or rerun randomized methods several times to find the best order given a criterion measure. seriate_improve() uses a local improvement strategy to improve an existing solution.

Usage

seriate_best(
  x,
  methods = NULL,
  control = NULL,
  criterion = NULL,
  rep = 10L,
  parallel = TRUE,
  verbose = TRUE,
  ...
)

seriate_rep(
  x,
  method = NULL,
  control = NULL,
  criterion = NULL,
  rep = 10L,
  parallel = TRUE,
  verbose = TRUE,
  ...
)

seriate_improve(
  x,
  order,
  criterion = NULL,
  control = NULL,
  verbose = TRUE,
  ...
)

Value

Returns an object of class ser_permutation.

Arguments

x

the data.

methods

a character vector with the names of the seriation methods to try.

control

a list of control options passed on to seriate(). For seriate_best() control needs to be a named list of control lists with the names matching the seriation methods.

criterion

a character string with the criterion to optimize. If no criterion is given, seriate_rep() uses the criterion specified for the method in the registry.

rep

number of times to repeat the randomized seriation algorithm.

parallel

logical; perform replications in parallel. Uses foreach::foreach() if a %dopar% backend (e.g., from the doParallel package) is registered.

verbose

logical; show progress and results for different methods.

...

further arguments are passed on to seriate().

method

a character string with the name of the seriation method (default: varies by data type).

order

a ser_permutation object for x or the name of a seriation method to start with.

Author

Michael Hahsler

Details

seriate_rep() reruns a randomized seriation method several times to find the best solution given the criterion specified for the method in the registry. A specific criterion can also be specified. Non-stochastic methods are automatically run only once.
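
As a rough illustration of the idea behind seriate_rep(), the sketch below reruns a randomized method by hand and keeps the order with the lowest criterion value. It is a simplified stand-in, not the package's implementation; the variable names (runs, scores, best), the seed, and the choice of five repetitions are assumptions made for this example.

```r
library(seriation)

data(SupremeCourt)
d <- as.dist(SupremeCourt)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# rerun the randomized method "QAP_LS" five times (hypothetical count)
runs <- replicate(5, seriate(d, method = "QAP_LS"), simplify = FALSE)

# score every run with the LS criterion (a loss, so lower is better)
scores <- vapply(
  runs,
  function(o) unname(criterion(d, o, method = "LS")),
  numeric(1)
)

# keep the run with the smallest criterion value
best <- runs[[which.min(scores)]]
best
```

Calling seriate_rep(d, "QAP_LS", rep = 5) handles this bookkeeping itself and returns the repetition information as attributes of the result.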

seriate_best() runs a set of methods and returns the best result given a criterion. Stochastic methods are automatically randomly restarted several times.

seriate_improve() improves a seriation order by simulated annealing with a specified criterion measure. It uses seriate() with method "GSA", a reduced probability to accept bad moves, and a lower minimum temperature. Control parameters for this method are accepted.

Criterion

If no criterion is specified, then the criterion specified for the method in the registry (see get_seriation_method()) is used. For methods with no criterion in the registry (marked as "other"), a default criterion is used. The defaults are:

  • dist: "AR_deviations" - the study in Hahsler (2017) has shown that this criterion has high similarity with most other criteria.

  • matrix: "Moore_stress"
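
To see how an order scores under these criteria, criterion() can be evaluated directly on a result. The following is a minimal sketch, assuming the SupremeCourt data from the Examples; the "OLO" method is an arbitrary choice and the variable names are illustrative only.

```r
library(seriation)

data(SupremeCourt)
d <- as.dist(SupremeCourt)

# any seriation method works here; "OLO" is just one deterministic choice
o <- seriate(d, method = "OLO")

# evaluate the order under the default dist criterion and two others
vals <- criterion(d, o, method = c("AR_deviations", "AR_events", "LS"))
vals

# the same criteria for the original (unseriated) order, for comparison
criterion(d, method = c("AR_deviations", "AR_events", "LS"))
```

Comparing the two outputs shows which measures the seriation improved; all three criteria used here are loss measures, so smaller values are better.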

Parallel Execution

Support for parallel execution is provided for some methods using the foreach package. To use parallel execution, a suitable backend needs to be registered (see the Examples section for using the doParallel backend).

References

Hahsler, M. (2017): An experimental comparison of seriation methods for one-mode two-way data. European Journal of Operational Research, 257, 133-143. doi:10.1016/j.ejor.2016.08.066

See Also

Other seriation: register_DendSer(), register_GA(), register_optics(), register_smacof(), register_tsne(), register_umap(), registry_for_seriaiton_methods, seriate()

Examples

data(SupremeCourt)
d_supreme <- as.dist(SupremeCourt)

# find the best seriation order (tries several fast methods by default)
o <- seriate_best(d_supreme, criterion = "AR_events")
o
pimage(d_supreme, o)

# run a randomized algorithm several times. It automatically chooses the
# LS criterion. Repetition information is returned as attributes.
o <- seriate_rep(d_supreme, "QAP_LS", rep = 5)

attr(o, "criterion")
hist(attr(o, "criterion_distribution"))
pimage(d_supreme, o)

if (FALSE) {
# Using parallel execution on a larger dataset
data(iris)
m_iris <- as.matrix(iris[sample(seq(nrow(iris))),-5])
d_iris <- dist(m_iris)

library(doParallel)
registerDoParallel(cores = detectCores() - 1L)

# seriate rows of the iris data set
o <- seriate_best(d_iris, criterion = "LS")
o

pimage(d_iris, o)

# improve the order to minimize RGAR instead of LS
o_improved <- seriate_improve(d_iris, o, criterion = "RGAR")
pimage(d_iris, o_improved)

# available control parameters for seriate_improve()
get_seriation_method(name = "GSA")
}