This function compares two models based on the subset of forecasts for which
both models have made a prediction. It gets called from
pairwise_comparison_one_group(), which handles the comparison of multiple
models on a single set of forecasts (there are no subsets of forecasts to be
distinguished). pairwise_comparison_one_group() in turn gets called from
pairwise_comparison(), which can handle pairwise comparisons for a set of
forecasts with multiple subsets, e.g. pairwise comparisons for one set of
forecasts, but done separately for two different forecast targets.
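As a rough sketch of how these functions fit together, the call below scores
the example_quantile data shipped with scoringutils and then runs
pairwise_comparison() separately per forecast target. The grouping column
target_type and the by/metric arguments are taken from the package's example
data and documented interface; adjust them to the grouping columns present in
your own scores.

library(scoringutils)

# Score the quantile example forecasts that ship with the package
scores <- score(example_quantile)

# Pairwise comparisons per model, run separately for each forecast target;
# internally this calls pairwise_comparison_one_group() and compare_two_models()
pairwise <- pairwise_comparison(
  scores,
  by = c("model", "target_type"),
  metric = "interval_score"
)
head(pairwise)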
compare_two_models(
  scores,
  name_model1,
  name_model2,
  metric,
  one_sided = FALSE,
  test_type = c("non_parametric", "permutation"),
  n_permutations = 999
)
scores: A data.table of scores as produced by score().

name_model1: character, name of the first model.

name_model2: character, name of the model to compare against.

metric: A character vector of length one with the metric to do the
comparison on. The default is "auto", meaning that either "interval_score",
"crps", or "brier_score" will be selected where available.
See available_metrics() for available metrics.

one_sided: Boolean, default is FALSE, whether to conduct a one-sided
instead of a two-sided test to determine significance in a pairwise
comparison.

test_type: character, either "non_parametric" (the default) or "permutation".
This determines which kind of test shall be conducted to determine p-values.

n_permutations: numeric, the number of permutations for a permutation test.
Default is 999.
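A minimal sketch of a direct call, assuming scores produced by
score(example_quantile); the two model names below are taken from the
package's example data and stand in for models present in your own scores.
In normal use the function is called indirectly via pairwise_comparison().

library(scoringutils)

scores <- score(example_quantile)

# Compare two models on the forecasts for which both have made a prediction;
# the model names are placeholders from the example data
compare_two_models(
  scores = scores,
  name_model1 = "EuroCOVIDhub-ensemble",
  name_model2 = "EuroCOVIDhub-baseline",
  metric = "interval_score",
  one_sided = FALSE,
  test_type = "non_parametric",
  n_permutations = 999
)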
Johannes Bracher, johannes.bracher@kit.edu
Nikos Bosse, nikosbosse@gmail.com