
scoringutils (version 1.2.2)

summarise_scores: Summarise scores as produced by score()


Summarise scores as produced by score()


summarise_scores(
  scores,
  by = NULL,
  across = NULL,
  fun = mean,
  relative_skill = FALSE,
  relative_skill_metric = "auto",
  metric = deprecated(),
  baseline = NULL,
  ...
)

summarize_scores(
  scores,
  by = NULL,
  across = NULL,
  fun = mean,
  relative_skill = FALSE,
  relative_skill_metric = "auto",
  metric = deprecated(),
  baseline = NULL,
  ...
)



scores: A data.table of scores as produced by score().


by: character vector with column names to summarise scores by. Default is NULL, meaning that the only summary that takes place is summarising over samples or quantiles (in case of quantile-based forecasts), such that there is one score per forecast as defined by the unit of a single forecast (rather than one score for every sample or quantile). The unit of a single forecast is determined by the columns present in the input data that do not correspond to a metric produced by score(). These columns indicate a grouping of forecasts (for example, there may be one forecast per day, location and model). Adding additional, unrelated columns may alter results in an unpredictable way.


across: character vector with column names from the vector of variables that define the unit of a single forecast (see above) to summarise scores across (meaning that the specified columns will be dropped). This is an alternative to specifying by directly. If NULL (default), by is used, or inferred internally if by is also not specified. Only one of across and by may be used at a time.


fun: a function used for summarising scores. Default is mean.


relative_skill: logical, whether or not to compute relative performance between models based on pairwise comparisons. If TRUE (default is FALSE), then a column called 'model' must be present in the input data. For more information on the computation of relative skill, see pairwise_comparison(). Relative skill will be calculated for the aggregation level specified in by.


relative_skill_metric: character with the name of the metric for which a relative skill shall be computed. If equal to 'auto' (the default), then this will be either interval score, CRPS or Brier score (depending on which of these is available in the input data).


metric: Deprecated in 1.1.0. Use relative_skill_metric instead.


baseline: character string with the name of a model. If a baseline is given, then a scaled relative skill with respect to the baseline will be returned. By default (NULL), relative skill will not be scaled with respect to a baseline model.


...: additional parameters that can be passed to the summary function provided to fun. For more information see the documentation of the respective function.
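As a rough illustration of how by, across, fun and ... relate to each other, the following base R sketch works on a made-up scores table. It is not the scoringutils implementation: the data, the column names and the helper summarise_toy() are hypothetical and only meant to show the idea that summarising across some columns amounts to grouping by the remaining ones.

```r
# Toy scores table; column names and values are made up for illustration.
toy_scores <- data.frame(
  model    = c("A", "A", "A", "A", "B", "B", "B", "B"),
  location = c("DE", "DE", "GB", "GB", "DE", "DE", "GB", "GB"),
  horizon  = c(1, 2, 1, 2, 1, 2, 1, 2),
  score    = c(1, 3, 2, 4, 2, 2, 6, 8)
)

# Columns that identify a single forecast (everything that is not a metric).
forecast_unit <- c("model", "location", "horizon")

# Summarising "across" some columns is the same as grouping "by" the rest.
across <- c("location", "horizon")
by <- setdiff(forecast_unit, across)

# Hypothetical helper: apply `fun` to the metric within each group and
# forward any extra arguments to `fun` via `...`.
summarise_toy <- function(data, by, fun = mean, ...) {
  aggregate(data["score"], by = data[by], FUN = fun, ...)
}

# Mean score per model.
by_model <- summarise_toy(toy_scores, by = by)

# Extra arguments (here `probs`) are passed on to the summary function.
upper_quartile <- summarise_toy(toy_scores, by = by, fun = quantile, probs = 0.75)
```

In this sketch, across = c("location", "horizon") and by = "model" describe the same summary, which is why only one of the two needs to be given.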


Run this code
data.table::setDTthreads(2) # restricts number of cores used on CRAN
library(scoringutils)
library(magrittr) # pipe operator

scores <- score(example_continuous)
summarise_scores(scores)

# summarise over samples or quantiles to get one score per forecast
scores <- score(example_quantile)
summarise_scores(scores)

# get scores by model
summarise_scores(scores, by = "model")

# get scores by model and target type
summarise_scores(scores, by = c("model", "target_type"))

# get scores summarised across horizon, forecast date, and target end date
summarise_scores(
  scores, across = c("horizon", "forecast_date", "target_end_date")
)

# get standard deviation
summarise_scores(scores, by = "model", fun = sd)

# round digits
summarise_scores(scores, by = "model") %>%
  summarise_scores(fun = signif, digits = 2)

# get quantiles of scores
# make sure to aggregate over ranges first
summarise_scores(scores,
  by = "model", fun = quantile,
  probs = c(0.25, 0.5, 0.75)
)

# get ranges
# summarise_scores(scores, by = "range")
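The relative_skill and baseline arguments are not exercised in the examples above. A minimal sketch of how they might be used follows, based only on the signature documented here. It assumes that the example_quantile data shipped with the package contains a 'model' column and that "EuroCOVIDhub-baseline" is one of its models; if that name is not present, substitute a model that is.

```r
library(scoringutils)

scores <- score(example_quantile)

# Relative skill requires a "model" column in the input and is computed
# at the aggregation level given in `by`.
skill <- summarise_scores(scores, by = "model", relative_skill = TRUE)

# Assumption: "EuroCOVIDhub-baseline" is a model name in example_quantile.
# With a baseline, a scaled relative skill is returned in addition.
skill_scaled <- summarise_scores(
  scores,
  by = "model",
  relative_skill = TRUE,
  baseline = "EuroCOVIDhub-baseline"
)
```

See pairwise_comparison() for how the underlying comparison is computed.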
