Evaluate forecasts in a Quantile-Based Format
score_quantile(
data,
forecast_unit,
metrics,
weigh = TRUE,
count_median_twice = FALSE,
separate_results = TRUE
)
A data.table with appropriate scores. For more information see score().
data: A data.frame or data.table with the predictions and observations.
For scoring using score(), the following columns need to be present:
true_value - the true observed values
prediction - predictions or predictive samples for one true value. (The
prediction column can be omitted only if you want to score quantile
forecasts in a wide range format.)
For scoring integer and continuous forecasts, a sample column is needed:
sample - an index to identify the predictive samples in the prediction
column generated by one model for one true value. Only necessary for
continuous and integer forecasts, not for binary predictions.
For scoring predictions in a quantile format, you should provide a column
called quantile:
quantile - the quantile to which the prediction corresponds
In addition, a model column is suggested. If it is not present, this will be
flagged and a model column will be added to the input data, with all
forecasts assigned to an "unspecified model".
You can check the format of your data using check_forecasts(). There are
examples for each format (example_quantile, example_continuous,
example_integer, and example_binary).
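As a rough illustration of the quantile format described above, the following base R sketch builds a small data.frame by hand and checks that the named columns are present. The location column, the model name "model_a", and all values are made up for illustration; the check is a plain column-name comparison, not a substitute for check_forecasts().

```r
# Hypothetical quantile-format data: one forecast (location "A", model
# "model_a") made up of three predictive quantiles for one observed value.
forecasts <- data.frame(
  location   = "A",                   # illustrative forecast-unit column
  model      = "model_a",             # illustrative model name
  true_value = 10,                    # observed value, repeated on each row
  quantile   = c(0.05, 0.5, 0.95),    # quantile level of each prediction
  prediction = c(6, 9, 14)            # predicted value at each quantile level
)

# Columns named in the text above for quantile-format forecasts
required <- c("true_value", "prediction", "quantile")
missing_cols <- setdiff(required, names(forecasts))
```

An empty missing_cols means all three required columns are present.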
forecast_unit: A character vector with the column names that define the unit
of a single forecast, i.e. each unique combination of the values in these
columns identifies one forecast.
metrics: The metrics you want to have in the output. If NULL (the default),
all available metrics will be computed. For a list of available metrics see
available_metrics(), or check the metrics data set.
weigh: If TRUE, weigh the score by alpha / 2, so it can be averaged into an
interval score that, in the limit, corresponds to CRPS. Alpha is the decimal
value that represents how much is outside a central prediction interval
(e.g. for a 90 percent central prediction interval, alpha is 0.1).
Default: TRUE.
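The effect of the alpha / 2 weighting can be sketched in base R. This assumes the usual interval score for a central (1 - alpha) interval [lower, upper]: the interval width plus a penalty of 2 / alpha times the distance by which the observation falls outside the interval. The helper interval_score_sketch() is a hypothetical name used only here, not a package function.

```r
# Illustrative helper (not part of the package): interval score for a
# central (1 - alpha) prediction interval [lower, upper].
interval_score_sketch <- function(true_value, lower, upper, alpha,
                                  weigh = TRUE) {
  dispersion      <- upper - lower
  underprediction <- 2 / alpha * pmax(0, true_value - upper)
  overprediction  <- 2 / alpha * pmax(0, lower - true_value)
  score <- dispersion + underprediction + overprediction
  if (weigh) {
    score <- score * alpha / 2  # weighting described in the text above
  }
  score
}

# 90% central interval [6, 14], so alpha = 0.1; the observation 16 lies above it
unweighted <- interval_score_sketch(16, lower = 6, upper = 14, alpha = 0.1,
                                    weigh = FALSE)
weighted   <- interval_score_sketch(16, lower = 6, upper = 14, alpha = 0.1,
                                    weigh = TRUE)
```

Under these assumptions the unweighted score is 8 + 20 * 2 = 48, and the weighted score is 48 * 0.05 = 2.4.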
count_median_twice: Logical that controls whether or not to count the median
twice when summarising (default is FALSE). Counting the median twice would
conceptually treat it as a 0% prediction interval, where the median is the
lower as well as the upper bound. The alternative is to treat the median as
a single quantile forecast instead of an interval. The interval score would
then be better understood as an average of quantile scores.
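The difference between the two conventions can be illustrated with a small averaging example in base R. This is only a conceptual sketch with made-up scores, assuming each central interval contributes two quantiles (its lower and upper bound) to the average; it does not reproduce the package's internal code.

```r
# Made-up weighted scores: the median and two central prediction intervals
median_score    <- 1.0
interval_scores <- c("50" = 2.0, "90" = 3.0)

# Each central interval corresponds to two quantiles (lower and upper bound),
# so its score appears twice when averaging over quantiles.
per_quantile <- rep(interval_scores, each = 2)

# Median treated as a single quantile forecast
mean_median_once <- mean(c(median_score, per_quantile))

# Median treated as a 0% interval whose lower and upper bound coincide
mean_median_twice <- mean(c(median_score, median_score, per_quantile))
```

With the median counted twice, the result equals the plain average over the three intervals (0%, 50%, 90%); with the median counted once, it carries half the weight of any other interval.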
separate_results: If TRUE (the default shown in the usage above), the
separate parts of the interval score (dispersion penalty, penalties for
over- and under-prediction) are returned as separate elements of a list. If
you want a data.frame instead, simply call as.data.frame() on the output.
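To make the separate parts concrete, here is a minimal base R sketch of the weighted interval score split into its three components and then converted with as.data.frame(). The helper name interval_score_parts() and the list element names are illustrative assumptions, not the package's API.

```r
# Illustrative helper (not part of the package): components of the interval
# score for a central (1 - alpha) interval, already weighted by alpha / 2.
interval_score_parts <- function(true_value, lower, upper, alpha) {
  list(
    dispersion      = (upper - lower) * alpha / 2,
    underprediction = pmax(0, true_value - upper),  # observation above interval
    overprediction  = pmax(0, lower - true_value)   # observation below interval
  )
}

parts <- interval_score_parts(true_value = 16, lower = 6, upper = 14,
                              alpha = 0.1)

# The overall score is the sum of the three parts
total <- parts$dispersion + parts$underprediction + parts$overprediction

# A list of equal-length vectors converts directly to a data.frame
parts_df <- as.data.frame(parts)
```

Here the observation lies above the interval, so only the dispersion and under-prediction parts are non-zero.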
Nikos Bosse nikosbosse@gmail.com
Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, Edmunds WJ (2019) Assessing the performance of real-time epidemic forecasts: A case study of Ebola in the Western Area region of Sierra Leone, 2014-15. PLoS Comput Biol 15(2): e1006785. doi:10.1371/journal.pcbi.1006785