Extract root mean squared error of a fitted `pk` object
# S3 method for pk
rmse(
obj,
newdata = NULL,
model = NULL,
method = NULL,
exclude = TRUE,
use_scale_conc = FALSE,
rmse_group = NULL,
sub_pLOQ = TRUE,
suppress.messages = NULL,
...
)
A `data.frame` with the calculated RMSE as the final column. There is one row for each combination of model in `obj`'s [stat_model()] element (i.e. each PK model that was fitted to the data), [optimx::optimx()] method (specified in [settings_optimx()]), and `rmse_group` specified.
A `pk` object
Optional: A `data.frame` with new data for which to make predictions and compute RMSEs. If NULL (the default), then RMSEs will be computed for the data in `obj$data`. `newdata` is required to contain at least the following variables: `Time`, `Time.Units`, `Dose`, `Route`, `Media`, `Conc`, `Conc_SD`, `N_Subjects`, `Detect`.
Optional: Specify one or more of the fitted models for which to make predictions and calculate RMSEs. If NULL (the default), RMSEs will be returned for all of the models in `obj$stat_model`.
Optional: Specify one or more of the [optimx::optimx()] methods for which to make predictions and calculate RMSEs. If NULL (the default), RMSEs will be returned for all of the methods in `obj$optimx_settings$method`.
Logical: `TRUE` to compute the RMSE excluding any observations in the data marked for exclusion (if there is a variable `exclude` in the data, an observation is marked for exclusion when `exclude` is `TRUE`). `FALSE` to include all observations, regardless of exclusion status. Default `TRUE`.
Possible values: `FALSE` (default), `TRUE`, or a named list with elements `dose_norm` and `log10_trans`, which themselves should be either `TRUE` or `FALSE`. If `use_scale_conc = FALSE` (the default for this function), then no concentration scaling or transformation will be applied when the RMSE is computed. If `use_scale_conc = TRUE`, then the concentration scaling/transformations in `obj` will be applied to both predicted and observed concentrations when the RMSE is computed (see [calc_rmse()] for details). If `use_scale_conc = list(dose_norm = ..., log10_trans = ...)`, then the specified dose normalization and/or log10-transformation will be applied when the RMSE is computed.
A list of quosures provided in the format `vars(...)` that determines the data groupings for which RMSE is calculated. Default NULL, in which case RMSE is calculated for each data group defined in the object's `data_group` element (use [get_data_group.pk()] to access the object's `data_group`).
`TRUE` (default): Substitute all predictions below the LOQ with the LOQ before computing the RMSE. `FALSE`: do not.
Logical: whether to suppress message printing. If NULL (default), uses the setting in `obj$settings_preprocess$suppress.messages`.
Additional arguments. Not currently used.
Caroline Ring, Gilberto Padilla Mercado
# Formula for RMSE
RMSE is calculated using the following formula, to properly handle summary data:
$$ \sqrt{ \frac{1}{N} \sum_{i=1}^G \left( (n_i - 1) s_i^2 + n_i \bar{y}_i^2 - 2 n_i \bar{y}_i \mu_i + n_i \mu_i^2 \right) } $$
In this formula, there are \(G\) observations, each of which may be for one subject or for multiple subjects.
- \(n_i\) is the number of subjects for observation \(i\).
- \(\bar{y}_i\) is the sample mean concentration for observation \(i\), with no transformations applied.
- \(s_i\) is the sample standard deviation of concentrations for observation \(i\), with no transformations applied.
- \(\mu_i\) is the model-predicted concentration for observation \(i\), with no transformations applied.
\(N\) is the grand total of subjects across observations:
$$N = \sum_{i=1}^G n_i$$
For the non-summary case (\(N\) single-subject observations, with all \(n_i = 1\), \(s_i = 0\), and \(\bar{y}_i = y_i\)), this formula reduces to the familiar RMSE formula
$$\sqrt{\frac{1}{N} \sum_{i=1}^N (y_i - \mu_i)^2}$$
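The summary-data calculation can be checked numerically. The following is a minimal sketch in base R, not the package's implementation: `rmse_summary()` is a hypothetical helper and all concentrations and predictions are made up. The per-observation term is written in the compact form \((n_i - 1) s_i^2 + n_i (\bar{y}_i - \mu_i)^2\), which is the total squared error of the \(n_i\) subjects around \(\mu_i\).

```r
# Hypothetical helper (not part of the package): RMSE from summary-level data.
rmse_summary <- function(ybar, s, n, mu) {
  # Total squared error of the n_i subjects in each observation around mu_i:
  # within-observation sum of squares plus n_i times the squared mean offset.
  sse <- (n - 1) * s^2 + n * (ybar - mu)^2
  sqrt(sum(sse) / sum(n))
}

# Made-up individual-level concentrations for two observations
y1 <- c(1.2, 0.9, 1.5)   # three subjects
y2 <- c(0.4, 0.6)        # two subjects
mu <- c(1.0, 0.55)       # made-up model predictions, one per observation

# Familiar RMSE computed on the individual values
rmse_individual <- sqrt(mean(c((y1 - mu[1])^2, (y2 - mu[2])^2)))

# The same quantity computed from the summary statistics alone
rmse_from_summary <- rmse_summary(
  ybar = c(mean(y1), mean(y2)),
  s    = c(sd(y1), sd(y2)),
  n    = c(length(y1), length(y2)),
  mu   = mu
)

all.equal(rmse_individual, rmse_from_summary)
```

The two values agree up to floating-point error because `sd()` uses the \(n - 1\) denominator, so \((n_i - 1) s_i^2\) recovers the within-observation sum of squares. For a single-subject observation, pass `s = 0` rather than `sd()` of one value, which is `NA`.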
# Left-censored data
If the observed value is censored, and the predicted value is less than the reported LOQ, then the predicted value is (temporarily) set equal to the LOQ, for an effective error of zero.
If the observed value is censored, and the predicted value is greater than the reported LOQ, then the observed value is treated as the reported LOQ (so that the effective error is the difference between the LOQ and the predicted value).
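The two censoring rules can be sketched as follows. This is an illustration only, not the package's code: `apply_censoring()` and its argument names are hypothetical, the values are made up, and it assumes that a censored observation is represented by its reported LOQ in both cases.

```r
# Hypothetical helper (not part of the package): apply the two censoring
# rules before computing errors. `detect` is FALSE for censored observations.
apply_censoring <- function(obs, pred, loq, detect) {
  censored <- !detect
  # Assumption: a censored observation is represented by its reported LOQ
  obs_adj <- ifelse(censored, loq, obs)
  # Censored observation with a prediction below the LOQ: temporarily raise
  # the prediction to the LOQ, so that the effective error is zero
  pred_adj <- ifelse(censored & pred < loq, loq, pred)
  data.frame(obs = obs_adj, pred = pred_adj, error = obs_adj - pred_adj)
}

# One detected observation and two censored ones (made-up values)
censoring_example <- apply_censoring(
  obs    = c(2.0, NA, NA),
  pred   = c(1.5, 0.2, 0.8),
  loq    = c(0.5, 0.5, 0.5),
  detect = c(TRUE, FALSE, FALSE)
)
censoring_example
```

In this example the second row (prediction below the LOQ) contributes an error of zero, while the third row (prediction above the LOQ) contributes the difference between the LOQ and the prediction.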
# Log10 transformation
If `log10_trans` is `TRUE`, then both the observed and predicted values are log10-transformed before calculating the RMSE. In the case where observed values are reported in summary format, each sample mean and sample SD (reported on the natural scale, i.e. the mean and SD of natural-scale individual observations) are used to produce an estimate of the log10-scale sample mean and sample SD (i.e., the mean and SD of log10-transformed individual observations), using [convert_summary_to_log10()].
The formulas are as follows. Again, \(\bar{y}_i\) is the sample mean for group \(i\). \(s_i\) is the sample standard deviation for group \(i\).
$$\textrm{log10-scale sample mean}_i = \log_{10} \left(\frac{\bar{y}_i^2}{\sqrt{\bar{y}_i^2 + s_i^2}} \right)$$
$$\textrm{log10-scale sample SD}_i = \sqrt{\log_{10} \left(1 + \frac{s_i^2}{\bar{y}_i^2} \right)}$$
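As a rough illustration, the two formulas above can be transcribed directly into base R. `summary_to_log10()` is a hypothetical stand-in written for this example; the interface and details of the package's own [convert_summary_to_log10()] may differ, and the input values are made up.

```r
# Hypothetical stand-in (not the package function): direct transcription of
# the two formulas above for natural-scale sample means and sample SDs.
summary_to_log10 <- function(ybar, s) {
  list(
    log10_mean = log10(ybar^2 / sqrt(ybar^2 + s^2)),
    log10_sd   = sqrt(log10(1 + s^2 / ybar^2))
  )
}

# Made-up summary data: the first observation has zero SD, the second does not
log10_example <- summary_to_log10(ybar = c(10, 2.5), s = c(0, 1.0))
log10_example
```

When \(s_i = 0\), the log10-scale mean reduces to \(\log_{10} \bar{y}_i\) and the log10-scale SD is zero. When \(s_i > 0\), the estimated log10-scale mean is smaller than \(\log_{10} \bar{y}_i\).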
[calc_rmse()]
Other fit evaluation metrics: AAFE.pk(), AFE.pk(), AIC.pk(), BIC.pk(), logLik.pk(), rsq.pk()
Other methods for fitted pk objects: AAFE.pk(), AFE.pk(), AIC.pk(), BIC.pk(), coef.pk(), coef_sd.pk(), eval_tkstats.pk(), get_fit.pk(), get_hessian.pk(), get_tkstats.pk(), logLik.pk(), predict.pk(), residuals.pk(), rsq.pk()