nma.predrank: Predictive P-Score for Treatment Ranking in Bayesian Network Meta-Analysis

Description

Calculates the P-score and predictive P-score for a network meta-analysis in the Bayesian framework described in Rosenberger et al. (2021).

Usage

nma.predrank(sid, tid, r, n, data, n.adapt = 1000, n.chains = 3, n.burnin = 2000,
             n.iter = 5000, n.thin = 2, lowerbetter = TRUE, pred = TRUE,
             pred.samples = FALSE, trace = FALSE)

Value

This function estimates the P-score for all treatments in a Bayesian NMA, estimates the predictive P-score (if pred = TRUE), gives the posterior samples of expected scaled ranks in a new study (if pred.samples = TRUE), and outputs all MCMC posterior samples (if trace = TRUE).

Arguments

sid: a vector specifying the study IDs, from 1 to the number of studies.
tid: a vector specifying the treatment IDs, from 1 to the number of treatments.
r: a numeric vector specifying the event counts.
n: a numeric vector specifying the sample sizes.
data: an optional data frame containing the network meta-analysis dataset. If data is specified, the previous arguments, sid, tid, r, and n, should be specified as their corresponding column names in data.
n.adapt: the number of iterations for adaptation in the Markov chain Monte Carlo (MCMC) algorithm. The default is 1,000. This argument and the following n.chains, n.burnin, n.iter, and n.thin are passed to the functions in the package rjags.
n.chains: the number of MCMC chains. The default is 3.
n.burnin: the number of iterations for burn-in period. The default is 2,000.
n.iter: the total number of iterations in each MCMC chain after the burn-in period. The default is 5,000.
n.thin: a positive integer specifying thinning rate. The default is 2.
lowerbetter: A logical value indicating whether lower effect measures indicate better treaetments. If lowerbetter = TRUE (the default), then lower effect measures indicate better treatments.
pred: a logical value indicating whether the treatment ranking measures in a new study are to be derived. These measures are only derived when pred = TRUE.
pred.samples: a logical value indicating whether the posterior samples of expected scaled ranks in a new study are to be saved.
trace: a logical value indicating whether all posterior samples are to be saved.

Author

Kristine J. Rosenberger, Lifeng Lin

Details

Under the frequentist setting, the P-score is built on the quantiles $$P_{kh} = \Phi((\hat{d}_{1k} - \hat{d}_{1h})/s_{kh}),$$ where $\hat{d}_{1k}$ and $\hat{d}_{1h}$ are the point estimates of treatment effects for $k$ vs. 1 and $h$ vs. 1, respectively, and $s_{kh}$ is the standard error of $\hat{d}_{1k} - \hat{d}_{1h}$ (Rucker and Schwarzer, 2015). Moreover, $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. The quantity $P_{kh}$ can be interpreted as the extent of certainty that treatment $k$ is better than $h$. The frequentist P-score of treatment $k$ is $\frac{1}{K-1} \sum_{h \neq k} P_{kh}$.

Analogously to the frequentist P-score, conditional on $d_{1h}$ and $d_{1k}$, the quantities $P_{hk}$ from the Bayesian perspective can be considered as $I(d_{1k} > d_{1h})$, which are Bernoulli random variables. To quantify the overall performance of treatment $k$, we may similarly use $$\bar{P}_k = \frac{1}{K-1} \sum_{h \neq k} I(d_{1k} > d_{1h}).$$ Note that $\bar{P}_k$ is a parameter under the Bayesian framework, while the frequentist P-score is a statistic. Moreover, $\sum_{h \neq k} I(d_{1k} > d_{1h})$ is equivalent to $K - R_k$, where $R_k$ is the true rank of treatment $k$. Thus, we may also write $\bar{P}_k = (K - R_k)/(K - 1)$; this corresponds to the findings by Rucker and Schwarzer (2015). Consequently, we call $\bar{P}_k$ the scaled rank in the network meta-analysis (NMA) for treatment $k$. It transforms the range of the original rank between 1 and $K$ to a range between 0 and 1. In addition, note that $E[I(d_{1k} > d_{1h} | Data)] = \Pr(d_{1k} > d_{1h} | Data)$, which is analogous to the quantity of $P_{kh}$ under the frequentist framework. Therefore, we use the posterior mean of the scaled rank $\bar{P}_k$ as the Bayesian P-score; it is a counterpart of the frequentist P-score.

The scaled ranks $\bar{P}_k$ can be feasibly estimated via the MCMC algorithm. Let $\{d_{1k}^{(j)}; k = 2, \ldots, K\}_{j=1}^J$ be the posterior samples of the overall relative effects $d_{1k}$ of all treatments vs. the reference treatment 1 in a total of $J$ MCMC iterations after the burn-in period, where $j$ indexes the iterations. As $d_{11}$ is trivially 0, we set $d_{11}^{(j)}$ to 0 for all $j$. The $j$th posterior sample of treatment $k$'s scaled rank is $\bar{P}_k^{(j)} = \frac{1}{K-1} \sum_{h \neq k}I(d_{1k}^{(j)} > d_{1h}^{(j)})$. We can make inferences for the scaled ranks from the posterior samples $\{\bar{P}_k^{(j)}\}_{j=1}^{J}$, and use their posterior means as the Bayesian P-scores. We may also obtain the posterior medians as another set of point estimates, and the 2.5% and 97.5% posterior quantiles as the lower and upper bounds of 95% credible intervals (CrIs), respectively. Because the posterior samples of the scaled ranks take discrete values, the posterior medians and the CrI bounds are also discrete.

Based on the idea of the Bayesian P-score, we can similarly define the predictive P-score for a future study by accounting for the heterogeneity between the existing studies in the NMA and the new study. Specifically, we consider the probabilities in the new study $$P_{new,kh} = \Pr(\delta_{new,1k} > \delta_{new,1h}),$$ conditional on the population parameters $d_{1h}$, $d_{1k}$, and $\tau$ from the NMA. Here, $\delta_{new,1k}$ and $\delta_{new,1h}$ represent the treatment effects of $k$ vs. 1 and $h$ vs. 1 in the new study, respectively. The $P_{new,kh}$ corresponds to the quantity $P_{kh}$ in the NMA; it represents the probability of treatment $k$ being better than $h$ in the new study. Due to heterogeneity, $\delta_{new,1k} \sim N(d_{1k}, \tau^2)$ and $\delta_{new,1h} \sim N(d_{1h}, \tau^{2})$. The correlation coefficients between treatment comparisons are typically assumed to be 0.5; therefore, such probabilities in the new study can be explicitly calculated as $P_{new,kh} = \Phi((d_{1k} - d_{1h})/\tau)$, which is a function of $d_{1h}$, $d_{1k}$, and $\tau$. Finally, we use $$\bar{P}_{new,k} = \frac{1}{K-1} \sum_{h \neq k} P_{new,kh}$$ to quantify the performance of treatment $k$ in the new study. The posterior samples of $\bar{P}_{new,k}$ can be derived from the posterior samples of $d_{1k}$, $d_{1h}$, and $\tau$ during the MCMC algorithm.

Note that the probabilities $P_{new,kh}$ can be written as $E[I(\delta_{new,1k} > \delta_{new,1h})]$. Based on similar observations for the scaled ranks in the NMA, the $\bar{P}_{new,k}$ in the new study subsequently becomes $$\bar{P}_{new,k} = \frac{1}{K-1} E\left[\sum_{h \neq k} I(\delta_{new,1k} > \delta_{new,1h})\right] = E\left[\frac{K - R_{new,k}}{K - 1}\right],$$ where $R_{new,k}$ is the true rank of treatment $k$ in the new study. Thus, we call $\bar{P}_{new,k}$ the expected scaled rank in the new study. Like the Bayesian P-score, we define the predictive P-score as the posterior mean of $\bar{P}_{new,k}$. The posterior medians and 95% CrIs can also be obtained using the MCMC samples of $\bar{P}_{new,k}$. See more details in Rosenberger et al. (2021).

References

Rosenberger KJ, Duan R, Chen Y, Lin L (2021). "Predictive P-score for treatment ranking in Bayesian network meta-analysis." BMC Medical Research Methodology, 21, 213. <tools:::Rd_expr_doi("10.1186/s12874-021-01397-5")>

Rucker G, Schwarzer G (2015). "Ranking treatments in frequentist network meta-analysis works without resampling methods." BMC Medical Research Methodology, 15, 58. <tools:::Rd_expr_doi("10.1186/s12874-015-0060-8")>

Examples

Run this code

# \donttest{
## increase n.burnin (e.g., to 50000) and n.iter (e.g., to 200000)
## for better convergence of MCMC
data("dat.sc")
set.seed(1234)
out1 <- nma.predrank(sid, tid, r, n, data = dat.sc, n.burnin = 500, n.iter = 2000,
  lowerbetter = FALSE, pred.samples = TRUE)
out1$P.score
out1$P.score.pred

cols <- c("red4", "plum4", "paleturquoise4", "palegreen4")
cols.hist <- adjustcolor(cols, alpha.f = 0.4)
trtnames <- c("1) No contact", "2) Self-help", "3) Individual counseling",
  "4) Group counseling")
brks <- seq(0, 1, 0.01)
hist(out1$P.pred[[1]], breaks = brks, freq = FALSE,
  xlim = c(0, 1), ylim = c(0, 5), col = cols.hist[1], border = cols[1],
  xlab = "Expected scaled rank in a new study", ylab = "Density", main = "")
hist(out1$P.pred[[2]], breaks = brks, freq = FALSE,
  col = cols.hist[2], border = cols[2], add = TRUE)
hist(out1$P.pred[[3]], breaks = brks, freq = FALSE,
  col = cols.hist[3], border = cols[3], add = TRUE)
hist(out1$P.pred[[4]], breaks = brks, freq = FALSE,
  col = cols.hist[4], border = cols[4], add = TRUE)
legend("topright", fill = cols.hist, border = cols, legend = trtnames)


data("dat.xu")
set.seed(1234)
out2 <- nma.predrank(sid, tid, r, n, data = dat.xu, n.burnin = 500, n.iter = 2000)
out2
# }

Run the code above in your browser using DataLab