PocSimMIN: Pocock and Simon's Method in the Two-Arms Case

Description

Allocates patients to one of two treatments using the minimization method proposed by Pocock and Simon (1975) <doi:10.2307/2529712>.

Usage

PocSimMIN(data, weight = NULL, p = 0.85)

Value

It returns an object of class "carandom".

An object of class "carandom" is a list containing the following components:

datanumeric: a logical value indicating whether the data is a numeric data frame.
covariates: a character string giving the name(s) of the included covariates.
strt_num: the number of strata.
cov_num: the number of covariates.
level_num: a vector of level numbers for each covariate.
n: the number of patients.
Cov_Assig: a (cov_num + 1) * n matrix containing covariate profiles for all patients and the corresponding assignments. The $i$th column represents the $i$th patient. The first cov_num rows include patients' covariate profiles, and the last row contains the assignments.
assignments: the randomization sequence.
All strata: a matrix containing all strata involved.
Diff: a one-column matrix containing the final differences at the overall, within-stratum, and within-covariate-margin levels.
method: a character string describing the randomization procedure used.
Data Type: a character string giving the data type, Real or Simulated.
weight: a vector giving the weights imposed on each covariate.
framework: the framework of the used randomization procedure: stratified randomization, or model-based method.
data: the data frame.
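
The Examples on this page index Diff as one overall row, then prod(level_num) stratum rows, then sum(level_num) margin rows. The sketch below splits a Diff column along those boundaries. It assumes that row order, which is taken from the simulated-data examples here rather than from a documented guarantee; split_diff and toy_diff are hypothetical names, not part of carat.

```r
# Hypothetical helper (not part of carat): split a Diff column into its
# three blocks, ASSUMING the row order overall, strata, margins that the
# simulated-data examples on this page rely on.
split_diff <- function(diff, level_num) {
  diff <- as.vector(diff)
  n_strata <- prod(level_num)  # number of possible strata
  n_margin <- sum(level_num)   # number of covariate margins
  stopifnot(length(diff) == 1 + n_strata + n_margin)
  list(
    overall        = diff[1],
    within_stratum = diff[2:(1 + n_strata)],
    within_margin  = diff[(2 + n_strata):(1 + n_strata + n_margin)]
  )
}

# Toy vector standing in for a Diff column with level_num = c(2, 2)
toy_diff <- c(0, 1, -1, 0, 0, 1, -1, 0, 0)
parts <- split_diff(toy_diff, level_num = c(2, 2))
lengths(parts)
```

For real data the number of strata actually observed may differ from prod(level_num), so the length check would need adjusting in that case.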

Arguments

data: a data frame. Each row of the data frame corresponds to the covariate profile of one patient.
weight: a vector of weights for within-covariate-margin imbalances. It is required that at least one element is larger than 0. If weight = NULL (default), the within-covariate-margin imbalances are weighted with an equal proportion, 1/cov_num, for each covariate-margin.
p: the biased coin probability. p should be larger than 1/2 and less than 1. The default is 0.85.
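
The constraints above can be checked before calling the function. The following is a minimal sketch in base R; check_pocsim_args is a hypothetical helper, not part of carat, and the requirement that weight has one entry per column of data is an assumption based on the description of the weight component.

```r
# Hypothetical pre-flight check (not part of carat) for the documented
# constraints on weight and p.
check_pocsim_args <- function(data, weight = NULL, p = 0.85) {
  stopifnot(is.data.frame(data), ncol(data) >= 1)
  cov_num <- ncol(data)
  # Documented default: equal weights 1 / cov_num
  if (is.null(weight)) weight <- rep(1 / cov_num, cov_num)
  stopifnot(
    is.numeric(weight),
    length(weight) == cov_num,  # assumption: one weight per covariate
    any(weight > 0),            # at least one element larger than 0
    is.numeric(p), length(p) == 1,
    p > 0.5, p < 1              # p strictly between 1/2 and 1
  )
  invisible(list(weight = weight, p = p))
}

toy <- data.frame(gender = c("female", "male"),
                  age    = c("0-30", ">50"),
                  jobs   = c("stu.", "others"))
args_ok <- check_pocsim_args(toy)
args_ok$weight
# p = 0.5 violates the constraint, so the check signals an error
bad <- tryCatch(check_pocsim_args(toy, p = 0.5),
                error = function(e) "rejected")
```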

Details

Consider $I$ covariates and $m_i$ levels for the $i$th covariate, $i=1,\ldots,I$. $T_j$ is the assignment of the $j$th patient and $Z_j = (k_1,\dots,k_I)$ indicates the covariate profile of this patient, $j=1,\ldots,n$. For convenience, $(k_1,\dots,k_I)$ and $(i;k_i)$ denote the stratum and margin, respectively. $D_j(\cdot)$ is the difference between the numbers of patients assigned to treatment $1$ and treatment $2$ at the corresponding level after $j$ patients have been assigned. Pocock and Simon's minimization procedure is as follows:

(1) The first patient is assigned to treatment $1$ with probability $1/2$;

(2) Suppose that $j-1$ patients have been assigned ($1<j\le n$) and the $j$th patient falls within $(k_1^*,\dots,k_I^*)$;

(3) If the $j$th patient were assigned to treatment $1$, then the potential within-covariate-margin differences between the two treatments would be $$D_j^{(1)}(i;k_i^*)=D_{j-1}(i;k_i^*)+1$$

for margin $(i;k_i^*)$. Similarly, if the $j$th patient were assigned to treatment $2$, the potential differences would be $D_j^{(2)}(i;k_i^*)=D_{j-1}(i;k_i^*)-1$;

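(4) An imbalance measure is defined by $$Imb_j^{(l)}=\sum_{i=1}^{I}\omega_{m,i}[D_j^{(l)}(i;k_i^*)]^2,\quad l=1,2,$$ where $\omega_{m,i}$ is the weight placed on the within-covariate-margin imbalance of the $i$th covariate (the weight argument);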

(5) Conditional on the assignments of the first ($j-1$) patients as well as the covariate profiles of the first $j$ patients, assign the $j$th patient to treatment $1$ with the probability $$P(T_j=1|Z_j,T_1,\dots,T_{j-1})=q$$ for $Imb_j^{(1)}>Imb_j^{(2)},$ $$P(T_j=1|Z_j,T_1,\dots,T_{j-1})=p$$ for $Imb_j^{(1)}<Imb_j^{(2)}$, and $$P(T_j=1|Z_j,T_1,\dots,T_{j-1})=0.5$$ for $Imb_j^{(1)}=Imb_j^{(2)},$ where $p$ is the biased coin probability and $q=1-p$.
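
As a small worked illustration of steps (3) to (5), with made-up numbers: suppose $I=2$, the weights are $\omega_{m,1}=\omega_{m,2}=0.5$, and the margins of the incoming patient currently have $D_{j-1}(1;k_1^*)=2$ and $D_{j-1}(2;k_2^*)=-1$. Then $$Imb_j^{(1)}=0.5(2+1)^2+0.5(-1+1)^2=4.5,\qquad Imb_j^{(2)}=0.5(2-1)^2+0.5(-1-1)^2=2.5.$$ Because $Imb_j^{(1)}>Imb_j^{(2)}$, assigning treatment $1$ would worsen the weighted marginal imbalance, so the patient receives treatment $1$ with the smaller probability $q$. Assuming the usual biased-coin convention $q=1-p$ and the default $p=0.85$, that probability is $0.15$.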

Details of the procedure can be found in Pocock S J, Simon R (1975).
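
The steps above can be sketched in a few lines of base R. This is an illustrative reimplementation under stated assumptions, not the carat source: pocsim_sketch is a hypothetical name, $q$ is taken to be $1-p$, and only the marginal differences needed by the imbalance measure are tracked.

```r
# Illustrative sketch of steps (1)-(5); NOT the carat implementation.
pocsim_sketch <- function(data, weight = NULL, p = 0.85) {
  n <- nrow(data)
  cov_num <- ncol(data)
  if (is.null(weight)) weight <- rep(1 / cov_num, cov_num)
  # D[[i]] holds the running difference (treatment 1 minus treatment 2)
  # for every level of covariate i, starting at zero
  D <- lapply(data, function(x) {
    lev <- unique(as.character(x))
    setNames(numeric(length(lev)), lev)
  })
  assign <- integer(n)
  for (j in seq_len(n)) {
    # Covariate profile of patient j as character level labels
    prof <- vapply(data[j, , drop = FALSE], as.character, character(1))
    d_prev <- vapply(seq_len(cov_num),
                     function(i) D[[i]][[prof[i]]], numeric(1))
    # Step (4): weighted imbalance under each hypothetical assignment
    imb1 <- sum(weight * (d_prev + 1)^2)
    imb2 <- sum(weight * (d_prev - 1)^2)
    # Step (5): biased coin, assuming q = 1 - p; ties (including the
    # first patient, where all differences are 0) use probability 1/2
    prob1 <- if (imb1 > imb2) 1 - p else if (imb1 < imb2) p else 0.5
    assign[j] <- if (runif(1) < prob1) 1L else 2L
    step <- if (assign[j] == 1L) 1 else -1
    for (i in seq_len(cov_num)) {
      D[[i]][[prof[i]]] <- D[[i]][[prof[i]]] + step
    }
  }
  list(assignments = assign, margin_diff = D)
}

set.seed(1)
toy <- data.frame(
  gender = sample(c("female", "male"), 200, TRUE),
  age    = sample(c("0-30", "30-50", ">50"), 200, TRUE)
)
res <- pocsim_sketch(toy, weight = c(1, 2))
table(res$assignments)
res$margin_diff
```

Because every patient contributes to exactly one level of each covariate, the marginal differences of any single covariate sum to the overall difference between the two arms, which gives a quick sanity check on the output.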

References

Ma W, Ye X, Tu F, Hu F. carat: Covariate-Adaptive Randomization for Clinical Trials. Journal of Statistical Software, 2023, 107(2): 1-47.

Pocock S J, Simon R. Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 1975: 103-115.

Examples


# a simple use
## Real Data
## create a data frame
df <- data.frame("gender" = sample(c("female", "male"), 1000, TRUE, c(1 / 3, 2 / 3)), 
                 "age" = sample(c("0-30", "30-50", ">50"), 1000, TRUE), 
                 "jobs" = sample(c("stu.", "teac.", "others"), 1000, TRUE), 
                 stringsAsFactors = TRUE)
weight <- c(1, 2, 1)
Res <- PocSimMIN(data = df, weight)
## view the output
Res
# \donttest{
## view all patients' profiles and assignments
Res$Cov_Assig
# }

## Simulated Data
cov_num <- 3
level_num <- c(2, 3, 3)
pr <- c(0.4, 0.6, 0.3, 0.3, 0.4, 0.4, 0.3, 0.3)
Res.sim <- PocSimMIN.sim(n = 1000, cov_num, level_num, pr)
## view the output
Res.sim
# \donttest{
## view the details of the differences
Res.sim$Diff
# }

# \donttest{
N <- 5
n <- 1000
cov_num <- 3
level_num <- c(2, 3, 5) 
# pr must satisfy two conditions:
# (1) the length of pr should be sum(level_num);
# (2) the probabilities for each covariate should sum to 1.
pr <- c(0.4, 0.6, 0.3, 0.4, 0.3, rep(0.2, times = 5))
omega <- c(0.2, 0.2, rep(0.6 / cov_num, times = cov_num))
weight <- c(2, rep(1, times = cov_num - 1))

## generate a container to contain Diff
DH <- matrix(NA, ncol = N, nrow = 1 + prod(level_num) + sum(level_num))
DP <- matrix(NA, ncol = N, nrow = 1 + prod(level_num) + sum(level_num))
for(i in 1 : N){
  result <- HuHuCAR.sim(n, cov_num, level_num, pr, omega)
  resultP <- PocSimMIN.sim(n, cov_num, level_num, pr, weight)
  DH[ , i] <- result$Diff; DP[ , i] <- resultP$Diff
}

## do some analysis
library(dplyr)  # provides the %>% pipe used below

## analyze the overall imbalance
Ana_O <- matrix(NA, nrow = 2, ncol = 3)
rownames(Ana_O) <- c("NEW", "PS")
colnames(Ana_O) <- c("mean", "median", "95%quantile")
temp <- DH[1, ] %>% abs
tempP <- DP[1, ] %>% abs
Ana_O[1, ] <- c((temp %>% mean), (temp %>% median),
                (temp %>% quantile(0.95)))
Ana_O[2, ] <- c((tempP %>% mean), (tempP %>% median),
                (tempP %>% quantile(0.95)))

## analyze the within-stratum imbalances
tempW <- DH[2 : (1 + prod(level_num)), ] %>% abs
tempWP <- DP[2 : (1 + prod(level_num)), ] %>% abs
Ana_W <- matrix(NA, nrow = 2, ncol = 3)
rownames(Ana_W) <- c("NEW", "PS")
colnames(Ana_W) <- c("mean", "median", "95%quantile")
Ana_W[1, ] = c((tempW %>% apply(1, mean) %>% mean),
               (tempW %>% apply(1, median) %>% mean),
               (tempW %>% apply(1, mean) %>% quantile(0.95)))
Ana_W[2, ] = c((tempWP %>% apply(1, mean) %>% mean),
               (tempWP %>% apply(1, median) %>% mean),
               (tempWP %>% apply(1, mean) %>% quantile(0.95)))

## analyze the marginal imbalance
tempM <- DH[(1 + prod(level_num) + 1) :
              (1 + prod(level_num) + sum(level_num)), ] %>% abs
tempMP <- DP[(1 + prod(level_num) + 1) :
               (1 + prod(level_num) + sum(level_num)), ] %>% abs
Ana_M <- matrix(NA, nrow = 2, ncol = 3)
rownames(Ana_M) <- c("NEW", "PS")
colnames(Ana_M) <- c("mean", "median", "95%quantile")
Ana_M[1, ] = c((tempM %>% apply(1, mean) %>% mean),
               (tempM %>% apply(1, median) %>% mean),
               (tempM %>% apply(1, mean) %>% quantile(0.95)))
Ana_M[2, ] = c((tempMP %>% apply(1, mean) %>% mean),
               (tempMP %>% apply(1, median) %>% mean),
               (tempMP %>% apply(1, mean) %>% quantile(0.95)))

AnaHP <- list(Ana_O, Ana_M, Ana_W)
names(AnaHP) <- c("Overall", "Marginal", "Within-stratum")

AnaHP
# }
