Hmisc (version 5.0-1)

estSeqMarkovOrd: estSeqMarkovOrd

Description

Simulate Comparisons For Use in Sequential Markov Longitudinal Clinical Trial Simulations

Usage

estSeqMarkovOrd(
  y,
  times,
  initial,
  absorb = NULL,
  intercepts,
  parameter,
  looks,
  g,
  formula,
  ppo = NULL,
  yprevfactor = TRUE,
  groupContrast = NULL,
  cscov = FALSE,
  timecriterion = NULL,
  coxzph = FALSE,
  sstat = NULL,
  rdsample = NULL,
  maxest = NULL,
  maxvest = NULL,
  nsim = 1,
  progress = FALSE,
  pfile = ""
)

Value

a data frame with number of rows equal to the product of nsim, the length of looks, and the length of parameter, with variables sim, parameter, look, est (log odds ratio for group), and vest (the variance of the latter). If timecriterion is specified, the data frame also contains loghr (Cox log hazard ratio for group), lrchisq (chi-square from Cox test for group), and, if coxzph=TRUE, phchisq, the chi-square for testing proportional hazards. The attribute etimefreq is also present if timecriterion is present, and it provides the frequency distribution of derived event times by group and censoring/event indicator. If sstat is given, the attribute sstat is also present, and it contains an array with dimensions corresponding to simulations, parameter values within simulations, id, and a two-column subarray with columns group and y, the latter being the summary measure computed by the sstat function. The returned data frame also has the attribute lrmcoef, which holds the last-look logistic regression coefficient estimates over the nsim simulations and the parameter settings, and the attribute failures, which is a data frame containing the variables reason and frequency cataloging the reasons for unsuccessful model fits.
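As a hedged illustration of how the returned data frame might be summarized, the sketch below builds a stand-in data frame with the documented columns (sim, parameter, look, est, vest) from random numbers rather than from a real call. It then computes a Wald z-statistic and the fraction of simulations with |z| > 1.96 for each parameter value and look. The object names res and hit, the placeholder values, and the 1.96 cutoff are assumptions made for illustration only.

```r
# Stand-in for the documented return value; the numbers are random
# placeholders, NOT output from estSeqMarkovOrd()
set.seed(1)
res <- expand.grid(sim = 1:200, parameter = c(0, -0.5), look = c(50, 100))
res$vest <- 4 / res$look    # placeholder variance that shrinks with sample size
res$est  <- rnorm(nrow(res), mean = res$parameter, sd = sqrt(res$vest))

# Wald statistic for the group log odds ratio at each look
res$z      <- res$est / sqrt(res$vest)
res$reject <- abs(res$z) > 1.96

# Fraction of simulations exceeding the cutoff, by true parameter and look
hit <- aggregate(reject ~ parameter + look, data = res, FUN = mean)
print(hit)
```

With real output, the same summary would be applied directly to the data frame returned by the function; a sequential design would normally use look-specific cutoffs rather than a fixed 1.96.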

Arguments

y

vector of possible y values in order (numeric, character, factor)

times

vector of measurement times

initial

a vector of probabilities summing to 1.0 that specifies the frequency distribution of initial values to be sampled from. The vector must have names that correspond to values of y representing non-absorbing states.

absorb

vector of absorbing states, a subset of y. The default is no absorbing states. Observations are truncated when an absorbing state is simulated. May be numeric, character, or factor.

intercepts

vector of intercepts in the proportional odds model. There must be one fewer of these than the length of y.

parameter

vector of true parameter (effects; group differences) values. These are group 2:1 log odds ratios in the transition model, conditioning on the previous y.
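To make the roles of intercepts and parameter concrete, here is a minimal base-R sketch of how one row of transition probabilities could be derived from them. It assumes the rms::lrm() convention that the model is stated for P(Y >= y[j]), so that intercepts are decreasing; the helper trans_probs and all numeric values are hypothetical and are not taken from the package.

```r
y          <- 1:4              # ordered states
intercepts <- c(1.5, 0, -2)    # hypothetical; length(y) - 1 values, decreasing
lp_group1  <- 0.3              # hypothetical linear predictor from g() for group 1
parameter  <- -0.5             # hypothetical group 2:1 log odds ratio

# Convert exceedance probabilities into cell probabilities
trans_probs <- function(intercepts, lp) {
  exceed <- plogis(intercepts + lp)   # P(Y >= y[2]), ..., P(Y >= y[k])
  -diff(c(1, exceed, 0))              # P(Y = y[1]), ..., P(Y = y[k])
}

p1 <- trans_probs(intercepts, lp_group1)              # group 1
p2 <- trans_probs(intercepts, lp_group1 + parameter)  # group 2
rbind(group1 = p1, group2 = p2)
```

Under this assumed convention, the log odds of Y >= y[j] differ between the two groups by exactly parameter at every cutoff j, which is what "proportional odds" means here.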

looks

integer vector of ID numbers at which maximum likelihood estimates and their estimated variances are computed. For a single look specify a scalar value for looks equal to the number of subjects in the sample.
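The idea of looking at the data after given subject ID numbers can be sketched in base R as follows. This is only an illustration of the subsetting logic under assumed column names (id, time) and a made-up data layout; it does not reproduce the package's internal code.

```r
# Hypothetical long-format data: 6 subjects with 3 records each
d <- data.frame(id = rep(1:6, each = 3), time = rep(1:3, times = 6))
looks <- c(2, 4, 6)

# At each look, keep every record of subjects whose id is <= the look value
revealed <- lapply(looks, function(n) d[d$id <= n, ])
sapply(revealed, nrow)   # 6 12 18
```

The last element of looks uses the full simulated sample, and earlier looks use nested subsets of it.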

g

a user-specified function of three or more arguments which in order are yprev - the value of y at the previous time, the current time t, the gap between the previous time and the current time, an optional (usually named) covariate vector X, and optional arguments such as a regression coefficient value to simulate from. The function needs to allow yprev to be a vector and yprev must not include any absorbing states. The g function returns the linear predictor for the proportional odds model aside from intercepts. The returned value must be a matrix with row names taken from yprev. If the model is a proportional odds model, the returned value must be one column. If it is a partial proportional odds model, the value must have one column for each distinct value of the response variable Y after the first one, with the levels of Y used as optional column names. So columns correspond to intercepts. The different columns are used for y-specific contributions to the linear predictor (aside from intercepts) for a partial or constrained partial proportional odds model. Parameters for partial proportional odds effects may be included in the ... arguments.
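A minimal sketch of a g function for a proportional odds model (one column) might look like the following. The coefficient values, the name g_sketch, and the assumption that X carries the group code (1 or 2) as its first element are all hypothetical; gap is accepted but unused in this simple version.

```r
g_sketch <- function(yprev, t, gap, X, parameter = 0) {
  grp <- as.numeric(X[1])          # assumed: group code, 1 or 2
  lp  <- 0.8 * (yprev == 2) -      # hypothetical effect of previous state 2
         0.1 * t +                 # hypothetical linear time effect
         parameter * (grp == 2)    # group 2:1 log odds ratio
  # One-column matrix with row names taken from yprev
  matrix(lp, ncol = 1, dimnames = list(as.character(yprev), NULL))
}

out <- g_sketch(yprev = c(1, 2), t = 1, gap = 1,
                X = c(group = 2), parameter = -0.5)
out
```

For a partial proportional odds model, the same function would instead return one column per level of Y after the first, as described above.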

formula

a formula object given to the lrm() function using variables with these names: y, time, yprev, and group (factor variable having values '1' and '2'). The yprev variable is converted to a factor before fitting the model unless yprevfactor=FALSE.

ppo

a formula specifying the part of formula for which proportional odds is not to be assumed, i.e., that specifies a partial proportional odds model. Specifying ppo triggers the use of VGAM::vglm() instead of rms::lrm() and will make the simulations run slower.

yprevfactor

see formula

groupContrast

omit this argument if group has only one regression coefficient in formula. Otherwise if ppo is omitted, provide groupContrast as a list of two lists that are passed to rms::contrast.rms() to compute the contrast of interest and its standard error. The first list corresponds to group 1, the second to group 2, to get a 2:1 contrast. If ppo is given and the group effect is not just a simple regression coefficient, specify as groupContrast a function of a vglm fit that computes the contrast of interest and its standard error and returns a list with elements named Contrast and SE. For the latter type you can optionally have formal arguments n1, n2, and parameter that are passed to groupContrast to compute the standard error of the group contrast, where n1 and n2 respectively are the sample sizes for the two groups and parameter is the true group effect parameter value.

cscov

applies if ppo is not used. Set to TRUE to use the cluster sandwich covariance estimator of the variance of the group comparison.

timecriterion

a function of a time-ordered vector of simulated ordinal responses y that returns a vector of FALSE or TRUE values denoting whether the current y level met the condition of interest. For example, estSeqMarkovOrd will compute the first time at which y >= 5 if you specify timecriterion=function(y) y >= 5. This function is only called at the last data look for each simulated study. To have more control, instead of having timecriterion return a logical vector, have it return a numeric 2-vector containing, in order, the event/censoring time and the 1/0 event/censoring indicator.
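The logical form of timecriterion, and the way a first-event time can be derived from it, can be sketched in base R as below. The responses are made up, and treating a subject who never meets the criterion as censored at the last measurement time is an assumption of this sketch, not a statement about the package's internals.

```r
crit  <- function(y) y >= 5      # logical form, as in the example above
times <- 1:6
y     <- c(2, 3, 3, 5, 4, 6)     # hypothetical responses for one subject

met   <- crit(y)
event <- any(met)
# First time the criterion is met; otherwise censor at the last time (assumed)
etime <- if (event) times[which(met)[1]] else max(times)
c(time = etime, event = as.numeric(event))   # time = 4, event = 1
```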

coxzph

set to TRUE if timecriterion is specified and you want to compute a statistic for testing proportional hazards at the last look of each simulated dataset

sstat

set to a function of the time vector and the corresponding vector of ordinal responses for a single group if you want to compute a Wilcoxon test on a derived quantity such as the number of days in a given state.
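As a rough illustration of the kind of derived quantity sstat is meant for, the sketch below counts the number of measurement times spent in a given state for each subject and compares the two groups with a Wilcoxon test from base R. The function name visits_in_state, its state argument, and the six made-up subjects are hypothetical; how the package itself applies the function is not shown here.

```r
# Hypothetical summary: number of measurement times spent in a given state
visits_in_state <- function(times, y, state = 1) sum(y == state)

times <- 1:4
ys <- list(c(2, 2, 1, 1), c(3, 2, 2, 1), c(2, 2, 2, 2),   # group 1 subjects
           c(2, 1, 1, 1), c(1, 1, 1, 1), c(2, 2, 1, 1))   # group 2 subjects
dat <- data.frame(
  group = factor(c(1, 1, 1, 2, 2, 2)),
  s     = sapply(ys, function(y) visits_in_state(times, y))
)

# Two-sample Wilcoxon test on the derived summary (normal approximation)
w <- wilcox.test(s ~ group, data = dat, exact = FALSE)
w$p.value
```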

rdsample

an optional function to do response-dependent sampling. It is a function of these arguments, which are vectors that stop at any absorbing state: times (ascending measurement times for one subject) and y (vector of ordinal outcomes at these times for one subject). The function returns NULL if no observations are to be dropped; otherwise it returns the vector of new times to sample.
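A hedged sketch of a response-dependent sampling function follows. The rule it encodes (after a subject first reaches state 1, keep only later times that are multiples of 7) is invented for illustration, as is the name rdsample_sketch.

```r
rdsample_sketch <- function(times, y) {
  hit <- which(y == 1)
  if (length(hit) == 0) return(NULL)        # state 1 never reached: drop nothing
  first <- hit[1]
  later <- times[times > times[first]]
  keep  <- later[later %% 7 == 0]           # thin out follow-up after state 1
  new_times <- c(times[seq_len(first)], keep)
  # Return NULL when the rule ends up dropping nothing
  if (length(new_times) == length(times)) NULL else new_times
}

times <- 1:14
y     <- c(3, 3, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1)
rdsample_sketch(times, y)   # 1 2 3 4 7 14
```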

maxest

maximum acceptable absolute value of the contrast estimate, ignored if NULL. Any values exceeding maxest will result in the estimate being set to NA.

maxvest

like maxest but for the estimated variance of the contrast estimate

nsim

number of simulations (default is 1)

progress

set to TRUE to send the current iteration number to pfile every 10 iterations. Each iteration involves multiple simulations if parameter has length greater than 1.

pfile

file to which to write progress information. Defaults to '' which is the console. Ignored if progress=FALSE.

Author

Frank Harrell

Details

Simulates sequential clinical trials of longitudinal ordinal outcomes using a first-order Markov model. Looks are done sequentially after the subject ID numbers given in the vector looks, with the earliest possible look being after subject 2. At each look, a subject's repeated records are either all used or all ignored, depending on the subject's ID number. For each true effect parameter value, for each simulation, and at each look, the function computes the estimate of the parameter of interest along with its variance. For each simulation, data are first simulated for the last look, and these data are sequentially revealed for earlier looks.

The user provides a function g that has extra arguments specifying the true effect parameter for the treatment group, expecting treatments to be coded 1 and 2. parameter is usually on the scale of a regression coefficient, e.g., a log odds ratio. Fitting is done using the rms::lrm() function, unless non-proportional odds is allowed, in which case VGAM::vglm() is used.

If timecriterion is specified, the function also, for the last data look only, computes the first time at which the criterion is satisfied for the subject, or uses the event time and event/censoring indicator computed by timecriterion. The Cox/logrank chi-square statistic for comparing groups on the derived time variable is saved. If coxzph=TRUE, the survival package correlation coefficient rho from the scaled partial residuals is also saved so that the user can later determine to what extent the Markov model resulted in the proportional hazards assumption being violated when analyzing on the time scale.

vglm is accelerated by saving the first successful fit for the largest sample size and using its coefficients as starting values for further vglm fits for any sample size for the same setting of parameter.

See Also

gbayesSeqSim(), simMarkovOrd(), https://hbiostat.org/R/Hmisc/markov/