library(SimDesign)
library(SimNPH)
library(parallel)
cl <- makeCluster(8)
clusterEvalQ(cl, {
  library(SimDesign)
  library(SimNPH)
  library(parallel)
})This is a simple example on how to run a simulation of analyses of datasets with a delayed treatment effect using the max-combo and the log-rank test. The
Setting up the simulations to be run. createDesign
creates a tibble with every combination of the parameters.
Each line corresponds to one simulation to be run.
The function generate_delayed_effect needs the columns:
n_trt: number of patients in the treatment arm,
n_ctrl: number of patients in the control arm,
delay: delay until onset of treatment effect,
hazard_ctrl: hazard under control and before onset of
treatment effect, hazart_trt: hazard under treatment,
t_max: maximum time of observation.
An example design tibble with all parameters filled out
can be created with desing_skeleton_delayed_effect. Use the
function to output an example function call that you can copy and modify
as needed, or assign the result to a variable to obtain a design
tibble with some default parameters.
By default this will create a simulation design skeleton for simulations of 50 patients in each arm, a constant hazard of 0.2 under control and a hazard of 0.02 under treatment after the effect onset varying from 0 to 10.
N_sim <- 100
Assumptions <- assumptions_delayed_effect()
#> expand.grid(
#>   delay=m2d(seq(0, 10, by=2)), # delay of 0, 1, ..., 10 months
#>   hazard_ctrl=m2r(24),         # median survival control of 24 months
#>   hazard_trt=m2r(36),          # median survival treatment of 36 months
#>   random_withdrawal=m2r(120)   # median time to random withdrawal 10 years
#> )
Options <- design_fixed_followup()
#> expand.grid(
#>   n_trt=150,         # 150 patients in the treatment arm
#>   n_ctrl=150,        # 150 patients in the control arm
#>   followup=m2d(24),  # followup 2 years
#>   recruitment=m2d(6) # recruitment time 6 months
#> 
#> )
Design <- merge(Assumptions, Options, by=NULL)
knitr::kable(Design)| delay | hazard_ctrl | hazard_trt | random_withdrawal | n_trt | n_ctrl | followup | recruitment | 
|---|---|---|---|---|---|---|---|
| 0.000 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
| 60.875 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
| 121.750 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
| 182.625 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
| 243.500 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
| 304.375 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 
Define the data generating ‘generate’ function, that beside simulating the time to event also applies the different censoring processes.
Next, we need to specify a summary function that computes the desired operating characteristics for each simulation scenario, and each analysis method. In the example below, we use a summary function that computes the power (as we consider scenarios under the alternative in the example) of the log-rank test and the max-combo test across simulated scenarios. For each scenario the function just averages the number of times the computed p-value is below the significance level obtain the power.
The results object contains the results of all replications of the
corresponding method for each row of the Design object. In
this example results$p contains all N_sim
p-values returned by the analyse_maxcombo or
analyse_logrank functions respectively. The Summary will
contain columns with the rejection rate and some other summary
statistics for both methods.
Now we put it all together: in Design we give the scenarios defined
before. We want to run 100 replications for each scenario. We want to
generate data using the generate_delayed_effect function
using the parameters from Design and analyse each
replication of each scenario with the two functions
analyse_logrank and analyse_maxcombo. The
output should be summarised with the Summarise function
defined before and the simulations should be run in parallel.
res <- runSimulation(
  Design,
  replications = N_sim,
  generate = my_generator,
  analyse = list(
    logrank  = analyse_logrank(),
    maxcombo = analyse_maxcombo()
  ),
  summarise = Summarise,
  cl = cl,
  save=FALSE
)
#> 
#> Number of parallel clusters in use: 8
#> 
#> Simulation complete. Total execution time: 14.28sFinally we select the interesting columns from the output. Since all other parameters are the same for each scenario we just select delay. And we are interested in the rejection rate of the tests.
res |> 
  subset(select=c("delay", "maxcombo.rejection_0.05", "logrank.rejection_0.05")) |>
  knitr::kable()| delay | maxcombo.rejection_0.05 | logrank.rejection_0.05 | 
|---|---|---|
| 0.000 | 0.47 | 0.50 | 
| 60.875 | 0.45 | 0.46 | 
| 121.750 | 0.37 | 0.32 | 
| 182.625 | 0.27 | 0.23 | 
| 243.500 | 0.20 | 0.14 | 
| 304.375 | 0.16 | 0.12 | 
In this scenario we extend the scenario from above to include a fixed followup as well as an interim analysis after a fixed number of events. For this we will define additional analyse functions.
First we extend the Design to include a column with the number of events after which an interim analysis should be done.
Options <- design_group_sequential()
#> expand.grid(
#>   n_trt=200,          # 200 patients in the treatment arm
#>   n_ctrl=200,         # 200 patients in the control arm
#>   followup=m2d(48),   # maximum followup time 4 years
#>   recruitment=m2d(6), # recruitment time 6 months
#>   interim_events=150, # interim analysis after 150 events
#>   final_events=300    # final analysis after 300 events
#> )
Design <- merge(Assumptions, Options, by=NULL)
knitr::kable(Design)| delay | hazard_ctrl | hazard_trt | random_withdrawal | n_trt | n_ctrl | followup | recruitment | interim_events | final_events | 
|---|---|---|---|---|---|---|---|---|---|
| 0.000 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
| 60.875 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
| 121.750 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
| 182.625 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
| 243.500 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
| 304.375 | 0.0009489 | 0.0006326 | 0.0001898 | 200 | 200 | 1461 | 182.625 | 150 | 300 | 
The analyse_group_sequential function allows to combine
two or more analyse functions to create an analysis function
corresponding to a group sequential design. The arguments are the times
or events after which the analyses are done, the nominal alpha at each
stage and the analyse functions to be used at each stage.
## O'Brien-Fleming Bounds for GSD with interim analysis at information time 1/2
nominal_alpha <- ldbounds::ldBounds(c(0.5,1))$nom.alpha
clusterExport(cl, "nominal_alpha")
Analyse <-  list(
  logrank_seq  = analyse_group_sequential(
    followup = c(condition$interim_events, condition$final_events),
    followup_type = c("event", "event"),
    alpha = nominal_alpha,
    analyse_functions = analyse_logrank()
  ),
  maxcombo_seq = analyse_group_sequential(
    followup = c(condition$interim_events, condition$final_events),
    followup_type = c("event", "event"),
    alpha = nominal_alpha,
    analyse_functions = analyse_maxcombo()
  )
)The output of the function created with
analyse_group_sequential contains additional columns.
rejected_at_stage includes the stage at which the null was
first rejected or Inf if the null was not rejected,
N_pat and N_evt contain the number of patients
recruited and the number of events observed before the null was rejected
and followup contains the time after study start at which
the last analysis was done.
The results object also includes the results returned by each stage
in results_stages, but here we only use the overall
test-decision.
The call to runSimulation looks almost the same as above
but now the additional columns we defined in our Summarise
functions are included in the result.
res <- runSimulation(
  Design,
  replications = N_sim,
  generate = my_generator,
  analyse = Analyse,
  summarise = Summarise,
  cl = cl,
  save=FALSE
)
#> 
#> Number of parallel clusters in use: 8
#> 
#> Simulation complete. Total execution time: 30.07sIn the case of a group sequential design we are also interested in the average running time of the study in terms of patients recruited, number of events and running time of the study.
res |>
  subset(select=c(
    "delay", 
    "maxcombo_seq.rejection", "logrank_seq.rejection",
    "maxcombo_seq.n_pat", "logrank_seq.n_pat",
    "maxcombo_seq.n_evt", "logrank_seq.n_evt",
    "maxcombo_seq.followup", "logrank_seq.followup"
    )) |>
  knitr::kable()| delay | maxcombo_seq.rejection | logrank_seq.rejection | maxcombo_seq.n_pat | logrank_seq.n_pat | maxcombo_seq.n_evt | logrank_seq.n_evt | maxcombo_seq.followup | logrank_seq.followup | 
|---|---|---|---|---|---|---|---|---|
| 0.000 | 0.82 | 0.84 | 400 | 400 | 213.73 | 212.11 | 1271.03 | 1256.83 | 
| 60.875 | 0.77 | 0.76 | 400 | 400 | 215.33 | 218.62 | 1280.30 | 1308.13 | 
| 121.750 | 0.73 | 0.70 | 400 | 400 | 228.36 | 231.12 | 1362.40 | 1385.58 | 
| 182.625 | 0.71 | 0.65 | 400 | 400 | 223.63 | 228.97 | 1320.08 | 1368.20 | 
| 243.500 | 0.55 | 0.49 | 400 | 400 | 238.05 | 239.26 | 1426.16 | 1436.63 | 
| 304.375 | 0.58 | 0.50 | 400 | 400 | 237.93 | 241.80 | 1411.85 | 1442.90 | 
To evaluate the performance of an estimator, we first compute the values of some true summary statistics to which the estimates will be compared. The most relevant true summary statistics can be computed by a convenience function for each scenario. Just pipe the Design data.frame to the function and the values of the statistics are added as columns.
Options <- design_fixed_followup()
#> expand.grid(
#>   n_trt=150,         # 150 patients in the treatment arm
#>   n_ctrl=150,        # 150 patients in the control arm
#>   followup=m2d(24),  # followup 2 years
#>   recruitment=m2d(6) # recruitment time 6 months
#> 
#> )
Design <- merge(Assumptions, Options, by=NULL)
Design <- Design |> 
  true_summary_statistics_delayed_effect(cutoff_stats = 20)
knitr::kable(Design)| delay | hazard_ctrl | hazard_trt | random_withdrawal | n_trt | n_ctrl | followup | recruitment | median_survival_trt | median_survival_ctrl | rmst_trt_20 | rmst_ctrl_20 | gAHR_20 | AHR_20 | AHRoc_20 | AHRoc_robust_20 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.000 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 1095.7500 | 730.5 | 19.87402 | 19.81142 | 0.6666667 | 0.6666667 | 0.6666667 | 0.6666667 | 
| 60.875 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 1065.3125 | 730.5 | 19.81142 | 19.81142 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 
| 121.750 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 1034.8750 | 730.5 | 19.81142 | 19.81142 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 
| 182.625 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 1004.4375 | 730.5 | 19.81142 | 19.81142 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 
| 243.500 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 974.0000 | 730.5 | 19.81142 | 19.81142 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 
| 304.375 | 0.0009489 | 0.0006326 | 0.0001898 | 150 | 150 | 730.5 | 182.625 | 943.5625 | 730.5 | 19.81142 | 19.81142 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 
Summarise functionIn the Summarise function the true value against which the estimator should be compared has to be specified. If coverage and average width of the confidence intervals should be estimated, the CI bounds should also be specified.
The arguments to the functions are left un-evaluated and are later
evaluated in the results and condition
datasets respectively. So any expressions using variables from results
can be used for the estimated value and the CI bounds and expressions
using variables from condition can be used in the argument for the real
value.
In this case we want to compare the hazard ratio estimated by the Cox model to the geometric average hazard ratio as well as to the hazard ratio after onset of treatment, calculated from the two respective columns of the Design data.frame.
Note that one name can be used twice to summarise the output of one analysis method two times, like in this case, comparing it to two different summary statistics.
Analyse <- list(
  coxph = analyse_coxph()
)
res <- runSimulation(
  Design,
  replications = N_sim,
  generate = my_generator,
  analyse = Analyse,
  summarise = Summarise,
  cl = cl,
  save=FALSE
)
#> 
#> Number of parallel clusters in use: 8
#> 
#> Simulation complete. Total execution time: 0.93sres |>
  subset(select=c(
    "delay", "coxph.HR.bias", "coxph.gAHR.bias", "coxph.HR.mse", 
    "coxph.gAHR.mse", "coxph.HR.coverage", "coxph.gAHR.coverage"
  )) |> 
  knitr::kable()| delay | coxph.HR.bias | coxph.gAHR.bias | coxph.HR.mse | coxph.gAHR.mse | coxph.HR.coverage | coxph.gAHR.coverage | 
|---|---|---|---|---|---|---|
| 0.000 | 0.0307004 | 0.0307004 | 0.0171681 | 0.0171681 | 0.97 | 0.97 | 
| 60.875 | 0.0437173 | -0.2896160 | 0.0213150 | 0.1032812 | 0.94 | 0.55 | 
| 121.750 | 0.0715696 | -0.2617637 | 0.0222560 | 0.0856541 | 0.91 | 0.63 | 
| 182.625 | 0.1197364 | -0.2135969 | 0.0353176 | 0.0666045 | 0.87 | 0.72 | 
| 243.500 | 0.1591968 | -0.1741365 | 0.0495382 | 0.0545181 | 0.81 | 0.78 | 
| 304.375 | 0.1861441 | -0.1471892 | 0.0633401 | 0.0503551 | 0.73 | 0.84 |