Abstract
In randomized clinical trials, the log-rank test and the Cox proportional hazards model are the gold standard for survival data analysis. While the log-rank test remains generally valid in the presence of non-proportional hazards, its power can be substantially lower than under the proportional hazards assumption on which studies are usually designed. In contrast, weighted log-rank tests can be more powerful for specific treatment differences under non-proportional hazards scenarios. However, a poor choice of the weighting form can be detrimental. Recent work on combining various weighted log-rank tests allows for tests that are capable of detecting treatment effects across a broad range of non-proportional hazards scenarios. In this paper, we expand on these ideas with a framework based on a flexible resampling approach [5], which allows for the combination of various testing procedures in addition to weighted log-rank tests. In particular, we describe how tests based on restricted mean survival time (RMST) comparisons can be included within combinations of weighted log-rank tests, as well as other test statistics such as the Tarone-Ware and Rényi-type supremum families. For estimation, we propose companion weighted Cox model estimators [14, 21, 22], which utilize the weighting form that is “selected” through the combination test and provide simultaneous confidence intervals. The performance of various combinations and their companion Cox estimators, as well as RMST, is evaluated in simulation studies under null, proportional hazards, late-separation, and early-separation scenarios. We find that the combination tests control type-1 error rates well and achieve higher power than individual tests across the scenarios considered here. We suggest that the companion Cox estimators are a natural link to the testing procedures and, with careful interpretation, can be a useful complementary summary of treatment effects.
For illustration, we apply the proposals to a randomized clinical trial of the PD-L1-targeted therapy atezolizumab in comparison with docetaxel in previously treated non-small-cell lung cancer patients. R code can be found at https://github.com/larryleon/combination-tests-and-estimators.
References
Aalen OO, Cook RJ, Røysland K (2015) Does Cox analysis of a randomized survival study yield a causal treatment effect? Lifetime Data Anal 21(4):579–593
Akacha M, Bretz F, Ohlssen D, Rosenkranz G, Schmidli H (2017) Estimands and their role in clinical trials. Stat Biopharm Res 9(3):268–271
Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10(4):1100–1120
Chi Y, Tsai MH (2001) Some versatile tests based on the simultaneous use of weighted log-rank and weighted Kaplan-Meier statistics. Commun Stat 30(4):743–759
Dobler D, Beyersmann J, Pauly M (2017) Non-strange weird resampling for complex survival data. Biometrika 104(3):699–711
Fleming T, Harrington D (1991) Counting Processes and Survival Analysis. Wiley Series in Probability and Statistics. Wiley, Hoboken
Gill R (1980) Censoring and stochastic integrals. Stat Neerlandica 34(2):124–124
Gill R (1983) Large sample behaviour of the product-limit estimator on the whole line. Ann Stat 11(1):49–58
Goldwasser MA, Tian L, Wei LJ (2004) Statistical inference for infinite-dimensional parameters via asymptotically pivotal estimating functions. Biometrika 91:81–94
Hernán MA (2010) The hazards of hazard ratios. Epidemiology 21(1):13–15
Huang B, Kuan PF (2018) Comparison of the restricted mean survival time with the hazard ratio in superiority trials with a time-to-event end point. Pharm Stat 17(3):202–213
Kosorok MR, Lin CY (1999) The versatility of function-indexed weighted log-rank statistics. J Am Stat Assoc 94(445):320–332
Lee SH (2007) On the versatility of the combination of the weighted log-rank statistics. Comput Stat Data Anal 51(12):6557–6564
Lin DY (1991) Goodness-of-fit analysis for the Cox regression model based on a class of parameter estimators. J Am Stat Assoc 86(415):725–728
Lin DY (1997) Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med 16(8):901–910
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572
Pepe MS, Fleming TR (1989) Weighted Kaplan-Meier statistics: a class of distance tests for censored survival data. Biometrics 45(2):497–507
Phillips A, Abellan-Andres J, Soren A, Bretz F, Fletcher C, France L, Garrett A, Harris R, Kjaer M, Keene O, Morgan D, O’Kelly M, Roger J (2016) Estimands: discussion points from the PSI estimands and sensitivity expert group. Pharm Stat 16(1):6–11
Rittmeyer A, Barlesi F, Waterkamp D, Park K, Ciardiello F, von Pawel J, Gadgeel SM, Hida T, Kowalski DM, Dols MC, Cortinovis DL, Leach J, Polikoff J, Barrios C, Kabbinavar F, Frontera OA, De Marinis F, Turna H, Lee JS, Ballinger M, Kowanetz M, He P, Chen D, Sandler A, Gandara DR (2017) Atezolizumab versus docetaxel in patients with previously treated non-small-cell lung cancer (OAK): a phase 3, open-label, multicentre randomised controlled trial. Lancet 389(10066):255–265
Royston P, Parmar MK (2016) Augmenting the logrank test in the design of clinical trials in which non-proportional hazards of the treatment effect may be anticipated. BMC Med Res Methodol 16(1):16
Sasieni P (1993a) Maximum weighted partial likelihood estimators for the Cox model. J Am Stat Assoc 88(421):144–152
Sasieni P (1993b) Some new estimators for Cox regression. Ann Stat 21(4):1721–1759. https://doi.org/10.1214/aos/1176349395
Schemper M, Wakounig S, Heinze G (2009) The estimation of average hazard ratios by weighted Cox regression. Stat Med 28(19):2473–2489
Stensrud MJ, Røysland K, Ryalen PC (2019) On null hypotheses in survival analysis. Biometrics 75(4):1276–1287
Struthers CA, Kalbfleisch JD (1986) Misspecified proportional hazard models. Biometrika 73(2):363–369
Tarone RE, Ware J (1977) On distribution-free tests for equality of survival distributions. Biometrika 64(1):156–160
Tian L, Liu J, Zhao M, Wei LJ (2004) Statistical inferences based on non-smooth estimating functions. Biometrika 91:943–954
Wang Y, Wu H, Anderson K (2018) NPHSIM: simulation and power calculations for time-to-event clinical trials. https://www.github.com/keaven/nphsim/
Xu R, O’Quigley J (2000) Estimating average regression effect under non-proportional hazards. Biostatistics 1(4):423–439
Yoshida M, Matsuyama Y (2016) Interim analysis based on the weighted log-rank test for delayed treatment effects under staggered patient entry. J Biopharm Stat 26(5):842–858
Zhao L, Claggett B, Tian L, Uno H, Pfeffer MA, Solomon SD, Trippa L, Wei LJ (2016) On the restricted mean survival time curve in survival analysis. Biometrics 72(1):215–221
Acknowledgements
The authors are grateful to the Atezolizumab Lung Team as well as Zhengrong Li, Kaspar Rufibach, Marcel Wolbers, and Hans Ulrich Burger for their helpful comments and suggestions.
Appendix
1.1 Weighted Cox Model Variance Estimation and Simultaneous Confidence Interval Calculations
We briefly outline some details on the weighted Cox estimator and simultaneous confidence intervals based on synthetic-martingale resampling. By Taylor series expansion
where \(\beta _{w}^{**}\) lies between \({\hat{\beta }}_{w}\) and \(\beta _{w}^{*}\), and \(i_{w}(\beta _{w}^{**})=(-1){\partial U_{w}(\beta ) \over \partial \beta }|_{\beta =\beta _{w}^{**}}\) with \(i_{w}(\beta )\) given by
Define \(a_{j}=n_{j}/(n_{0}+n_{1})\) for \(j=0,1\) and
where \(K(t,\beta )\) is equivalent to K(t) in the weighted log-rank statistic defined in (1) but with \({\bar{Y}}_{1}(t)\) substituted with \(\exp (\beta ){\bar{Y}}_{1}(t)\).
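The displayed equations of this appendix are not reproduced in this copy. As a hedged sketch only, assume a binary treatment indicator \(Z_{i}\), write \(n=n_{0}+n_{1}\) and \({\bar{N}}={\bar{N}}_{0}+{\bar{N}}_{1}\), and take the weighted score to be \(U_{w}(\beta )=\sum _{i=1}^{n}\int _{0}^{\infty }w(t)\{Z_{i}-S^{(1)}(t,\beta )/S^{(0)}(t,\beta )\}dN_{i}(t)\), so that \(S^{(1)}/S^{(0)}=\exp (\beta ){\bar{Y}}_{1}/\{{\bar{Y}}_{0}+\exp (\beta ){\bar{Y}}_{1}\}\). These symbols and the normalization are assumptions made for illustration. Since \(U_{w}({\hat{\beta }}_{w})=0\), the expansion step and the information would then take the standard form

\[
0 = U_{w}({\hat{\beta }}_{w}) = U_{w}(\beta _{w}^{*}) - i_{w}(\beta _{w}^{**})\,({\hat{\beta }}_{w}-\beta _{w}^{*}),
\qquad
i_{w}(\beta ) = \int _{0}^{\infty } w(t)\,{\exp (\beta ){\bar{Y}}_{0}(t){\bar{Y}}_{1}(t) \over \{{\bar{Y}}_{0}(t)+\exp (\beta ){\bar{Y}}_{1}(t)\}^{2}}\,d{\bar{N}}(t),
\]

which gives \(n^{1/2}({\hat{\beta }}_{w}-\beta _{w}^{*})=\{n^{-1}i_{w}(\beta _{w}^{**})\}^{-1}n^{-1/2}U_{w}(\beta _{w}^{*})\). Under the same assumptions the score separates by treatment group:

\[
U_{w}(\beta ) = \int _{0}^{\infty } w(t)\,{{\bar{Y}}_{0}(t)\exp (\beta ){\bar{Y}}_{1}(t) \over {\bar{Y}}_{0}(t)+\exp (\beta ){\bar{Y}}_{1}(t)}
\left\{ {d{\bar{N}}_{1}(t) \over \exp (\beta ){\bar{Y}}_{1}(t)} - {d{\bar{N}}_{0}(t) \over {\bar{Y}}_{0}(t)} \right\} .
\]

This is the form one would obtain if \(K(t,\beta )\) is proportional to \(w(t){\bar{Y}}_{0}(t)\exp (\beta ){\bar{Y}}_{1}(t)/\{{\bar{Y}}_{0}(t)+\exp (\beta ){\bar{Y}}_{1}(t)\}\); the proportionality constant depends on the normalization used in (1), which is not shown here.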
Recall that \({\bar{M}}_{j}(t)=\sum _{i=1}^{n}M_{i,j}(t)\) are martingale processes for the control (\(j=0\)) and experimental groups (\(j=1\)) with \(M_{i,j}(t)= N_{i,j}(t)-\int _{0}^{t}Y_{i,j}(s)\lambda _{j}(s)ds\) where \(\lambda _{j}\) denotes the hazard function for group \(j=0,1\). We can then write
We assume uniform convergence in probability for \(K(\cdot ,\cdot )\) and
In addition, let \(\pi _{j}\) denote the limits of \(a_{j}\) for \(j=0,1\). Then,
Now, the first two terms are of the form \(\int _{0}^{\infty }H_{1}d{\bar{M}}_{1}(t) - \int _{0}^{\infty }H_{0}d{\bar{M}}_{0}(t)\) where \(H_{1}=\sqrt{\pi _0 \pi _1}{K(t,\beta _{w}^{*}) \over \exp (\beta _{w}^{*}){\bar{Y}}_{1}(t)}\) and \(H_{0}=\sqrt{\pi _0 \pi _1}{K(t,\beta _{w}^{*}) \over {\bar{Y}}_{0}(t)}\). Let \(h_{1}\) denote the limit of \(H_{1}^{2}\exp (\beta _{w}^{*}){{\bar{Y}}}_{1}\) and \(h_{0}\) denote the limit of \(H_{0}^{2}{\bar{Y}}_{0}\). Then, following arguments as in Fleming and Harrington [6] (see page 268), the variance of \((n_{0}+n_{1})^{-1/2}U_{w}(\beta _{w}^{*})\) can be estimated, upon plugging in empirical counterparts for the limits, by
Note that for \(\beta _{w}^{*}=0\) the score \(U_{w}(0)\) and \(\hat{\sigma }^{2}(U_{w}(0))\) are equivalent to the score statistic (1) and its variance estimate (2). The variance of \({\hat{\beta }}_{w}\) is estimated by plugging \({\hat{\beta }}_{w}\) into Eqs. (10) and (12).
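As a hedged illustration of this plug-in step (the displayed variance formula is not reproduced in this copy), assume a binary treatment indicator, no tied event times, \(n=n_{0}+n_{1}\), \({\bar{N}}={\bar{N}}_{0}+{\bar{N}}_{1}\), and write \(p(t,\beta )=\exp (\beta ){\bar{Y}}_{1}(t)/\{{\bar{Y}}_{0}(t)+\exp (\beta ){\bar{Y}}_{1}(t)\}\). Under the working model \(\lambda _{1}=\exp (\beta )\lambda _{0}\), with \(\lambda _{0}(t)dt\) estimated by \(d{\bar{N}}(t)/\{{\bar{Y}}_{0}(t)+\exp (\beta ){\bar{Y}}_{1}(t)\}\), one form consistent with the description above is

\[
\hat{\sigma }^{2}(U_{w}(\beta )) = {1 \over n}\int _{0}^{\infty } w(t)^{2}\,p(t,\beta )\{1-p(t,\beta )\}\,d{\bar{N}}(t),
\qquad
\widehat{\mathrm{var}}({\hat{\beta }}_{w}) = {n\,\hat{\sigma }^{2}(U_{w}({\hat{\beta }}_{w})) \over i_{w}({\hat{\beta }}_{w})^{2}},
\]

where \(i_{w}(\beta )=\int _{0}^{\infty }w(t)\,p(t,\beta )\{1-p(t,\beta )\}\,d{\bar{N}}(t)\) is the information of the two-group weighted score. At \(\beta =0\) the first expression reduces to the familiar weighted log-rank variance \(n^{-1}\int _{0}^{\infty }w(t)^{2}{\bar{Y}}_{0}(t){\bar{Y}}_{1}(t)/\{{\bar{Y}}_{0}(t)+{\bar{Y}}_{1}(t)\}^{2}d{\bar{N}}(t)\) without a tie correction. These expressions are assumptions made for illustration and are not necessarily the authors' exact Eqs. (10) and (12).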
Now, for calculating the constant \(c_{max}\) in the simultaneous confidence interval described in (6), we use
where \(U_{w}^{\dagger }({\hat{\beta }}_{w}):= \sum _{i=1}^n\int _{0}^{\infty }w(t)\left\{ Z_{i}-{S^{(1)}(t,{\hat{\beta }}_{w}) \over S^{(0)}(t,{\hat{\beta }}_{w})} \right\} dG_{i}N_{i}(t)\), and the same \(G_{i}\)’s (in analogy to (3)) would be used for each weighted estimator within (6).
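The resampling idea can be illustrated with a minimal Python sketch. It is not the authors' R implementation. It assumes a binary treatment indicator, no tied event times, Fleming-Harrington-type weights built from the pooled Kaplan-Meier estimate, a model-based standard error, and that \(c_{max}\) is taken as the \((1-\alpha )\) quantile of the maximum absolute standardized perturbation \(U_{w}^{\dagger }({\hat{\beta }}_{w})/\{i_{w}({\hat{\beta }}_{w})\,\mathrm{se}({\hat{\beta }}_{w})\}\) across weights. The function names `prepare`, `fit_weighted_cox`, and `simultaneous_constant` are hypothetical, and the data are simulated.

```python
import numpy as np

def prepare(time, event, arm):
    """Sort by time; return event flags, arm labels, at-risk counts and the
    pooled Kaplan-Meier S(t-). Assumes no tied times (continuous data)."""
    order = np.argsort(time)
    d = np.asarray(event, dtype=float)[order]
    z = np.asarray(arm, dtype=float)[order]
    n = len(d)
    y1 = np.cumsum(z[::-1])[::-1]               # experimental arm at risk
    y0 = (n - np.arange(n)) - y1                # control arm at risk
    km = np.cumprod(1.0 - d / (y0 + y1))        # pooled Kaplan-Meier S(t)
    s_minus = np.concatenate(([1.0], km[:-1]))  # left-continuous S(t-)
    return d, z, y0, y1, s_minus

def fit_weighted_cox(d, z, y0, y1, w, n_iter=50):
    """Solve U_w(beta) = 0 by Newton's method. Returns the estimate, the
    information, a model-based standard error (an assumption of this sketch)
    and the per-subject score contributions at the estimate."""
    beta = 0.0
    for _ in range(n_iter):
        p = np.exp(beta) * y1 / (y0 + np.exp(beta) * y1)  # S1/S0 for binary Z
        score = np.sum(w * d * (z - p))
        info = np.sum(w * d * p * (1.0 - p))
        beta += score / info
    p = np.exp(beta) * y1 / (y0 + np.exp(beta) * y1)
    info = np.sum(w * d * p * (1.0 - p))
    se = np.sqrt(np.sum(w**2 * d * p * (1.0 - p))) / info
    return beta, info, se, w * d * (z - p)

def simultaneous_constant(fits, n_draws=2000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of max_w |U_w^dagger / (i_w * se_w)|, where the
    same standard normal multipliers G_i are shared by every weight."""
    rng = np.random.default_rng(seed)
    contrib = np.stack([f[3] / (f[1] * f[2]) for f in fits])  # (n_weights, n)
    g = rng.standard_normal((n_draws, contrib.shape[1]))      # shared G_i
    t = np.abs(g @ contrib.T)                                 # (n_draws, n_weights)
    return np.quantile(t.max(axis=1), 1.0 - alpha)

# Simulated two-arm data under proportional hazards (hazard ratio 0.7).
rng = np.random.default_rng(1)
n = 400
arm = rng.integers(0, 2, n)
t_event = rng.exponential(1.0 / np.where(arm == 1, 0.7, 1.0))
t_cens = rng.uniform(0.5, 3.0, n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)

d, z, y0, y1, s_minus = prepare(time, event, arm)
weights = {"FH(0,0)": (0, 0), "FH(0,1)": (0, 1), "FH(1,0)": (1, 0)}
fits = {k: fit_weighted_cox(d, z, y0, y1, s_minus**r * (1.0 - s_minus)**g)
        for k, (r, g) in weights.items()}
c_max = simultaneous_constant(list(fits.values()))

for k, (beta, _, se, _) in fits.items():
    print(f"{k}: log HR {beta:.3f}, "
          f"simultaneous 95% CI ({beta - c_max * se:.3f}, {beta + c_max * se:.3f})")
```

Sharing one multiplier vector across all weights preserves the correlation between the weighted estimators, which is why the resulting constant is never smaller than the constant obtained from any single weight using the same draws.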
Cite this article
León, L.F., Lin, R. & Anderson, K.M. On Weighted Log-Rank Combination Tests and Companion Cox Model Estimators. Stat Biosci 12, 225–245 (2020). https://doi.org/10.1007/s12561-020-09276-1
DOI: https://doi.org/10.1007/s12561-020-09276-1