\documentclass[a4paper, 11pt]{article} \usepackage[a4paper]{geometry} \usepackage{amsmath, amsthm, amssymb} \usepackage{natbib} \usepackage{hyperref} %\VignetteIndexEntry{Background about Weibull regression} %\VignetteDepends{SurvRegCensCov} %\VignetteDepends{survival} \geometry{a4paper,margin=30mm,bindingoffset=0mm,heightrounded,} \newcommand{\prog}[1]{\textsf{#1}} \newcommand{\pkg}[1]{\texttt{#1}} \newcommand{\code}[1]{\texttt{#1}} \title{Weibull AFT Regression Functions in \prog{R}} \author{Sarah R. Haile\footnote{Epidemiology, Biostatistics and Prevention Institute (EBPI), University of Zurich (\texttt{sarah.haile@uzh.ch})}} <<label = init, echo = FALSE>>= if(require(SurvRegCensCov) == FALSE){install.packages("SurvRegCensCov"); library("SurvRegCensCov")} @ \begin{document} \SweaveOpts{concordance=TRUE} \maketitle Weibull accelerated failure time regression can be performed in \prog{R} using the \pkg{survreg} function. The results are not, however, presented in a form in which the Weibull distribution is usually given. Accelerated failure time models are usually given by $$\log T = Y = \mu + \boldsymbol{\alpha}^T \mathbf{z} + \sigma W,$$ where $\mathbf{z}$ are set of covariates, and $W$ has the extreme value distribution. Given transformations \begin{align*} \gamma & = 1/\sigma, \\ \lambda & = \exp(-\mu/\sigma), \\ \boldsymbol{\beta} & = -\boldsymbol{\alpha}/\sigma, \end{align*} we have a Weibull model with baseline hazard of $$h(x|\mathbf{z}) = (\gamma \lambda t^{\gamma - 1}) \exp(\boldsymbol{\beta}^T \mathbf{z}).$$ Further, the \pkg{survreg} function generally gives $\log \sigma$, rather than $\sigma$ as output. The function \code{WeibullReg} (along with \code{ConvertWeibull}) solves this problem. Hazard ratios ($\exp{(\boldsymbol{\beta}_i)}$) are additionally produced. The function also produces the ``event time ratio'' (ETR, $\exp{(-\boldsymbol{\beta}_i/\gamma)} = \exp \boldsymbol{\alpha}_i$), as discussed in \cite{carroll_03}. This ratio quantifies the relative difference in time it takes to achieve the $p$th percentile between two levels of a covariate. The $p$th percentile of the (covariate-adjusted) Weibull distribution occurs at $$t_p = \left[ \frac{-\log p}{\lambda e^{\boldsymbol{\beta}^T \mathbf{z}}} \right]^{1/\gamma}.$$ Then the ratio of times for a covariate with value $z_1$ versus values $z_0$, with parameter estimate $\beta$, can then be computed as: \begin{align*} \frac{t_B}{t_A} & = \left[ \frac{-\log p}{\lambda e^{\boldsymbol{\beta} z_1}} \right]^{1/\gamma} \left[ \frac{\lambda e^{\boldsymbol{\beta} z_0}}{-\log p} \right]^{1/\gamma} \\ & = \exp \left\{\frac{\boldsymbol{\beta} (z_0-z_1)}{\gamma} \right\}. %& = \exp \left\{\frac{- \beta_B}{\alpha} \right\}. \end{align*} Thus, if we are comparing treatment B to treatment A, where the parameter estimate for treatment B is $\boldsymbol{\beta}_{\mathrm{trt}}$, then the ETR is $\exp\{-\boldsymbol{\beta}_{\mathrm{trt}}/\gamma\}$. For example if the ETR for treatments A vs B is 1.2, then the amount of time it takes for $p$ percent of patients with treatment A to have the event is predicted to be about 20\% longer than it takes for the same percentage of patients with treatment B to experience an event. (That is, treatment B is worse.) For this reason, the ETR can also be called an ``acceleration factor.'' Additionally, a function \pkg{WeibullDiag} has been provided to check the adequacy of the Weibull Model. \section{\pkg{WeibullReg}} The \pkg{WeibullReg} function performs Weibull AFT regression on survival data, returning a list which contains: \begin{description} \item[formula] the regression formula, \item[coef] the coefficient table, \item[HR] a table with the hazard rates (with confidence intervals) for each of the covariates, \item[ETR] a table with the Event Time Ratios (with confidence intervals) for each of the covariates, and \item[summary] the summary table from the original \pkg{survreg} model. \end{description} Such tables can also be produced using the \code{streg} function in \prog{stata} with the following options: 1) the \code{nohr} option gives \code{coef}, 2) without any options gives \code{HR}, 3) the \code{tr} option gives \code{ETR}, and 4) the \code{time} option produces \code{summary}, the original output from \code{survreg}. While \pkg{proc lifereg} in \prog{SAS} can also perform parametric regression for survival data, its output must also be transformed. The following example reproduces Tables 12.1 and 12.2 from \cite{klein_03}, on the larynx data set. <<source>>= library(survival) data(larynx) @ <<WeibullReg>>= WeibullReg(Surv(time, death) ~ factor(stage) + age, data=larynx) @ The hazard rates produced with the Weibull regression model are similar to what is obtained with Cox proportional hazards regression: <<coxph>>= cph <- coxph(Surv(time, death) ~ factor(stage) + age, data=larynx) summary(cph)$conf.int @ Most of the work of the function is actually performed by \code{ConvertWeibull}. These functions require the \pkg{survival} package in \prog{R}. Formulas for the variance estimates come from \cite[Equations 12.2.13-18, with some modifications since \prog{R} gives $\log \sigma$]{klein_03}. \section{\pkg{WeibullDiag}} The \code{WeibullDiag} function produces a diagnostic plot for Weibull AFT regression, similar to what is in \cite[Figure 12.2]{klein_03}. It plots $\log \mathrm{Time}$ versus the log of the estimated cumulative hazard estimate. If the Weibull model has adequate fit, then the plots for each of the covariates should be roughly linear and parallel. This function requires the \code{survfit} object to contain \code{strata}, else an error is produced. The \code{WeibullDiag} function requires the \pkg{survival} package. <<DiagPlot, fig=T>>= WeibullDiag(Surv(time, death) ~ factor(stage), data = larynx, labels=c("Stage I", "Stage II", "Stage III", "Stage IV")) @ %\bibliographystyle{ims} \bibliographystyle{abbrv} \bibliography{refs} \end{document}