transform.hazards

Transforms cumulative hazard estimates to estimate survival analysis parameters specified by differential equations. We demonstrate how the method works on some commonly studied parameters.

Parameters

We are interested in assessing parameters that solve differential equations driven by cumulative hazards \(A\), i.e. \[\begin{equation}X_t = X_0 + \int_0^t F(X_s) dA_s,\end{equation}\] where \(F = (F_1,F_2,\cdots)\) is Lipschitz, two times continuously differentiable, and satisfies a linear growth bound. Several examples of such parameters can be found in ¹ ². The main function pluginEstimate in this package estimate such parameters. Among other things, pluginEstimate has the following input

The increments of the cumulative hazard estimates
The function \(F\)
The Jacobian matrices of the columns of \(F\), i.e. \(J_{F_1},J_{F_2},\cdots\), as a list

We illustrate how to provide the correct inputs by use of examples below.

Survival

The survival function \(S\) solves a differential equation with initial value 1; \[\begin{equation} S_t = 1 - \int_0^t S_{s} dA_s.\end{equation}\]

We generate a sample of exponentially distributed survival times with independent censoring, and calculate cumulative hazard estimates

n1 <- 300
Samp1 <- rexp(n1,1)
cTimes <- runif(n1)*4
Samp1 <- pmin(Samp1,cTimes)
isNotCensored <- cTimes != Samp1
startTimes1 <- rep(0,n1)

aaMod1 <- aalen(Surv(startTimes1,Samp1,isNotCensored)~1)

We want to estimate the survival function. The cumulative hazard estimates are inserted as a time-ordered matrix where each column is an increment of the vector of cumulatie hazard estimates;

dA_est1 <- diff(c(0,aaMod1$cum[,2]))
hazMatrix <- matrix(dA_est1,nrow=1)

In this case, \(F\) takes the simple form \(F(x) = -x\), and its Jacobian is therefore \(J_{F}(x) = -1\). These must be provided as matrix-valued functions;

F_survival <- function(X)matrix(-X,nrow=1,ncol=1)
F_survival_JacobianList <- list(function(X)matrix(-1,nrow=1,ncol=1))

We obtain plugin estimates of \(S\) and its covariance using the call

param <- pluginEstimate(n1, hazMatrix, F_survival, F_survival_JacobianList,matrix(1,nrow=1,ncol=1),matrix(0,nrow=1,ncol=1))

Finally, we may plot the results with approximate 95 % confidence intervals

times1 <- aaMod1$cum[,1]
plot(times1,param$X,type="s",xlim=c(0,4),xlab="t",ylab="",main="Survival")
lines(times1,param$X + 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2)
lines(times1,param$X - 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2)
lines(times1,exp(-times1),col=2)
legend("topright",c("estimated","true"),lty=1,col=c(1,2),bty="n")

Restricted mean survival

The restricted mean survival function \(R\) solves the system \[\begin{equation} \begin{pmatrix} S_t \\ R_t \end{pmatrix} = \begin{pmatrix}1 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_{s} & 0 \\ 0 & S_s \end{pmatrix}d \begin{pmatrix}A_s \\ s \end{pmatrix},\end{equation}\]

where \(S\) is the survival function. Here, the colums of \(F\), \(F_1\) and \(F_2\), are given by \[\begin{equation} F_1(x_1,x_2) = \begin{pmatrix} -x_1 \\ 0 \end{pmatrix}, F_2(x_1,x_2) = \begin{pmatrix} 0 \\ x_1 \end{pmatrix}.\end{equation}\] The Jacobian matrices are therefore \[\begin{equation} J_{F_1}(x_1,x_2) = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}, J_{F_2}(x_1,x_2) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \end{equation}\] We define these along with initial values below

F_restrict <- function(X)matrix(c(-X[1],0,0,X[1]),nrow=2)
F_restrict_JacobianList <- list(function(X)matrix(c(-1,0,0,0),nrow=2),
                                function(X)matrix(c(0,0,1,0),nrow=2,byrow=T))

X0_restrict <- matrix(c(1,0),nrow=2)
V0_restrict <- matrix(0,nrow=2,ncol=2)

The restricted mean survival is a ‘regular’ (i.e. Lebesgue) integral, and we must therefore provide the time increments. We choose the time interval \([0,4]\) in \(10^4\) increments:

fineTimes <- seq(0,4,length.out = 1e4+1)

tms <- sort(unique(c(fineTimes,times1)))

hazMatrix <- matrix(0,nrow=2,ncol=length(tms))
hazMatrix[1,match(times1,tms)] <- dA_est1
hazMatrix[2,] <- diff(c(0,tms))

We obtain plugin estimates using the call (note the last argument that in this example can be used to improve efficiency);

param <- pluginEstimate(n1, hazMatrix, F_restrict, F_restrict_JacobianList,X0_restrict,V0_restrict,isLebesgue = 2)

We plot the results

plot(tms,param$X[2,],type="s",xlim=c(0,4),ylim=c(0,1.5),xlab="t",ylab="",main="Restricted mean survival")
lines(tms,param$X[2,] + 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2)
lines(tms,param$X[2,] - 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2)
lines(tms,1 - exp(-tms),col=2)
legend("topleft",c("estimated","true"),lty=1,col=c(1,2),bty="n")

Relative survival

We generate another set of exponentially distributed survival times and compare the two groups. A new matrix of cumulative hazard increments must be created;

n2 <- 200
Samp2 <- rexp(n2,1.3)
censTimes2 <- runif(n2)*3
Samp2 <- pmin(Samp2,censTimes2)
isNotCensored2 <- censTimes2 != Samp2
startTimes2 <- rep(0,n2)

aaMod2 <- aalen(Surv(startTimes2,Samp2,isNotCensored2)~1)
dA_est2 <- diff(c(0,aaMod2$cum[,2]))

times2 <- aaMod2$cum[,1]
times <- sort(unique(c(times1,times2)))

hazMatrix <- matrix(0,nrow=2,ncol=length(times))
hazMatrix[1,match(times1,times)] <- dA_est1
hazMatrix[2,match(times2,times)] <- dA_est2

The relative survival function \(RS\) between groups 1 and 2 solve the equation \[\begin{equation} RS_t = 1 + \int_0^t \begin{pmatrix} -RS_s & RS_s\end{pmatrix} d\begin{pmatrix} A_s^1 \\ A_s^2 \end{pmatrix}, \end{equation}\]

i.e. \(F(x) = (-x,x)\). The Jacobians of the columns of \(F\) are \(J_{F_1}(x) = -1\), and \(J_{F_2}(x) = 1\). We specify these functions as follows:

F_relsurv <- function(X)matrix(c(-X,X),nrow=1)
F_relsurv_JacobianList <- list(function(X)matrix(-1,nrow=1),
                               function(X)matrix(1,nrow=1))

The estimates are then obtained by the call

param <- pluginEstimate(n1+n2, hazMatrix, F_relsurv, F_relsurv_JacobianList,matrix(1,nrow=1,ncol=1),matrix(0,nrow=1,ncol=1))

We may plot the results with approximate 95% confidence intervals

plot(times,param$X,type="s",xlim=c(0,4),ylim=c(-0.5,4),xlab="t",ylab="",main="Relative survival")
lines(times,param$X + 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2)
lines(times,param$X - 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2)
lines(times,exp(-times*(1/1.3 - 1)),col=2)
legend("topleft",c("estimated","true"),lty=1,col=c(1,2),bty="n")

Cumulative incidence

We may be in a situation with two competing risks. The cumulative incidences \(C^1,C^2\) solve the system \[\begin{equation} \begin{pmatrix} S_t \\ C_t^1 \\ C_t^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_s & -S_s \\ S_s & 0 \\ 0 & S_s \end{pmatrix} d \begin{pmatrix} A^1_s \\ A^2_s \end{pmatrix}, \end{equation}\] where \(S\) is the survival. The first columns of \(F\) are

\[\begin{equation}F_1(x_1,x_2,x_2) = \begin{pmatrix} -x_1 \\ x_1 \\ 0 \end{pmatrix}, F_2(x_1,x_2,x_2) = \begin{pmatrix} -x_1 \\ 0 \\ x_1 \end{pmatrix} \end{equation}\]

The reader may verify that the Jacobian matrix of \(F_1\) and \(F_1\) are \[\begin{equation} J_{F_1}(x_1,x_2,x_3) = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} , J_{F_2}(x_1,x_2,x_3) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \end{equation}\]

We specify the function \(F\) and the list of the Jacobian matrices below, along with the initial values:

F_cuminc <- function(X)matrix(c(-X[1],-X[1],X[1],0,0,X[1]),nrow=3,byrow=T)
F_cuminc_JacobianList <- list(function(X)matrix(c(-1,0,0,1,0,0,0,0,0),nrow=3,byrow=T),
                               function(X)matrix(c(-1,0,0,0,0,0,1,0,0),nrow=3,byrow=T))
X0_cuminc <- matrix(c(1,0,0),nrow=3)
v0_cuminc <- matrix(0,nrow=3,ncol=3)

Now for obtaining hazard estimates. We generate survival times first, before sampling two causes of death (and censoring). We then estimate the cause-specific cumulative hazards, and transform the estimates:

# Generate the data
dfr <- data.frame(from=0,to=rexp(n1+n2,1),from.state = 1)

# Adding causes of death (to.state = 2 or 3) and censoring (to.state = 0)
dfr$to.state <- sample(c(0,2,3),n1+n2,replace=T,prob=c(0.2,0.2,0.6))


# Obtaining cause-specific cumulative hazard estimates and extracting the increments
aamod22 <- aalen(Surv(from,to,to.state %in% c(2)) ~1, data=dfr)
aamod33 <- aalen(Surv(from,to,to.state %in% c(3)) ~1, data=dfr)
tmss = sort(unique(c(aamod22$cum[,1],aamod33$cum[,1])))
hazMatrix = matrix(0,nrow=2,ncol=length(tmss))
mt1 = match(aamod22$cum[,1],tmss)
mt2 = match(aamod33$cum[,1],tmss)
hazMatrix[1,mt1] = diff(c(0, aamod22$cum[,2] ))
hazMatrix[2,mt2] = diff(c(0, aamod33$cum[,2] ))

# Transforming the estimates
param <- pluginEstimate(n1+n2,hazMatrix,F_cuminc,F_cuminc_JacobianList,X0_cuminc,v0_cuminc)

Finally, we plot the estimates of the cumulative incidences \(C^1\) and \(C^2\) with confidence intervals:

plot(tmss,param$X[2,],type="s",xlim=c(0,4),ylim=c(0,1),xlab="t",ylab="",main="Cumulative incidence")
lines(tmss,param$X[2,] + 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2)
lines(tmss,param$X[2,] - 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2)
lines(tmss,param$X[3,],type="s",col=3)
lines(tmss,param$X[3,] + 1.96 * sqrt(param$covariance[3,3,]),type="s",lty=2,col=3)
lines(tmss,param$X[3,] - 1.96 * sqrt(param$covariance[3,3,]),type="s",lty=2,col=3)
legend("topleft",c(expression(C^1),expression(C^2)),lty=1,col=c(1,3),bty="n")

Restricted mean survival – comparing two groups

By re-specifying the ODE system we can compare two groups. We illustrate this in the restricted mean survival example. Consider the system

\[\begin{equation} \begin{pmatrix} R_t^1 \\ R^2 \\ RD \\ S^1 \\ S^2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{pmatrix} + \int_0^t \begin{pmatrix} S^1_{s} & 0 & 0 \\ S^2_{s} & 0 & 0 \\ S^1_{s} - S^2_{s} & 0 & 0 \\ 0 & -S_s^1 & 0 \\ 0 & 0 & -S_s^2 \end{pmatrix}d \begin{pmatrix}s \\ A_s^1 \\ A_s^2 \end{pmatrix},\end{equation}\]

where \(S^i\) is the survival function in group \(i\). We specify matrix-valued functions and initial values as before:

F_restrict_diff <- function(X)matrix(c(X[4],0,0,
                                       X[5],0,0,
                                       X[4]-X[5],0,0,
                                       0,-X[4],0,
                                       0,0,-X[5]),nrow=5,byrow=T)
F_restrict_diff_JacobianList <- list(function(X)matrix(c(0,0,0,1,0,
                                                         0,0,0,0,1,
                                                         0,0,0,1,-1,
                                                         rep(0,10)),nrow=5,byrow=T),
                                     function(X)matrix(c(rep(0,18),-1,rep(0,6)),nrow=5,byrow=T),
                                     function(X)matrix(c(rep(0,24),-1),nrow=5,byrow=T))

X0_restrict_diff <- matrix(c(0,0,0,1,1),nrow=5)
V0_restrict_diff <- matrix(0,nrow=5,ncol=5)

We compare RMS for males and females in the flchain data set in the survival package. Specifying hazmatrix:

aa_male = aalen(Surv(futime,death==1)~1, data = flchain[flchain$sex=="M",]) 
aa_female = aalen(Surv(futime,death==1)~1, data = flchain[flchain$sex=="F",]) 

tms <- sort(unique(c(fineTimes,aa_male$cum[,1], aa_female$cum[,1])))
mt_male = match(aa_male$cum[,1],tms)
mt_female = match(aa_female$cum[,1],tms)

hazMatrix <- matrix(0,nrow=3,ncol=length(tms))
hazMatrix[1,] <- diff(c(0,tms))
hazMatrix[2,mt_male] <- diff(c(0,aa_male$cum[,2]))
hazMatrix[3,mt_female] <- diff(c(0,aa_female$cum[,2]))
  
  
plugEst <- pluginEstimate(1, hazMatrix, F_restrict_diff, F_restrict_diff_JacobianList,X0_restrict_diff,V0_restrict_diff,isLebesgue = 1)
param = list(plugEst=plugEst, tms=tms)

Plotting \(\hat{RD} = \hat R^1 - \hat R^2\) with 95% confidence intervals:

ylims = c(1.1*min(param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,])),
          1.1*max(param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,])))
plot(param$tms,param$plugEst$X[3,],type="s",ylim=ylims,ylab="",main="RMS difference",xlab="years")
lines(param$tms,param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2)
lines(param$tms,param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2)
abline(a=0,b=0,col=2)

Cumulative incidence – comparing two groups

Consider a similar example with differences of cumulative incidence functions

\[\begin{equation} \begin{pmatrix} S^a_t \\ S_t^b \\ C_t^{a,1} \\ C_t^{a,2} \\ C_t^{b,1} \\ C_t^{b,2} \\ CD_t^1 \\ CD_t^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_s^a & -S_s^a & 0 & 0 \\ 0 & 0 & -S_s^b & -S_s^b \\ S_s^a & 0 & 0 & 0 \\ 0 & S_s^a & 0 & 0 \\ 0 & 0 & S_s^b & 0 \\ 0 & 0 & 0 & S_s^b \\ S_s^a & 0 & -S_s^b & 0 \\ 0 & S_s^a & 0 & -S_s^b \end{pmatrix} d \begin{pmatrix} A^{a,1}_s \\ A^{a,2}_s \\ A^{b,1}_s \\ A^{b,2}_s \end{pmatrix}, \end{equation}\] where \(S^a\) is the survival in group \(a\) and \(S^b\) is the survival in group \(b\), and \(C^{a,j}\) is the cumulative incidence for cause \(j\) in group \(a\) and similarly for group \(b\). \(CD^i = C^{a,i} - C^{b,i}\) is the cumulative incidence difference for cause \(i\). We specify the ODE system.

F_cuminc_diff <- function(X)matrix( c(-X[1],-X[1],0,0,
                                      0,0,-X[2],-X[2],
                                      X[1],0,0,0,
                                      0,X[1],0,0,
                                      0,0,X[2],0,
                                      0,0,0,X[2],
                                      X[1],0,-X[2],0,
                                      0,X[1],0,-X[2]) ,nrow=8,byrow=T)
F_cuminc_diff_JacobianList <- list(function(X)matrix(c(-1, rep(0,7),
                                                       rep(0,8),
                                                       1, rep(0,7),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       1, rep(0,7),
                                                       rep(0,8)),nrow=8,byrow=T),
                                   function(X)matrix(c(-1, rep(0,7),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       1, rep(0,7),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       1, rep(0,7)),nrow=8,byrow=T),
                                   function(X)matrix(c(rep(0,8),
                                                       0,-1, rep(0,6),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       0,1, rep(0,6),
                                                       rep(0,8),
                                                       0,-1, rep(0,6),
                                                       rep(0,8)),nrow=8,byrow=T),
                                   function(X)matrix(c(rep(0,8),
                                                       0,-1, rep(0,6),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       rep(0,8),
                                                       0,1, rep(0,6),
                                                       rep(0,8),
                                                       0,-1, rep(0,6)),nrow=8,byrow=T))


X0_cuminc_diff <- matrix(c(1,1,0,0,0,0,0,0),nrow=8)
v0_cuminc_diff <- matrix(0,nrow=8,ncol=8)

We inspect cumulative incidences of the competing events PCM, and death without malignancy, in the mgus2 data set. We compare males and females in this data set. Specifying hazmatrix:

mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime))
event <- with(mgus2, ifelse(pstat==0, 2*death, 1))
mgus2$event <- factor(event, 0:2, labels=c("censor", "pcm", "death"))


fr_M <- mgus2[mgus2$sex == "M",]
fr_M$time <- fr_M$etime/12
fr_F <- mgus2[mgus2$sex == "F",]
fr_F$time <- fr_F$etime/12

fit_PCM_M = aalen(Surv(time,event=="pcm")~1,data=fr_M)
fit_death_M = aalen(Surv(time,event=="death")~1,data=fr_M)

fit_PCM_F = aalen(Surv(time,event=="pcm")~1,data=fr_F)
fit_death_F = aalen(Surv(time,event=="death")~1,data=fr_F)

tms2 = sort(unique(c(fit_PCM_M$cum[,1],fit_PCM_F$cum[,1],
                     fit_death_M$cum[,1],fit_death_F$cum[,1])))

dA_PCM_M = dA_death_M = dA_PCM_F = dA_death_F = rep(0, length(tms2))

dA_PCM_M[match(fit_PCM_M$cum[,1], tms2)] = diff(c(0,fit_PCM_M$cum[,2]))
dA_PCM_F[match(fit_PCM_F$cum[,1], tms2)] = diff(c(0,fit_PCM_F$cum[,2]))

dA_death_M[match(fit_death_M$cum[,1], tms2)] = diff(c(0,fit_death_M$cum[,2]))
dA_death_F[match(fit_death_F$cum[,1], tms2)] = diff(c(0,fit_death_F$cum[,2]))

hazMatrix <- matrix(0,nrow=4,ncol=length(tms2))
hazMatrix[1,] <- dA_PCM_M
hazMatrix[2,] <- dA_death_M
hazMatrix[3,] <- dA_PCM_F
hazMatrix[4,] <- dA_death_F

plugEst <- pluginEstimate(1, hazMatrix, F_cuminc_diff, F_cuminc_diff_JacobianList,X0_cuminc_diff,v0_cuminc_diff)
param = list(plugEst=plugEst, tms=tms2)

Plotting \(\hat{CD}^1\) and \(\hat{CD}^2\) with 95% confidence intervals:

# Colors
male_color <- "red"
female_color <- "blue"
diff_color <- "gray"

par(mfrow=c(1,2))
# Plot with colors and labels
plot(param$tms, param$plugEst$X[3,],type="s",ylim=c(-0.2,0.9),main="Cum.inc. PCM",
     ylab="",xlab="Years", col=male_color)
lines(param$tms, param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2, col=male_color)
lines(param$tms, param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2, col=male_color)

lines(param$tms, param$plugEst$X[5,],type="s", col=female_color)
lines(param$tms, param$plugEst$X[5,] + 1.96 * sqrt(param$plugEst$covariance[5,5,]),type="s",lty=2, col=female_color)
lines(param$tms, param$plugEst$X[5,] - 1.96 * sqrt(param$plugEst$covariance[5,5,]),type="s",lty=2, col=female_color)

lines(param$tms, param$plugEst$X[7,],type="s", col=diff_color)
lines(param$tms, param$plugEst$X[7,] + 1.96 * sqrt(param$plugEst$covariance[7,7,]),type="s",lty=2, col=diff_color)
lines(param$tms, param$plugEst$X[7,] - 1.96 * sqrt(param$plugEst$covariance[7,7,]),type="s",lty=2, col=diff_color)

abline(a=0,b=0,col=2)

# Add legend
legend("topleft", legend = c("Males", "Females", "Difference"), col = c(male_color, female_color, diff_color), lty = 1,bty="n")




# Plot with colors and labels
plot(param$tms, param$plugEst$X[4,],type="s",ylim=c(-0.2,0.9),main="Cum.inc. death",
     ylab="",xlab="Years", col=male_color)
lines(param$tms, param$plugEst$X[4,] + 1.96 * sqrt(param$plugEst$covariance[4,4,]),type="s",lty=2, col=male_color)
lines(param$tms, param$plugEst$X[4,] - 1.96 * sqrt(param$plugEst$covariance[4,4,]),type="s",lty=2, col=male_color)

lines(param$tms, param$plugEst$X[6,],type="s", col=female_color)
lines(param$tms, param$plugEst$X[6,] + 1.96 * sqrt(param$plugEst$covariance[6,6,]),type="s",lty=2, col=female_color)
lines(param$tms, param$plugEst$X[6,] - 1.96 * sqrt(param$plugEst$covariance[6,6,]),type="s",lty=2, col=female_color)

lines(param$tms, param$plugEst$X[8,],type="s", col=diff_color)
lines(param$tms, param$plugEst$X[8,] + 1.96 * sqrt(param$plugEst$covariance[8,8,]),type="s",lty=2, col=diff_color)
lines(param$tms, param$plugEst$X[8,] - 1.96 * sqrt(param$plugEst$covariance[8,8,]),type="s",lty=2, col=diff_color)

abline(a=0,b=0,col=2)

# Add legend
legend("topleft", legend = c("Males", "Females", "Difference"), col = c(male_color, female_color, diff_color), lty = 1,bty="n")

[Transforming cumulative hazard estimates] (Biometrika, 2018).↩︎
[On null hypothesis in survival analysis] (Biometrics, 2019)↩︎