# Joint-Model-Based Analysis

Based on the shape of the profiles presented in Figure 8.1, Renard et al. (2003) applied the following LMM to describe the dependence of the logarithm of PSA on time $t$ and treatment $Z$:

with

where $e_{ijk} \sim N(0, \sigma^2)$ and $(U_{1,ij}, U_{2,ij}, U_{3,ij}) \sim N(0, G)$. For the survival data, they applied model (8.2) with

The joint model can be fitted to the data with the help of the SAS macro longsurvtst2_2stage.sas, available at http://ibiostat.be/online-resources. The macro implements the EM algorithm developed in Henderson, Diggle, and Dobson (2000). In particular, the following syntax can be used:

    %longsurv(dataset=psadat, subj=patid, claslong=countryn, ylong=logpsa,
              time=timeyr, repeated=timyrcls,
              W1=int timeyr timysqrt,
              Xlong=%str(countryn timeyr*countryn timysqrt*countryn
                         treat*countryn treat*timeyr*countryn
                         treat*timysqrt*countryn),
              ysurv=survyr, censvar=survcens, classurv=countryn,
              xsurv=countryn treat*countryn,
              r2t_lg0=treat*countryn,
              r2t_lg1=timeyr*treat*countryn,
              r2t_lg2=timysqrt*treat*countryn,
              r2t_sv=treat*countryn,
              noint=1, niter=500, label=psa45asqrt, tbeg=0, tend=9, step=0.05);

Argument dataset=psadat indicates the name of the dataset. The dataset is in the “long” format, i.e., it contains as many records per patient as there are PSA measurements for the patient. The first few records are shown below:

| patid | countryn | treat | survyr | survcens | timeyr | timysqrt | logpsa | timyrcls |
|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 1 | 2.30527 | 0 | 0.00274 | 0.05232 | 3.83081 | 0.00274 |
| 2 | 7 | 1 | 0.65708 | 1 | 0.00000 | 0.00000 | 2.63189 | 0.00000 |
| 2 | 7 | 1 | 0.65708 | 1 | 0.04107 | 0.20265 | 2.46810 | 0.04107 |
| 2 | 7 | 1 | 0.65708 | 1 | 0.09309 | 0.30510 | 3.14415 | 0.09309 |
| 2 | 7 | 1 | 0.65708 | 1 | 0.16701 | 0.40867 | 3.15274 | 0.16701 |
| 2 | 7 | 1 | 0.65708 | 1 | 0.34771 | 0.58967 | 2.99072 | 0.34771 |
| 3 | 7 | 1 | 1.55784 | 1 | 0.00274 | 0.05232 | 3.69138 | 0.00274 |

Variable patid is the patient identifier, countryn is a numerical code for the trial-by-country group, while treat is the treatment indicator (equals 0 for the control and 1 for the experimental treatment). Survival time (in years) and survival status of the patient (0 for alive and 1 for death) are provided in the variables survyr and survcens, respectively. Variable timeyr indicates the time (in years from randomization) at which the log-PSA value logpsa was obtained. Variable timysqrt provides the square root of timeyr, while timyrcls is a copy of timeyr that will be used as a factor, not as a numeric variable.
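The two derived time variables can be created in a DATA step before the macro is called. The following is only a minimal sketch: the input dataset name psaraw is hypothetical (it does not appear in the text) and is assumed to hold one record per PSA measurement, with timeyr already expressed in years from randomization.

```sas
/* Sketch: derive the time variables described above.              */
/* PSARAW is a hypothetical input dataset, not named in the text.  */
data psadat;
  set psaraw;
  timysqrt = sqrt(timeyr);  /* square root of the measurement time */
  timyrcls = timeyr;        /* copy of timeyr, to be used as a factor */
run;

/* Sorting by patient and time is a precaution assumed here,       */
/* not a documented requirement of the macro.                      */
proc sort data=psadat;
  by patid timeyr;
run;
```

Because timeyr is displayed to five decimals in the listing, recomputing the square root from the printed values can differ from the printed timysqrt in the last digit.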

Argument subj=patid indicates the variable containing patient identifiers, while claslong=countryn specifies that variable countryn is the factor (class-variable) defining the grouping of patients. The longitudinal surrogate endpoint (log-PSA) is identified by ylong=logpsa, and time=timeyr indicates the variable containing the times of the measurements of the endpoint. Argument repeated=timyrcls specifies that variable timyrcls is to be used as a factor defining the order of the measurements of the surrogate endpoint within each patient.

The structure of the longitudinal data model (8.1) is defined by arguments W1 and Xlong. In particular, W1=int timeyr timysqrt specifies the random-effects structure (8.10) of model (8.9), i.e., a random intercept and random slopes for the measurement time and its square root. The fixed-effects part of the model is specified by the string given in the Xlong argument. In particular, group-by-treatment-specific slopes for the measurement time and the square root of the time are assumed, as specified in (8.9).
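Read literally, the W1 specification corresponds to a patient-specific random term of the form sketched below. The expression is inferred from the argument W1=int timeyr timysqrt and from the three random effects introduced with the LMM; the symbol $W_{1,ij}(t)$ and the exact layout of (8.10) are assumptions, not a quotation of that equation.

```latex
W_{1,ij}(t) = U_{1,ij} + U_{2,ij}\, t + U_{3,ij}\, \sqrt{t}
```

Here the first term plays the role of the random intercept (int), while the second and third terms are the random slopes for timeyr and timysqrt, respectively.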

The structure of the survival data model (8.2) is defined by arguments ysurv, censvar, xsurv, and classurv. In particular, ysurv=survyr and censvar=survcens indicate the variables containing, respectively, the failure-time true endpoint and the corresponding event indicator. Argument xsurv=countryn treat*countryn defines the fixed-effects part of (8.2), while classurv=countryn indicates that countryn is to be treated as a factor (class-variable). Hence, xsurv=countryn treat*countryn implies the use of group-specific intercepts and treatment effects. Finally, the random-effects structure $W_2$ of (8.2) is fixed to be equal to (8.11).

Arguments r2t_lg0, r2t_lg1, and r2t_lg2 specify the terms in the Xlong argument that define the group-specific treatment effects on the surrogate endpoint. In our case, we have group-specific overall effects (r2t_lg0=treat*countryn), linear-time-dependent treatment effects (r2t_lg1=timeyr*treat*countryn), and square-root-time-dependent treatment effects (r2t_lg2=timysqrt*treat*countryn). Similarly, r2t_sv=treat*countryn specifies the terms in the xsurv argument that define the group-specific treatment effects on the true endpoint. In our case, these are group-specific overall effects (r2t_sv=treat*countryn).

Finally, argument noint=1 implies that no intercept is to be used in the construction of the design matrices for the LMM and the PH model, while the maximum number of iterations of the EM algorithm is fixed at niter=500. Argument label provides the prefix for the names of the output datasets that are produced by the macro. It is also possible to specify a suffix by using the suff argument.

The macro produces several output datasets. Treatment effects are stored in a dataset named *label*_trt_effect_*suff*. Estimates of the fixed-effects coefficients and variance components of the longitudinal model (8.1) are stored in a file named *label*_long_est*suff*, while the estimates of the fixed- and random-effects coefficients of the PH model (8.2) are stored in a file named *label*_surv_est*suff*. The file named *label*_expect*suff* contains the predicted values of the random effects $U_{k,ij}$ of $W_{1,ij}$. Finally, the file named *label*_r2indiv*suff* contains the estimates of $R^2_{\text{indiv}}(t, t)$ for a grid of values of $t$ between tbeg=0 and tend=9 in steps of step=0.05. In particular, two estimates of $R^2_{\text{indiv}}(t, t)$ are provided. One (stored in variable “R2IND”) is computed by using the estimated values of coefficients $\gamma$ and the variance-covariance matrix $G$. The other (stored in variable “R2INDH”) is obtained from the sample variance-covariance matrix of the predicted values of the random-effects vector $(W_{1,ij}, W_{2,ij})'$. Note that all the output files are stored in the (permanent) SAS library named longsurv. Thus, before running the SAS macro longsurvtst2_2stage.sas, library longsurv should be created by using an appropriate libname statement.
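The library setup mentioned above might look as follows. This is a sketch only: both directory paths are hypothetical placeholders that must be adapted to the local installation, and loading the macro with %include is one common way to make it available, assumed here rather than prescribed by the macro's authors.

```sas
/* Sketch: create the permanent library and load the macro code.   */
/* Both paths below are hypothetical; replace them with local ones. */
libname longsurv "C:\surrogate\output";

/* Compile the macro definitions stored in the downloaded file.    */
%include "C:\surrogate\macros\longsurvtst2_2stage.sas";
```

Once these two statements have run without errors, the %longsurv call shown earlier can be submitted in the same SAS session.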

The estimated values of $\gamma_1$, $\gamma_2$, $\gamma_3$, and $\gamma_4$ for model (8.11) are equal to 0.191, 0.288, 0.321, and 0.198, respectively. The estimated value of matrix $G$ is

and the residual variance $\sigma^2 = 0.128$. Using these values and equations similar to (8.6)-(8.8), it is possible to compute the values of $R^2_{\text{indiv}}(t_1, t_2)$, capturing the strength of the individual-level association between the longitudinal surrogate and clinical event processes. The solid curve in Figure 8.2 presents the estimated values of the section $R^2_{\text{indiv}}(t, t) = R^2_{\text{indiv}}(t)$. The dashed curve presents the estimates obtained from the sample variance-covariance matrix of the predicted values of the random-effects vector $(W_{1,ij}, W_{2,ij})'$. The two estimates are close to each other. Both suggest that the strength of the association is initially low but increases over time, reaching a plateau at a value of about 0.9 at one year. Thus, one could conclude that PSA levels measured close to the initiation of a therapy provide relatively little information about patient survival (Renard et al., 2003). With passing time, especially during the first year of treatment, the levels become more informative; afterward, there is no further gain in information.

As mentioned in Section 8.2, the $R^2_{\text{indiv}}(t)$ curve should be interpreted with caution as it is strongly dependent on the assumed form of the model.

The estimated treatment effects for the LMM and the PH models are given in Table 8.1.

Figure 8.3 presents the scatter plot matrix of the estimates (the sizes of the circles are proportional to the sample sizes of the corresponding trial-by-country groups). The bottom row contains the plots illustrating the association between $\alpha_{0,i}$, $\alpha_{1,i}$, and $\alpha_{2,i}$ (on the horizontal axis) and $\beta_i$ (on the vertical axis). The plots suggest that there is not much association between the treatment-effect estimates. The value of $R^2_{\text{trial}(f)}$, obtained from regressing $\beta_i$ on $(\alpha_{0,i}, \alpha_{1,i}, \alpha_{2,i})'$, is equal to 0.52 (delta-method-based 95% CI: [0.18, 0.86]). The estimated linear regression equation (8.5) is

with standard errors of the estimated coefficients $\lambda_0$–$\lambda_3$ equal to 0.236, 0.336, 0.137, and 0.189, respectively.

FIGURE 8.2

*Prostate Cancer. Individual-level association, as measured by $R^2_{\text{indiv}}(t)$.*

The value of $R^2_{\text{trial}(f)}$, obtained from a linear regression model weighted by the group-specific sample size, is equal to 0.61 (delta-method-based 95% CI: [0.31, 0.91]). The estimated linear regression equation (8.5) is

$$\beta_i = 0.342 + 0.656 \times \alpha_{0,i} + 0.422 \times \alpha_{1,i} + 0.663 \times \alpha_{2,i},$$

with standard errors of the estimated coefficients $\lambda_0$–$\lambda_3$ equal to 0.186, 0.263, 0.128, and 0.158, respectively.
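The unweighted and weighted trial-level regressions can be sketched with PROC REG. The dataset name trteff and the variable names beta, alpha0, alpha1, alpha2, and n are all hypothetical: they stand for a dataset with one record per trial-by-country group holding the estimated treatment effects and the group sample size. The actual layout of the treatment-effects dataset produced by the macro is not described here, so it would first have to be reshaped accordingly.

```sas
/* Sketch: trial-level regression of the treatment effect on the   */
/* true endpoint on the three effects on the surrogate.            */
/* TRTEFF and all variable names are hypothetical.                 */

/* Unweighted fit: R-Square in the output estimates R2 trial(f).   */
proc reg data=trteff;
  model beta = alpha0 alpha1 alpha2;
run;
quit;

/* Fit weighted by the group-specific sample size.                 */
proc reg data=trteff;
  weight n;
  model beta = alpha0 alpha1 alpha2;
run;
quit;
```

In both runs, the parameter estimates table gives the coefficients of the regression equation with their standard errors, and the fit statistics give the corresponding R-square.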

Neither of the $R^2_{\text{trial}(f)}$ values is large enough to conclude that PSA measurements can be considered a valid surrogate for OS.
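The delta-method intervals quoted above can be checked with a short DATA step. The sketch assumes the standard error commonly used for a coefficient of determination, $\sqrt{4R^2(1-R^2)^2/(N-3)}$, where $N$ is the number of groups. The value $N = 19$ used below is an assumption made for illustration only; it is not stated in this section and should be replaced by the actual number of trial-by-country groups.

```sas
/* Sketch: delta-method 95% confidence interval for an R-square.   */
/* NGROUPS = 19 is an assumed value, not taken from the text.      */
data r2ci;
  input r2 ngroups;
  se    = sqrt(4 * r2 * (1 - r2)**2 / (ngroups - 3));
  z     = probit(0.975);            /* normal quantile, about 1.96 */
  lower = max(0, r2 - z * se);      /* truncate to the [0,1] range */
  upper = min(1, r2 + z * se);
  datalines;
0.52 19
0.61 19
;
run;

proc print data=r2ci noobs;
  var r2 ngroups se lower upper;
  format se lower upper 5.2;
run;
```

Under the assumed number of groups, the limits agree with the reported intervals after rounding to two decimals, which is consistent with, but does not prove, the use of this formula.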