© 2008 Royal Statistical Society 0964–1998/08/171299
J. R. Statist. Soc. A (2008) 171, Part 1, pp. 299–308
A joint latent class changepoint model to improve the prediction of time to graft failure
Francisca Galindo Garre,
TNO Quality of Life, Leiden, The Netherlands
Aeilko H. Zwinderman and Ronald B. Geskus
University of Amsterdam, The Netherlands
and Yvo W. J. Sijpkens
Leiden University Medical Center, The Netherlands
[Received October 2006. Revised July 2007]
Summary. The reciprocal of serum creatinine concentration, RC, is often used as a biomarker to monitor renal function. It has been observed that RC trajectories remain relatively stable after transplantation until a certain moment, when an irreversible decrease in the RC levels occurs. This decreasing trend commonly precedes failure of a graft. Two subsets of individuals can be distinguished according to their RC trajectories: a subset of individuals having stable RC levels and a subset of individuals who present an irrevocable decrease in their RC levels. To describe such data, the paper proposes a joint latent class model for longitudinal and survival data with two latent classes. RC trajectories within latent class one are modelled by an intercept-only random-effects model and RC trajectories within latent class two are modelled by a segmented random changepoint model. A Bayesian approach is used to fit this joint model to data from patients who had their first kidney transplantation in the Leiden University Medical Center between 1983 and 2002. The resulting model describes the kidney transplantation data very well and provides better predictions of the time to failure than other joint and survival models.
Keywords: Changepoint; Joint models; Kidney failure; Latent class; Predictions
1. Introduction
Many clinical studies have evaluated long-term success of renal transplantation (Paul, 1999; de Bruijne et al., 2003). These studies generally involve a collection of repeatedly measured marker data and an observation on a possibly censored time to graft failure. Since graft failure is usually preceded by chronic transplant dysfunction, repeated measurements of the reciprocal of serum creatinine concentration, RC, are commonly monitored to evaluate renal function. As an illustration, Fig. 1 presents longitudinal RC trajectories for four patients who were treated at Leiden University Medical Center. The horizontal axis represents time measured in months, with time starting 6 months after kidney transplantation, and the vertical axis represents 1000 times the reciprocal of serum creatinine concentrations measured in micromoles per litre. A full vertical line in Fig. 1 indicates that the patient has suffered a graft failure, and a broken vertical line indicates that a patient has been censored at this point. Note that for the two patients with
Address for correspondence: Francisca Galindo Garre, TNO Quality of Life, PO Box 2215, 2301 CE Leiden, The Netherlands.
E-mail: francisca.galindogarre@tno.nl
F. Galindo Garre, A. H. Zwinderman, R. B. Geskus and Y. W. J. Sijpkens
Fig. 1. RC trajectories for four selected cases: (a) patient 1; (b) patient 2; (c) patient 3; (d) patient 4 (horizontal axis, month; vertical axis, data)
graft failure a sudden change in the RC trajectory occurs. Detection of changes in renal function may provide essential information to predict the start of dialysis.
Several models have been proposed for jointly modelling both longitudinal and survival data (e.g. Faucett and Thomas (1996) and Wulfsohn and Tsiatis (1997)). These models commonly assume a linear mixed effects model for the longitudinal process, and a Cox regression model for the hazard rate. Although such joint models provide better estimates of the hazard rate than the Cox model with measured time-dependent covariates, simple linear mixed effects models are not always appropriate to describe the longitudinal process. In acquired immune deficiency syndrome clinical trials, for example, CD4 cell counts are usually monitored to assess immunological health of human immunodeficiency virus patients. This biomarker has high variability within patients, and more flexible mixed effects models are needed to describe its trajectory. Wang and Taylor (2001) proposed a mixed effects model with an integrated Ornstein–Uhlenbeck process that allows for interrelationships between consecutive measurements of CD4 cell counts to predict time to event in acquired immune deficiency syndrome clinical trials, and Brown et al. (2005) proposed a mixed effects model that includes cubic B-splines to describe CD4 cell counts and viral load trajectories. In the latter model, both the number of knots and the location of the knots must be chosen in advance. Linear mixed effects models are also not appropriate to describe RC trajectories. A better approach consists of estimating a segmented random-effects model with only two segments separated by a changepoint to be estimated for each person. Random-effects models with a changepoint were first proposed by Carlin et al. (1992) and were first used in the context of joint models by Pauler and Finkelstein (2002) to describe prostate-specific antigen trajectories in prostate cancer patients.
Prediction of Time to Graft Failure
A particular feature of our application is that only a subset of the patients experience a changepoint whereas RC trajectories remain stable for the rest of the patients. Pauler and Finkelstein (2002) also noted the presence of these two subsets within the prostate cancer patients, but they did not take it into account explicitly in their model. Instead they assumed that accurate estimates of the changepoint would only be reached if the data clearly indicate the existence of a changepoint. The presence of patient subsets was explicitly distinguished, however, in the model that was proposed by Pauler and Laird (2000) to identify subjects who switch regimes during a clinical trial, and in the latent class joint model that was proposed by Lin et al. (2002) to describe heterogeneity in prostate-specific antigen trajectories in subpopulations of prostate cancer patients.
The approach that is presented in this paper can be viewed as a combination of the models that were proposed by Pauler and Laird (2000) and by Lin et al. (2002). We suggest a joint model in which RC trajectories are described by means of a latent class model with two latent classes. In the first latent class, RC trajectories are modelled with an intercept-only random-effects model, and, in the second latent class, RC trajectories are modelled with a segmented random-effects model. A description of this joint model for longitudinal and time-to-event data can be found in Section 3. A Bayesian approach is used to estimate the posterior distribution of the model parameters. Results from applying the model to the kidney transplantation data are described in Section 4. The paper ends with a discussion.
2. Longitudinal kidney data
The model will be illustrated with data from 698 patients who had their first kidney transplantation in the Leiden University Medical Center between January 1st, 1983, and January 1st, 2002, and who had a functioning kidney for at least 6 months after transplantation. A detailed description of these data can be found in de Bruijne et al. (2003). Since we were interested in the study of long-term success of kidney transplantation, serum creatinine levels were collected for each patient at unspecified time points beyond 6 months after transplantation. As is common practice in assessing renal function, 1000 times the reciprocal of the serum creatinine concentration, RC, was used in the analyses. The mean number of recordings per patient was 76 (range 2–294). Late graft failure is defined as a return to dialysis. Patients who died with a functioning graft are counted as non-failures. In addition to RC, several time-invariant covariates were evaluated as risk factors of graft failure. These covariates were recipient age, recipient panel reactive antibodies, PRA, the level of cross-reactive groups that are shared between donor and recipient, CREG, and the number of treated acute rejection episodes. The covariates were chosen because they had significant effects in the analyses that were performed by de Bruijne et al. (2003).
3. The joint latent class changepoint model
In this section the joint model for the RC trajectories and graft failure outcomes is presented. For the RC trajectories, a two-latent-class model is proposed, where each latent class has its own random-effects model. For the survival outcomes, a Cox regression model is used, with a different baseline hazard for each latent class. The joint model will be estimated under a Bayesian approach. Marginal posterior distributions of the parameters of interest will be obtained with the statistical package WinBUGS 1.4.1 (Spiegelhalter et al., 2003). Next, we describe the log-likelihood function and the prior functions that are used in this paper.
Suppose that a sample of $N$ subjects is drawn with observations at time $t_{ij}$, with $j = 1, \ldots, m_i$ and $m_i$ denoting the number of observations for subject $i$. Let $Y_i(t_{ij})$ denote 1000 times the reciprocal of the serum creatinine concentration at time $t_{ij}$ and $Y_i$ be the complete vector of observations for subject $i$. Each subject is assumed to belong to one of $k = 1, 2$ latent classes and $L_{ik}$ is an unobserved variable that indicates to which latent class $k$ subject $i$ belongs. $L_{ik}$ is 1 if subject $i$ belongs to latent class $k$ and 0 otherwise. It is assumed that $(Y_i \mid L_{ik} = 1)$ has the $k$th component distribution $f_k(Y_i \mid \theta_{ik})$, which is a normal distribution that depends on a subject- and class-specific parameter vector $\theta_{ik}$. Denote by $\pi_k$ the proportion of the population belonging to latent class $k$, which satisfies $\sum_{k=1}^{2} \pi_k = 1$. The joint distribution of the observed data and the unobserved indicators conditional on the model parameters (Gelman et al., 2003) can be written as
$$p(Y, L \mid \theta, \pi) = p(L \mid \pi)\, p(Y \mid L, \theta) = \prod_{i=1}^{N} \prod_{k=1}^{2} \{\pi_k f_k(Y_i \mid \theta_{ik})\}^{L_{ik}}. \tag{1}$$
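As a rough illustration of equation (1), the following Python sketch evaluates its logarithm for a toy data set. The function names, the data layout (one list of class-specific means per subject) and all numerical values are assumptions made for this example only; they are not taken from the paper, whose model was fitted in WinBUGS.

```python
import math

def normal_logpdf(y, mean, var):
    """Log density of a univariate normal N(mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def complete_data_loglik(Y, means, L, pi, var):
    """Log of equation (1): sum_i sum_k L_ik * [log pi_k + log f_k(Y_i | theta_ik)].

    Y[i]        : observations for subject i
    means[i][k] : class-k means mu_i(t_ij), one per observation (hypothetical layout)
    L[i][k]     : 0/1 latent class indicator
    pi[k]       : class proportions, summing to 1
    var         : residual variance, shared by both classes
    """
    total = 0.0
    for y_i, mu_i, l_i in zip(Y, means, L):
        for k, pi_k in enumerate(pi):
            if l_i[k] == 1:
                log_f = sum(normal_logpdf(y, m, var) for y, m in zip(y_i, mu_i[k]))
                total += math.log(pi_k) + log_f
    return total

# Toy values, invented for illustration (not data from the paper)
Y = [[10.0, 10.5], [9.0, 6.0]]
means = [[[10.0, 10.0], [10.0, 8.0]], [[9.0, 9.0], [9.0, 6.5]]]
L = [[1, 0], [0, 1]]
pi = [0.7, 0.3]
print(complete_data_loglik(Y, means, L, pi, var=1.0))
```

Because each $L_{ik}$ is either 0 or 1, only the term of the class to which a subject is assigned contributes to the sum.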
3.1. Submodels for kidney transplant recipients
In this subsection the two competing individual-level models that are used within each latent class are described. The first model is an intercept-only random-effects model for patients whose kidney function remains stable over time. The second model is a segmented random-effects model with a constant level before a random changepoint and a linearly declining trend thereafter, for patients who suffer an irreversible kidney dysfunction. The value for subject $i$ at time $t_{ij}$ is given by
$$Y_i(t_{ij}) \sim N\{\mu_i(t_{ij}), \tau^2_{\varepsilon}\},$$
with
$$\{\mu_i(t_{ij}) \mid L_{i1}\} = a_{1i}$$
if individual $i$ belongs to subgroup 1, and
$$\{\mu_i(t_{ij}) \mid L_{i2}\} = a_{2i} - b_i^{+} (t_{ij} - c_i)^{+}$$
if individual $i$ belongs to subgroup 2. Here, $a_{ki}$, $b_i$ and $c_i$ are random effects and $\tau^2_{\varepsilon}$ denotes the random-error variance. $a_{ki}$ denotes the subject-specific intercept in latent class $k$. For the second model, $b_i$ denotes the subject-specific slope after the random changepoint, and $c_i$ represents the time at which the irreversible kidney dysfunction starts. The superscript '+' is an indicator function with $z^{+} = z$ for $z > 0$ and $z^{+} = 0$ otherwise.
It is assumed that $a_{1i}$ follows a normal distribution $N(\alpha, \sigma^2_{\alpha})$, the vector $(a_{2i}, b_i)$ follows a bivariate normal distribution $N(\mu, \sigma^2_{\mu})$ and $c_i$ follows a normal distribution $N(\gamma, \sigma^2_{\gamma})$, which is truncated below at 0. Minimum informative priors are used for the hyperparameters. A normal prior distribution $N(0, 500)$ is chosen for $\alpha$ and $\mu_1$, a normal prior distribution $N(1, 200)$ is chosen for $\mu_2$ and an $N(50, 200)$ distribution is chosen for $\gamma$, with 50 being the average number of observed months per patient. An inverse Wishart prior with location matrix
$$\begin{pmatrix} 1 & 0.005 \\ 0.005 & 1 \end{pmatrix}$$
and degrees of freedom $p = 4$ is chosen for $\sigma^2_{\mu}$, and inverse gamma(0.01, 0.01) distributions are chosen for $\tau^2_{\varepsilon}$, $\sigma^2_{\alpha}$ and $\sigma^2_{\gamma}$. Finally, each unobserved vector $L_i = (L_{i1}, L_{i2})$ is regarded as a binomial random variable with parameters $\pi$, whose natural conjugate prior distribution is a flat beta distribution with parameters equal to $(1, 1)$.
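The two class-specific mean trajectories of this subsection can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the subject-level values are invented, and the reading of $b_i^{+}$ as the positive part of the slope is an assumption based on the definition of the superscript '+' in the text.

```python
def positive_part(z):
    """z^+ = z for z > 0 and 0 otherwise."""
    return z if z > 0 else 0.0

def mean_stable(t, a1):
    """Class 1: intercept-only trajectory, constant over time."""
    return a1

def mean_changepoint(t, a2, b, c):
    """Class 2: level a2 up to changepoint c, then linear decline with slope b^+."""
    return a2 - positive_part(b) * positive_part(t - c)

# Invented subject-level values, for illustration only
a2, b, c = 12.0, 0.1, 50.0
for t in (0, 50, 60, 100):
    print(t, mean_stable(t, 11.0), mean_changepoint(t, a2, b, c))
```

Before the changepoint the two classes have the same functional form, which is why a subject whose decline has not yet started can be hard to assign to a class from the trajectory alone.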
3.2. Submodel for time to chronic rejection
For the survival time data, a Cox proportional hazards model is assumed with a different baseline hazard $\lambda_{0k}(t)$ for each latent class. de Bruijne et al. (2003) showed that the Cox proportional hazards model is a suitable model to describe the kidney transplantation data. Let $F_i$ denote the failure time for subject $i$. Since failure time can be right censored, let $C_i$ be the censoring time for subject $i$. Data from $(T_i, D_i)$ were observed, where $T_i = \min(F_i, C_i)$, and $D_i$ is the failure indicator, which takes the value of 1 if the failure is observed and of 0 otherwise. The Cox model specifies that the hazard of failing at time $t$ for a subject $i$ who belongs to subpopulation $k$ is equal to
$$\lambda_i(t \mid L_{ik} = 1) = \lambda_{0k}(t) \exp[\upsilon_k \{\mu_i(t) \mid L_{ik} = 1\} + \zeta w_i],$$
where $\upsilon_k$ is the parameter linking the trajectories in latent class $k$ to the hazard function and $\zeta$ is a parameter vector linking a vector $w_i$ of baseline covariates for subject $i$ to the hazard function. All these parameter vectors were assumed to have minimum informative normal priors with zero mean and large variance ($\sigma^2 = 1000$). Finally, we adopted independent inverse gamma(0.01, 0.01) priors for the baseline hazards $\lambda_{0k}(t)$. The application of this model to kidney transplantation data will be illustrated in the next section.
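To make the structure of the class-specific hazard concrete, the sketch below evaluates it for one subject. The function name, the piecewise-constant baseline and every numerical value are assumptions chosen for illustration; the paper itself does not state how the baseline hazard is discretized.

```python
import math

def hazard(t, baseline_k, upsilon_k, mu_i, zeta, w_i):
    """lambda_i(t | L_ik = 1) = lambda_0k(t) * exp(upsilon_k * mu_i(t) + zeta . w_i).

    baseline_k : function returning the class-k baseline hazard at time t
    mu_i       : function returning the subject's class-k mean RC level at time t
    zeta, w_i  : coefficient and baseline covariate vectors of equal length
    """
    linear = upsilon_k * mu_i(t) + sum(z * w for z, w in zip(zeta, w_i))
    return baseline_k(t) * math.exp(linear)

# Hypothetical piecewise-constant baseline (per month), invented for illustration
def baseline_class2(t):
    return 0.01 if t < 60 else 0.02

# Segmented trajectory with invented values: level 12, slope 0.1 after month 50
def mu_class2(t):
    return 12.0 - 0.1 * max(t - 50.0, 0.0)

# A negative link parameter makes the hazard rise as RC falls
for t in (40, 80, 120):
    print(t, hazard(t, baseline_class2, -0.2, mu_class2, [0.03], [45.0]))
```

With a negative $\upsilon_k$, the hazard stays flat while the trajectory is stable and increases once the decline after the changepoint begins.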
3.3. Predicting time to event
Suppose that a new sequence of serum creatinine measurements is available for patient $i$ with a kidney transplant. On the basis of this information we can then obtain the probability of a graft failure at a certain moment during the follow-up. A method for doing this is based on calculating the posterior predictive survival function $S_i$ to $t + \Delta t$, given the value of $Y_i(t)$ at time $t$ and the fact that the patient has survived to time $t$:
$$S_i\{t + \Delta t \mid T \ge t, Y_i(t)\} = \sum_{k=1}^{2} P(L_{ik} = 1 \mid \pi)\, P_i\{t + \Delta t \mid T \ge t, Y_i(t), L_i = k\}, \tag{2}$$
which is a mixture of survival distributions for patient $i$ conditional on group $k$. The survival function for patient $i$ in group $k$ is given by
$$P_i\{t + \Delta t \mid T \ge t, Y_i(t), L_i = k\} = \frac{\displaystyle\int \exp\Bigl(-\sum_{u=0}^{u=t+\Delta t} \lambda_{0k}(u) \exp[\upsilon_k \{\mu_i(t) \mid \theta_{ik}, L_{ik} = 1\} + \zeta w_i]\Bigr) f\{\theta_{ik} \mid Y_i(t), T \ge t\}\, d\theta_{ik}}{\displaystyle\int \exp\Bigl(-\sum_{u=0}^{u=t} \lambda_{0k}(u) \exp[\upsilon_k \{\mu_i(t) \mid \theta_{ik}, L_{ik} = 1\} + \zeta w_i]\Bigr) f\{\theta_{ik} \mid Y_i(t), T \ge t\}\, d\theta_{ik}},$$
where $\theta_{ik}$ denotes a subject- and class-specific parameter vector. This integral is approximated by using Monte Carlo simulation.
Equation (2) is similar to the survival function of the cure rate models (see Ibrahim et al. (2001), chapter 5). In the latter models, the survival function for cured patients is 1 because these patients will never experience the event. We cannot talk about a cured group in our case, however, because some patients suffer an acute RC decline that occurs so shortly before graft rejection that it is not observed. This is why our model includes a survival function for the patients with a stable RC trajectory as well.
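The Monte Carlo approximation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: posterior draws of $\theta_{ik}$ are represented as a hypothetical list of mean-trajectory functions, the cumulative hazard is accumulated over a monthly grid with the trajectory evaluated at each grid time, the covariate term $\zeta w_i$ is passed as a single number, and all numerical values are invented.

```python
import math

def cumulative_hazard(upper, baseline_k, upsilon_k, mu, zeta_w):
    """Sum of lambda_0k(u) * exp(upsilon_k * mu(u) + zeta_w) over u = 0, ..., upper."""
    return sum(baseline_k(u) * math.exp(upsilon_k * mu(u) + zeta_w)
               for u in range(upper + 1))

def class_survival(t, dt, draws, baseline_k, upsilon_k, zeta_w):
    """Monte Carlo ratio for one class: survival to t + dt given survival to t.

    draws: mean-trajectory functions mu(u), one per sampled theta_ik
           (assumed to be posterior draws given the subject's data up to t).
    """
    num = sum(math.exp(-cumulative_hazard(t + dt, baseline_k, upsilon_k, mu, zeta_w))
              for mu in draws)
    den = sum(math.exp(-cumulative_hazard(t, baseline_k, upsilon_k, mu, zeta_w))
              for mu in draws)
    return num / den

def predictive_survival(t, dt, class_probs, class_draws, baselines, upsilons, zeta_w):
    """Equation (2): mixture of the class-specific survival functions."""
    return sum(p * class_survival(t, dt, draws, base, ups, zeta_w)
               for p, draws, base, ups in zip(class_probs, class_draws, baselines, upsilons))

# Invented draws: two stable trajectories and two changepoint trajectories
stable_draws = [lambda u: 10.0, lambda u: 11.0]
change_draws = [lambda u, c=c: 12.0 - 0.1 * max(u - c, 0.0) for c in (40.0, 60.0)]
s = predictive_survival(
    t=60, dt=12,
    class_probs=[0.7, 0.3],
    class_draws=[stable_draws, change_draws],
    baselines=[lambda u: 0.001, lambda u: 0.01],
    upsilons=[-0.2, -0.2],
    zeta_w=0.0,
)
print(s)
```

Averaging the numerator and denominator separately over the same draws mirrors the ratio of integrals in the class-specific survival function; both classes contribute to the final mixture, in line with the remark that the stable class is not treated as cured.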
4. Application to kidney transplant data
4.1. Estimates for the joint model
We now describe the analysis of the kidney transplant data with the joint model that was proposed in Section 3. Using WinBUGS 1.4.1 (Spiegelhalter et al., 2003), two Markov chain Monte