## Survival Model With Change Point in Both Hazard and Regression Parameters

Dupuy Jean-François

Laboratoire de Statistique et Probabilités, Université Paul Sabatier, 118, route de Narbonne, 31062 Toulouse cedex 4, France [email protected]

### 1 Introduction

In this paper, we consider a parametric survival regression model with a change-point in both the hazard and the regression parameters. The change-point occurs at an unknown time point. Estimators of the change-point, hazard and regression parameters are proposed and shown to be consistent.

Let $T$ be a random failure time variable. The distribution of $T$ is usually specified by the hazard function $\lambda = f/(1 - F)$, where $f$ and $F$ are the density and distribution functions of $T$ respectively. Change in hazard at an unknown time point has been extensively studied. Such a change may occur in medical studies after a major operation (e.g. bone marrow transplant) or in reliability (e.g. change of failure rate following a temperature increase). Several authors have considered the following change-point hazard model:

$$\lambda(t) = \alpha + \theta 1_{\{t > \tau\}}, \qquad (1)$$

with $\alpha > 0$, $\alpha + \theta > 0$, and where $\tau > 0$ is an unknown change-point time assumed to lie in a known interval $[\tau_1, \tau_2]$ such that $0 < \tau_1 < \tau_2 < \infty$ (see [CCH94], [MFP85], [MW94], [NRW84], [PN90]). Wu et al. [WZW03] extend (1) to the hazard model

$$\lambda(t) = (\alpha + \theta 1_{\{t > \tau\}}) \lambda_0(t; \gamma), \qquad (2)$$

where $\lambda_0(\cdot; \gamma)$ is a baseline hazard function depending on an unknown parameter $\gamma$. This model allows the hazard to be nonconstant before and after the change-point. In this paper, we extend (1) in a different way: we allow $\lambda(\cdot)$ to vary among individuals by incorporating covariates in the change-point model (1). Moreover, since the effect of covariates (e.g. age of a patient) may also change at the unknown time point $\tau$, we allow for a change in the regression parameter at $\tau$ (e.g. the risk of death of elderly patients relative to young patients may increase after a major surgical operation). We specify the following hazard model:

$$\lambda(t \mid Z) = (\alpha + \theta 1_{\{t > \tau\}}) \exp\{(\beta + \gamma 1_{\{t > \tau\}})^T Z\}, \qquad (3)$$

where $\alpha > 0$, $\alpha + \theta > 0$, $\tau$ is an unknown change-point time, and $\beta$ and $\gamma$ are unknown regression coefficients. Previous work on change-point regression models has mainly focused on linear models (we refer to [CK94], [Hor95], [HHS97], [Jar03], [KQS03]). A detailed treatment and numerous references can be found in [CH97]. Gurevich and Vexler [GV05] consider the change-point problem in the logistic regression model. Luo et al. [LTC97] and Pons ([Pon02], [Pon03]) consider a Cox model involving a change-point in the regression parameter.
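To make the structure of model (3) concrete, the following minimal sketch evaluates the hazard, its cumulative version and the survival function for a scalar covariate. The function names and the numerical parameter values are illustrative assumptions, not part of the paper.

```python
import math

def hazard(t, z, alpha, theta, beta, gamma, tau):
    """Hazard of model (3) at time t for a scalar covariate value z."""
    after = 1.0 if t > tau else 0.0          # indicator 1{t > tau}
    return (alpha + theta * after) * math.exp((beta + gamma * after) * z)

def cumulative_hazard(t, z, alpha, theta, beta, gamma, tau):
    """Integral of the hazard over [0, t]: piecewise linear with a kink at tau."""
    before = alpha * math.exp(beta * z) * min(t, tau)
    after = (alpha + theta) * math.exp((beta + gamma) * z) * max(t - tau, 0.0)
    return before + after

def survival(t, z, alpha, theta, beta, gamma, tau):
    """Conditional survival function exp(-cumulative hazard)."""
    return math.exp(-cumulative_hazard(t, z, alpha, theta, beta, gamma, tau))

# Illustrative (assumed) parameter values with alpha > 0 and alpha + theta > 0.
params = dict(alpha=0.5, theta=0.5, beta=0.2, gamma=0.3, tau=1.0)
print(hazard(0.5, 1.0, **params))   # before the change-point: 0.5 * exp(0.2)
print(hazard(2.0, 1.0, **params))   # after the change-point: 1.0 * exp(0.5)
```

Both the baseline level and the covariate effect jump at $\tau$, while the cumulative hazard stays continuous.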

In Section 2, we give a brief review of recent results on model (2), and we construct estimators for model (3). In Section 3, we prove that these estimators are consistent. Technical details are given in the appendix.

### 2 Notations and construction of the estimators

#### 2.1 Preliminaries

We consider a sample of $n$ subjects observed in the time interval $[0, \zeta]$. Let $T_i^0$ be the survival time of the $i$th individual and $Z_i$ be the related covariate. $Z_i$ is assumed to be a $q$-dimensional random variable. Suppose that $Z_i$ is bounded and $\mathrm{var}(Z_i) > 0$. We assume that $T_i^0$ may be right censored at a noninformative censoring time $C_i$ such that $C_i$ and $T_i^0$ are independent conditionally on $Z_i$. For individual $i$, let $T_i = T_i^0 \wedge C_i$ be the observed time and $\Delta_i = 1_{\{T_i^0 \le C_i\}}$ be the censoring indicator.

The data consist of $n$ independent triplets $X_i = (T_i, \Delta_i, Z_i)$, $i = 1, \ldots, n$.
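The sampling scheme above can be sketched as follows. Failure times are drawn by inverting the cumulative hazard of model (3). The binary covariate, the exponential censoring distribution and all function names are assumptions made for illustration only.

```python
import math
import random

def draw_failure_time(z, alpha, theta, beta, gamma, tau, rng):
    """Draw T0 given Z = z by inverting the cumulative hazard of model (3)."""
    e = rng.expovariate(1.0)                       # the cumulative hazard at T0 is unit exponential
    rate_before = alpha * math.exp(beta * z)
    rate_after = (alpha + theta) * math.exp((beta + gamma) * z)
    if e <= rate_before * tau:
        return e / rate_before                     # failure before the change-point
    return tau + (e - rate_before * tau) / rate_after

def simulate_sample(n, alpha, theta, beta, gamma, tau, zeta, censor_rate, seed=0):
    """Return n triplets (T_i, Delta_i, Z_i) observed on [0, zeta]."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = float(rng.random() < 0.5)              # assumed: binary covariate, bounded, var(Z) > 0
        t0 = draw_failure_time(z, alpha, theta, beta, gamma, tau, rng)
        c = min(rng.expovariate(censor_rate), zeta)  # assumed: censoring drawn independently of T0
        sample.append((min(t0, c), int(t0 <= c), z))
    return sample

sample = simulate_sample(200, alpha=1.0, theta=1.0, beta=0.2, gamma=0.5,
                         tau=0.5, zeta=2.0, censor_rate=0.2)
print(sample[:3])
```

Truncating the censoring time at $\zeta$ mimics the end of follow-up, so every observed time lies in $[0, \zeta]$.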

For model (2), which has no covariates, [WZW03] follow [CCH94] and define $Y_n(t)$ as

where $0 < p < 1$, $\hat{\Lambda}_{NA}(t)$ is the Nelson-Aalen estimator of $\Lambda(t) = \int_0^t \lambda(s)\,ds$, $\hat{\Lambda}_0(t) = \int_0^t \lambda_0(s; \hat{\gamma}_n)\,ds$ estimates $\Lambda_0(t) = \int_0^t \lambda_0(s; \gamma_0)\,ds$, $\hat{\gamma}_n$ is a consistent estimator of the true $\gamma_0$, and $\zeta > \tau_2$ is a finite time point such that $P(T > \zeta) > 0$. The asymptotic version of $Y_n$ is

Wu et al. [WZW03] remark that if the true $\theta_0$ is strictly positive, then $Y(t)$ is increasing on $[0, \tau]$ and decreasing on $[\tau, \zeta]$, hence they define an estimator $\hat{\tau}_n$ of $\tau$ by

where $Y_n(t\pm)$ is the right or left-hand limit of $Y_n(\cdot)$ at $t$. If $\theta_0 < 0$, $Y(t)$ is decreasing on $[0, \tau]$ and increasing on $[\tau, \zeta]$. In this case, Wu et al. [WZW03] define

Wu et al. [WZW03] show the following theorems under some regularity conditions:

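Theorem 1. The estimator $\hat{\tau}_n$ of $\tau$ defined in (4) or (5) is consistent.

Let $l_n(\alpha, \theta, \tau, \gamma)$ denote the log-likelihood based on $(T_i, \Delta_i)$ ($i = 1, \ldots, n$), and let $\hat{\alpha}_n(\tau, \gamma)$ and $\hat{\theta}_n(\tau, \gamma)$ be respectively the solutions of $\partial l_n(\alpha, \theta, \tau, \gamma)/\partial\alpha = 0$ and $\partial l_n(\alpha, \theta, \tau, \gamma)/\partial\theta = 0$ for given $(\tau, \gamma)$.

Theorem 2. $\hat{\alpha}_n(\hat{\tau}_n, \hat{\gamma}_n)$ and $\hat{\theta}_n(\hat{\tau}_n, \hat{\gamma}_n)$ are consistent estimators of $\alpha$ and $\theta$ respectively.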

In this paper, a different approach is taken to prove consistency of the estimators in model (3). It relies on modern empirical process theory as presented in [Van98] and [VW96].
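The review in this subsection relies on the Nelson-Aalen estimator of the cumulative hazard. As a reminder of how it is computed from right-censored data, here is a minimal sketch using its standard definition (number of failures divided by number at risk, summed over failure times). The function names are hypothetical and the code is not taken from [WZW03].

```python
def nelson_aalen(times, events):
    """Return the jump times and values of the Nelson-Aalen estimator.

    times  : observed times T_i
    events : indicators Delta_i (1 = failure observed, 0 = censored)
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    jump_times, values = [], []
    cum = 0.0
    k = 0
    while k < len(order):
        t = times[order[k]]
        deaths = removed = 0
        # Group tied observations sharing the same observed time t.
        while k < len(order) and times[order[k]] == t:
            deaths += events[order[k]]
            removed += 1
            k += 1
        if deaths > 0:
            cum += deaths / at_risk      # increment: failures / number at risk just before t
            jump_times.append(t)
            values.append(cum)
        at_risk -= removed
    return jump_times, values

def evaluate_step(jump_times, values, t):
    """Value at t of the right-continuous step function defined by the jumps."""
    result = 0.0
    for s, v in zip(jump_times, values):
        if s <= t:
            result = v
    return result

jumps, vals = nelson_aalen([1.0, 2.0, 3.0], [1, 0, 1])
print(jumps, vals)
```

In this toy example the censored observation at time 2 contributes no jump but leaves the risk set, so the jump at time 3 has size 1.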

#### 2.2 The estimators

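We consider the statistical model defined by the family of densities

$$p_\varphi(X) = \{\alpha e^{\beta^T Z}\}^{\Delta} \exp\left(-\alpha e^{\beta^T Z} T\right) 1_{\{T \le \tau\}} + \{(\alpha + \theta) e^{(\beta + \gamma)^T Z}\}^{\Delta} \exp\left(-\alpha e^{\beta^T Z} \tau - (\alpha + \theta) e^{(\beta + \gamma)^T Z} (T - \tau)\right) 1_{\{T > \tau\}},$$

where $\varphi = (\tau, \xi^T)^T$, with $\xi = (\alpha, \theta, \beta^T, \gamma^T)^T$. Here $\alpha$, $\theta$, and the regression parameters $\beta$ and $\gamma$ belong respectively to bounded subsets $A \subset \mathbb{R}_+ \setminus \{0\}$, $B \subset \mathbb{R} \setminus \{0\}$, $C \subset \mathbb{R}^q$, and $D \subset \mathbb{R}^q \setminus \{0\}$. The change-point $\tau$ is a parameter lying in the open interval $]0, \zeta[$. Let $\varphi_0 = (\tau_0, \xi_0^T)^T$ be the true parameter value, lying in $\Phi = \,]0, \zeta[ \times A \times B \times C \times D$. We suppose that $\varphi_0$ is such that $\theta_0 \neq 0$ and $\gamma_0 \neq 0$, so that a change-point actually occurs. We suppose also that $\alpha_0 + \theta_0 > 0$.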

Under the true parameter values, we denote by $P_0 = P_{\varphi_0}$ the probability distribution of the variables $(T_i^0, C_i, Z_i)$ and by $E_0$ the corresponding expectation.

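The log-likelihood function based on the observations $X_i$ ($i = 1, \ldots, n$) is

$$l_n(\varphi) = \sum_{i \le n} \Big\{ N_i(\tau) [\ln \alpha + \beta^T Z_i] - 1_{\{T_i \le \tau\}} \alpha e^{\beta^T Z_i} T_i + [N_i(\infty) - N_i(\tau)] [\ln(\alpha + \theta) + (\beta + \gamma)^T Z_i] - 1_{\{T_i > \tau\}} \big[ \alpha e^{\beta^T Z_i} \tau + (\alpha + \theta) e^{(\beta + \gamma)^T Z_i} (T_i - \tau) \big] \Big\},$$

where $N_i(t) = \Delta_i 1_{\{T_i \le t\}}$ is the counting process for death of individual $i$. The estimator $\hat{\varphi}_n$ is obtained as follows: for a fixed $\tau$, we let $\hat{\xi}_n(\tau)$ be the value of $\xi$ which maximizes the log-likelihood $l_n(\varphi)$. Then $\tau_0$ is estimated by $\hat{\tau}_n$ which satisfies the relationship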

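$$\hat{\tau}_n = \inf\Big\{ \tau \in \,]0, \zeta[ \;:\; \max\big( l_n(\tau, \hat{\xi}_n(\tau)),\, l_n(\tau+, \hat{\xi}_n(\tau+)) \big) = \sup_{\tau \in ]0, \zeta[} l_n(\tau, \hat{\xi}_n(\tau)) \Big\},$$

where $l_n(\tau+, \hat{\xi}_n(\tau+))$ is the right-hand limit of $l_n$ at $\tau$. Then the maximum likelihood estimator of $\xi$ is obtained as $\hat{\xi}_n = \hat{\xi}_n(\hat{\tau}_n)$.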
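The two-step construction above (maximize over $\xi$ for fixed $\tau$, then over $\tau$) can be illustrated by a grid search. The sketch below is a simplification under stated assumptions, not the estimator of the paper: it assumes a single binary covariate, so that the inner maximization has a closed form (one constant rate, deaths divided by exposure, per segment and covariate group); it ignores the boundedness constraints on the parameter sets; and it replaces the infimum and right-hand limits by a search over a finite grid. All names and numerical values are hypothetical.

```python
import math
import random

def simulate(n, alpha, theta, beta, gamma, tau, zeta, seed=1):
    """Assumed data generator: binary Z, failure times from model (3), censoring at min(Exp(0.2), zeta)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        r0 = alpha * math.exp(beta * z)
        r1 = (alpha + theta) * math.exp((beta + gamma) * z)
        e = rng.expovariate(1.0)
        t0 = e / r0 if e <= r0 * tau else tau + (e - r0 * tau) / r1
        c = min(rng.expovariate(0.2), zeta)
        sample.append((min(t0, c), int(t0 <= c), z))
    return sample

def profile_loglik(sample, tau):
    """Return (maximized log-likelihood at tau, maximizer xi), or (None, None) if a cell has no death."""
    deaths = {(s, z): 0 for s in (0, 1) for z in (0, 1)}      # keyed by (segment, covariate group)
    exposure = {(s, z): 0.0 for s in (0, 1) for z in (0, 1)}
    for t, delta, z in sample:
        exposure[0, z] += min(t, tau)            # time at risk before tau
        exposure[1, z] += max(t - tau, 0.0)      # time at risk after tau
        if delta:
            deaths[int(t > tau), z] += 1
    if min(deaths.values()) == 0:
        return None, None
    rate = {k: deaths[k] / exposure[k] for k in deaths}
    value = sum(deaths[k] * math.log(rate[k]) - deaths[k] for k in deaths)
    alpha = rate[0, 0]
    beta = math.log(rate[0, 1] / rate[0, 0])
    theta = rate[1, 0] - alpha
    gamma = math.log(rate[1, 1] / rate[1, 0]) - beta
    return value, (alpha, theta, beta, gamma)

def estimate(sample, grid):
    """Smallest grid point maximizing the profile log-likelihood, with the matching xi."""
    best = None
    for tau in grid:
        value, xi = profile_loglik(sample, tau)
        if value is not None and (best is None or value > best[0]):
            best = (value, tau, xi)
    if best is None:
        raise ValueError("no grid point with deaths in every cell")
    return best[1], best[2]

sample = simulate(4000, alpha=1.0, theta=1.0, beta=0.2, gamma=0.5, tau=0.5, zeta=2.0)
grid = [0.1 + 0.01 * k for k in range(91)]
tau_hat, xi_hat = estimate(sample, grid)
print(tau_hat, xi_hat)
```

The strict inequality in `estimate` keeps the smallest maximizer on the grid, which loosely mirrors the infimum in the definition of the change-point estimator. For a general covariate the inner maximization has no closed form and would require a numerical optimizer.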

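For a given $\tau$, we estimate $\alpha$, $\theta$, $\beta$ and $\gamma$ by considering the following score functions, obtained by differentiating $l_n(\varphi)$ with respect to each component of $\xi$:

$$\frac{\partial l_n}{\partial \alpha} = \sum_{i \le n} \Big\{ \frac{N_i(\tau)}{\alpha} - 1_{\{T_i \le \tau\}} e^{\beta^T Z_i} T_i + \frac{N_i(\infty) - N_i(\tau)}{\alpha + \theta} - 1_{\{T_i > \tau\}} \big[ e^{\beta^T Z_i} \tau + e^{(\beta + \gamma)^T Z_i} (T_i - \tau) \big] \Big\},$$

$$\frac{\partial l_n}{\partial \theta} = \sum_{i \le n} \Big\{ \frac{N_i(\infty) - N_i(\tau)}{\alpha + \theta} - 1_{\{T_i > \tau\}} e^{(\beta + \gamma)^T Z_i} (T_i - \tau) \Big\},$$

$$\frac{\partial l_n}{\partial \beta} = \sum_{i \le n} \Big\{ N_i(\infty) Z_i - 1_{\{T_i \le \tau\}} \alpha Z_i e^{\beta^T Z_i} T_i - 1_{\{T_i > \tau\}} \big[ \alpha Z_i e^{\beta^T Z_i} \tau + (\alpha + \theta) Z_i e^{(\beta + \gamma)^T Z_i} (T_i - \tau) \big] \Big\},$$

$$\frac{\partial l_n}{\partial \gamma} = \sum_{i \le n} \Big\{ [N_i(\infty) - N_i(\tau)] Z_i - 1_{\{T_i > \tau\}} (\alpha + \theta) Z_i e^{(\beta + \gamma)^T Z_i} (T_i - \tau) \Big\}.$$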
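As a sanity check on the score with respect to $\gamma$, the following sketch compares it with a central finite difference of the log-likelihood for a scalar covariate. The small data set and the parameter values are made up for illustration.

```python
import math

def loglik(sample, tau, alpha, theta, beta, gamma):
    """Log-likelihood of model (3) for triplets (T_i, Delta_i, Z_i) with scalar Z_i."""
    total = 0.0
    for t, delta, z in sample:
        if t <= tau:
            total += delta * (math.log(alpha) + beta * z) - alpha * math.exp(beta * z) * t
        else:
            total += delta * (math.log(alpha + theta) + (beta + gamma) * z)
            total -= alpha * math.exp(beta * z) * tau
            total -= (alpha + theta) * math.exp((beta + gamma) * z) * (t - tau)
    return total

def score_gamma(sample, tau, alpha, theta, beta, gamma):
    """Derivative of the log-likelihood with respect to gamma; only times after tau contribute."""
    total = 0.0
    for t, delta, z in sample:
        if t > tau:
            total += delta * z - (alpha + theta) * z * math.exp((beta + gamma) * z) * (t - tau)
    return total

# Made-up observations (T_i, Delta_i, Z_i) and parameter values.
data = [(0.3, 1, 0.0), (0.8, 1, 1.0), (1.2, 0, 1.0), (1.5, 1, 0.5), (2.0, 0, 0.0), (0.6, 0, 1.0)]
tau, alpha, theta, beta, gamma = 0.7, 0.8, 0.4, 0.1, -0.3

h = 1e-6
numeric = (loglik(data, tau, alpha, theta, beta, gamma + h)
           - loglik(data, tau, alpha, theta, beta, gamma - h)) / (2 * h)
analytic = score_gamma(data, tau, alpha, theta, beta, gamma)
print(numeric, analytic)
```

The same comparison can be repeated for the other components of $\xi$ by perturbing the corresponding argument of `loglik`.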

### 3 Convergence of the estimators

Our main result is

Theorem 3. The estimators $\hat{\tau}_n$ and $\hat{\xi}_n$ converge in probability to $\tau_0$ and $\xi_0$.

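Proof. The proof of consistency is based on the uniform convergence of $X_n(\varphi) = n^{-1}(l_n(\varphi) - l_n(\varphi_0))$ to a function having a unique maximum at $\varphi_0$. Two lemmas will be needed; their proofs are given in the appendix. After some rearranging, the process $X_n = n^{-1}(l_n - l_n(\varphi_0))$ can be written as a sum $X_n = X_{1,n} + X_{2,n} + X_{3,n} + X_{4,n}$ according to the sign of $\tau - \tau_0$, with

$$X_{1,n}(\varphi) = n^{-1} \sum_{i \le n} \Delta_i 1_{\{T_i \le \tau \wedge \tau_0\}} \Big\{ \ln\frac{\alpha}{\alpha_0} + (\beta - \beta_0)^T Z_i \Big\},$$

$$X_{2,n}(\varphi) = n^{-1} \sum_{i \le n} \Delta_i 1_{\{T_i > \tau \vee \tau_0\}} \Big\{ \ln\frac{\alpha + \theta}{\alpha_0 + \theta_0} + (\beta + \gamma - \beta_0 - \gamma_0)^T Z_i \Big\},$$

$$X_{3,n}(\varphi) = n^{-1} \sum_{i \le n} \Delta_i 1_{\{\tau < T_i \le \tau_0\}} \Big\{ \ln\frac{\alpha + \theta}{\alpha_0} + (\beta + \gamma - \beta_0)^T Z_i \Big\} + n^{-1} \sum_{i \le n} \Delta_i 1_{\{\tau_0 < T_i \le \tau\}} \Big\{ \ln\frac{\alpha}{\alpha_0 + \theta_0} + (\beta - \beta_0 - \gamma_0)^T Z_i \Big\},$$

$$X_{4,n}(\varphi) = n^{-1} \sum_{i \le n} \Big\{ 1_{\{T_i \le \tau_0\}} \alpha_0 e^{\beta_0^T Z_i} T_i - 1_{\{T_i \le \tau\}} \alpha e^{\beta^T Z_i} T_i + 1_{\{T_i > \tau_0\}} \big[ (\alpha_0 + \theta_0) e^{(\beta_0 + \gamma_0)^T Z_i} (T_i - \tau_0) + \alpha_0 e^{\beta_0^T Z_i} \tau_0 \big] - 1_{\{T_i > \tau\}} \big[ (\alpha + \theta) e^{(\beta + \gamma)^T Z_i} (T_i - \tau) + \alpha e^{\beta^T Z_i} \tau \big] \Big\}.$$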

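Let $X_\infty$ be the function defined as $X_\infty(\varphi) = X_{1,\infty}(\varphi) + X_{2,\infty}(\varphi) + X_{3,\infty}(\varphi) + X_{4,\infty}(\varphi)$, where $X_{k,\infty}(\varphi)$ ($k = 1, \ldots, 4$) is the expectation under $P_0$ of the generic summand of $X_{k,n}(\varphi)$. For instance,

$$X_{4,\infty}(\varphi) = E_0\Big[ 1_{\{T \le \tau_0\}} \alpha_0 e^{\beta_0^T Z} T - 1_{\{T \le \tau\}} \alpha e^{\beta^T Z} T + 1_{\{T > \tau_0\}} \big[ (\alpha_0 + \theta_0) e^{(\beta_0 + \gamma_0)^T Z} (T - \tau_0) + \alpha_0 e^{\beta_0^T Z} \tau_0 \big] - 1_{\{T > \tau\}} \big[ (\alpha + \theta) e^{(\beta + \gamma)^T Z} (T - \tau) + \alpha e^{\beta^T Z} \tau \big] \Big].$$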

The first lemma asserts uniform convergence of $X_n$ to $X_\infty$.

Lemma 1. $\sup_{\varphi \in \Phi} |X_n(\varphi) - X_\infty(\varphi)|$ converges in probability to 0 as $n \to \infty$.

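Uniqueness of $\varphi_0$ as a maximizer of $X_\infty$ comes from the following lemma, which asserts that the model is identifiable.

Lemma 2. If $p_\varphi = p_{\varphi_0}$ almost everywhere, then $\varphi = \varphi_0$.

Note that $X_\infty$ is minus the Kullback-Leibler divergence of $p_\varphi$ and $p_{\varphi_0}$.

Since a Kullback-Leibler divergence is nonnegative, $X_\infty \le 0$; as $X_\infty(\varphi_0) = 0$, $\varphi_0$ is a point of maximum of $X_\infty$. Moreover, it is the unique point of maximum since $\varphi_0$ is identifiable (Lemma 2). As $X_n$ converges uniformly to $X_\infty$ (Lemma 1), it follows that $\hat{\varphi}_n$ converges in probability to $\varphi_0$.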
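The role of the Kullback-Leibler divergence in this argument can be illustrated numerically. The sketch below simulates data under an assumed true parameter and evaluates $X_n$ at the true parameter and at a few alternatives. The binary covariate, the censoring mechanism, the parameter values and all names are assumptions for illustration. Only the failure-time part of the log-density is coded, since the censoring terms cancel in the difference.

```python
import math
import random

def log_density(t, delta, z, alpha, theta, beta, gamma, tau):
    """Logarithm of the density of model (3) at (t, delta, z), scalar covariate."""
    if t <= tau:
        return delta * (math.log(alpha) + beta * z) - alpha * math.exp(beta * z) * t
    return (delta * (math.log(alpha + theta) + (beta + gamma) * z)
            - alpha * math.exp(beta * z) * tau
            - (alpha + theta) * math.exp((beta + gamma) * z) * (t - tau))

def simulate(n, phi0, zeta=2.0, seed=2):
    """Assumed data generator under phi0 with binary Z and censoring at min(Exp(0.2), zeta)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = float(rng.random() < 0.5)
        r0 = phi0["alpha"] * math.exp(phi0["beta"] * z)
        r1 = (phi0["alpha"] + phi0["theta"]) * math.exp((phi0["beta"] + phi0["gamma"]) * z)
        e = rng.expovariate(1.0)
        t0 = e / r0 if e <= r0 * phi0["tau"] else phi0["tau"] + (e - r0 * phi0["tau"]) / r1
        c = min(rng.expovariate(0.2), zeta)
        sample.append((min(t0, c), int(t0 <= c), z))
    return sample

def x_n(sample, phi, phi0):
    """Normalized log-likelihood ratio X_n(phi) = (l_n(phi) - l_n(phi0)) / n."""
    total = 0.0
    for t, delta, z in sample:
        total += log_density(t, delta, z, **phi) - log_density(t, delta, z, **phi0)
    return total / len(sample)

phi0 = dict(alpha=1.0, theta=1.0, beta=0.2, gamma=0.5, tau=0.5)
alternatives = [dict(phi0, tau=1.0), dict(phi0, theta=0.2), dict(phi0, gamma=-0.5)]
sample = simulate(5000, phi0)
print(x_n(sample, phi0, phi0))
for phi in alternatives:
    print(phi, x_n(sample, phi, phi0))
```

With a large sample, $X_n$ at each alternative is expected to be close to minus the corresponding divergence and hence negative, while it vanishes at the true parameter by construction.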

Remark. The hazard function (3) specifies a multiplicative hazard model. An alternative formulation for the association between covariates and time-to-event is the additive hazard model, where $\lambda(t \mid Z) = \alpha + \beta^T Z$ (a semiparametric form may be specified by letting $\alpha$ be an unknown function of time). The results obtained for model (3) can be shown to hold for the additive change-point hazard regression model

$$\lambda(t \mid Z) = \alpha + \theta 1_{\{t > \tau\}} + (\beta + \gamma 1_{\{t > \tau\}})^T Z.$$

Proofs proceed along the same lines as described above.

### Appendix

Proof of Lemma 1. Writing $X_n(\varphi)$ as $X_n(\varphi) = n^{-1} \sum_{i \le n} f_\varphi(X_i)$, the uniform convergence stated by this lemma is equivalent to the class of functions $\mathcal{F} = \{f_\varphi : \varphi \in \Phi\}$ being Glivenko-Cantelli (we refer the reader to [Van98] and [VW96] for the definition of Glivenko-Cantelli and Donsker classes, and for many useful results on these classes).

Since every Donsker class is also Glivenko-Cantelli, we show that $\mathcal{F}$ is Donsker by using results from empirical process theory [VW96]. To demonstrate how it works, we shall show that

$\{g_\varphi(T, \Delta, Z) = 1_{\{T > \tau\}} (\alpha + \theta) e^{(\beta + \gamma)^T Z} T : \varphi \in \Phi\}$ is Donsker. The set of all indicator functions $1_{(t, \infty)}$ is Donsker. From Theorem 2.10.1 of [VW96], $\{1_{(\tau, \infty)} : \tau \in \,]0, \zeta[\}$ is Donsker. The function $h : (T, \Delta, Z) \mapsto T$ is bounded, which implies that $\{1_{(\tau, \infty)} T : \tau \in \,]0, \zeta[\}$ is Donsker (see Example 2.10.10 of [VW96]). The class $\{\alpha + \theta : \alpha \in A, \theta \in B\}$ is Donsker. By multiplying two Donsker classes, we get that $\{1_{(\tau, \infty)} (\alpha + \theta) T : \tau \in \,]0, \zeta[, \alpha \in A, \theta \in B\}$ is Donsker. Similarly, boundedness of $Z$ implies that $\{(\beta + \gamma)^T Z : \beta \in C, \gamma \in D\}$ is Donsker. The exponential function is Lipschitz on compact sets of the real line, so we get from [VW96] (Theorem 2.10.6) that the class $\{e^{(\beta + \gamma)^T Z} : \beta \in C, \gamma \in D\}$ is Donsker. Again, by multiplication of two Donsker classes, we get that $\{g_\varphi(T, \Delta, Z) = 1_{\{T > \tau\}} (\alpha + \theta) e^{(\beta + \gamma)^T Z} T : \varphi \in \Phi\}$ is Donsker. Using similar arguments and the fact that the sum of two Donsker classes is Donsker, we finally get that $\mathcal{F}$ is Donsker, and hence Glivenko-Cantelli.

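Proof of Lemma 2. Considering the equality of densities $p_\varphi(x) = p_{\varphi_0}(x)$ on $\delta = 0$, we see that

$$\exp(-\alpha e^{\beta^T z} t) 1_{\{t \le \tau\}} + \exp\left(-\alpha e^{\beta^T z} \tau - (\alpha + \theta) e^{(\beta + \gamma)^T z} (t - \tau)\right) 1_{\{t > \tau\}} = \exp(-\alpha_0 e^{\beta_0^T z} t) 1_{\{t \le \tau_0\}} + \exp\left(-\alpha_0 e^{\beta_0^T z} \tau_0 - (\alpha_0 + \theta_0) e^{(\beta_0 + \gamma_0)^T z} (t - \tau_0)\right) 1_{\{t > \tau_0\}} \qquad (6)$$

for almost all $(t, z)$. Suppose that $\tau \neq \tau_0$ (we suppose that $\tau < \tau_0$; the symmetric case $\tau > \tau_0$ can be treated similarly).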

Let $\Omega_z$ be the set of $t \in (\tau, \tau_0]$ such that $(t, z)$ does not satisfy (6). Then, for almost all $z$ (i.e. outside an exceptional zero-measure set $\mathcal{Z}$), $P_0(\Omega_z) = 0$. Given that $z_1 \neq z_2$ and that neither is in $\mathcal{Z}$, $(t, z_1)$ and $(t, z_2)$ satisfy the above relation for almost all $t \in (\tau, \tau_0]$. In particular, let $t_1 \neq t_2$ be two such values.

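Evaluated at $(t_1, z_1)$, (6) becomes $\exp(-\alpha e^{\beta^T z_1} \tau - (\alpha + \theta) e^{(\beta + \gamma)^T z_1} (t_1 - \tau)) = \exp(-\alpha_0 e^{\beta_0^T z_1} t_1)$, which is equivalent to $\alpha e^{\beta^T z_1} \tau + (\alpha + \theta) e^{(\beta + \gamma)^T z_1} (t_1 - \tau) = \alpha_0 e^{\beta_0^T z_1} t_1$. Similarly, for $(t_2, z_1)$ we obtain $\alpha e^{\beta^T z_1} \tau + (\alpha + \theta) e^{(\beta + \gamma)^T z_1} (t_2 - \tau) = \alpha_0 e^{\beta_0^T z_1} t_2$. By subtracting these last two equalities, we obtain $(\alpha + \theta) e^{(\beta + \gamma)^T z_1} (t_2 - t_1) = \alpha_0 e^{\beta_0^T z_1} (t_2 - t_1)$, which implies

$$(\alpha + \theta) e^{(\beta + \gamma)^T z_1} = \alpha_0 e^{\beta_0^T z_1} \qquad (7)$$

since $t_1 \neq t_2$. The same reasoning for the couples $(t_1, z_2)$, $(t_2, z_2)$ yields

$$(\alpha + \theta) e^{(\beta + \gamma)^T z_2} = \alpha_0 e^{\beta_0^T z_2}. \qquad (8)$$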

Since $\alpha + \theta \neq 0$ and $\alpha_0 \neq 0$, we can calculate the ratio (8)/(7): $e^{(\beta + \gamma)^T (z_2 - z_1)} = e^{\beta_0^T (z_2 - z_1)}$, which implies $(\beta + \gamma - \beta_0)^T (z_2 - z_1) = 0$. Note that the assumption $\mathrm{var}(Z) > 0$ is necessary to achieve identifiability. If $\mathrm{var}(Z) = 0$, then for any $q$-dimensional vector $a \neq 0$, $\mathrm{var}(a^T Z) = 0$. Therefore $a^T (z_1 - z_2) = 0$ does not imply that $a = 0$. Consider all $q$-vectors that are orthogonal to $z_1 - z_2$, and select a pair $(z_3, z_4)$ such that neither is in the zero-measure set $\mathcal{Z}$. Then we get $(\beta + \gamma - \beta_0)^T (z_4 - z_3) = 0$. In this way, we can select $q$ pairs of $z$ such that none of the pairs is in $\mathcal{Z}$, and the differences of the pairs are linearly independent. Thus $\beta + \gamma - \beta_0 = 0$ and finally $\beta + \gamma = \beta_0$. It follows from (8) that $\alpha + \theta = \alpha_0$.

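Now, from (6), $\alpha e^{\beta^T z_1} \tau + \alpha_0 e^{\beta_0^T z_1} (t_1 - \tau) = \alpha_0 e^{\beta_0^T z_1} t_1$, which implies $\alpha e^{\beta^T z_1} \tau = \alpha_0 e^{\beta_0^T z_1} \tau$. Similarly, for the couple $(t_1, z_2)$, $\alpha e^{\beta^T z_2} \tau = \alpha_0 e^{\beta_0^T z_2} \tau$. By taking the ratio of these two equalities, we obtain $e^{\beta^T (z_1 - z_2)} = e^{\beta_0^T (z_1 - z_2)}$, which implies $(\beta - \beta_0)^T (z_1 - z_2) = 0$. By the same reasoning as above, $\beta = \beta_0$, and hence $\gamma = 0$. This is a contradiction since $\gamma \in D \subset \mathbb{R}^q \setminus \{0\}$, hence $\tau \ge \tau_0$. Similarly, we can show that $\tau \le \tau_0$. Hence if (6) holds, then $\tau = \tau_0$.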

Using similar arguments, it is now easy to complete the proof and to show that $\xi = \xi_0$.

### References

[AG82] Andersen, P.K., Gill, R.D.: Cox's regression model for counting processes: a large sample study. Ann. Statist., 10, 1100-1120 (1982).

[CCH94] Chang, I.-S., Chen, C.-H., Hsiung, C.A.: Estimation in change-point hazard rate models with random censorship. In: Carlstein, E., Müller, H.-G., Siegmund, D. (eds.) Change-point Problems. IMS Lecture Notes-Monograph Ser. 23 (1994).

[CK94] Cohen, A., Kushary, D.: Adaptive and unbiased predictors in a change point regression model. Statist. Prob. Letters, 20, 131-138 (1994).

[CH97] Csörgö, M., Horvath, L.: Limit Theorems in Change-Point Analysis. Wiley, New York (1997).

[GV05] Gurevich, G., Vexler, A.: Change point problems in the model of logistic regression. J. Statist. Plann. Inf., in press (2005).

[Hor95] Horvath, L.: Detecting changes in linear regressions. Statistics, 26, 189-208 (1995).

[HHS97] Horvath, L., Huskova, M., Serbinowska, M.: Estimators for the time of change in linear models. Statistics, 29, 109-130 (1997).

[Jar03] Jaruskova, D.: Asymptotic distribution of a statistic testing a change in simple linear regression with equidistant design. Statist. Prob. Letters, 64, 89-95 (2003).

[KQS03] Koul, H.L., Qian, L., Surgailis, D.: Asymptotics of M-estimators in two-phase linear regression models. Stoch. Proc. Appl., 103, 123-153 (2003).

[LTC97] Luo, X., Turnbull, B.W., Clark, L.C.: Likelihood ratio tests for a changepoint with survival data. Biometrika, 84, 555-565 (1997).

[MFP85] Matthews, D.E., Farewell, V.T., Pyke, R.: Asymptotic score-statistic processes and tests for constant hazard against a change-point alternative. Ann. Statist., 13, 583-591 (1985).

[MW94] Müller, H.G., Wang, J.-L.: Change-point models for hazard functions. In: Carlstein, E., Müller, H.-G., Siegmund, D. (eds.) Change-point Problems. IMS Lecture Notes-Monograph Ser. 23 (1994).

[NRW84] Nguyen, H.T., Rogers, G.S., Walker, E.A.: Estimation in change-point hazard rate models. Biometrika, 71, 299-304 (1984).

[PN90] Pham, T.D., Nguyen, H.T.: Strong consistency of the maximum likelihood estimators in the change-point hazard rate model. Statistics, 21, 203-216 (1990).

[Pon02] Pons, O.: Estimation in a Cox regression model with a change-point at an unknown time. Statistics, 36, 101-124 (2002).

[Pon03] Pons, O.: Estimation in a Cox regression model with a change-point according to a threshold in a covariate. Ann. Statist., 31, 442-463 (2003).

[Van98] Van Der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, New York (1998).

[VW96] Van Der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, New York (1996).

[WZW03] Wu, C.Q., Zhao, L.C., Wu, Y.H.: Estimation in change-point hazard function models. Statist. Prob. Letters, 63, 41-48 (2003).