Inference for a general semi-Markov model and a sub-model for independent competing risks

Catherine Huber-Carol1, Odile Pons2, and Natacha Heutte3

1 University Paris 5, 45 rue des Saints-Pères, 75270 Paris Cedex 06, France and U 472 INSERM, 16bis avenue P-V Couturier, 94 800, Villejuif, France [email protected]

2 INRA Applied Mathematics and Informatics, 78352 Jouy-en-Josas Cedex, France [email protected]

3 IUT de Caen, Antenne de Lisieux, Statistique et Traitement Informatique des Données. 11, boulevard Jules Ferry 14100 Lisieux, France [email protected]

The motivation for this paper is the analysis of a cohort of patients in which not only the survival time of the patients but also a finite number of life states are under study. The process is assumed to be semi-Markov in order to weaken the frequently used, and often too restrictive, Markov assumption. The behavior of such a process is defined through the initial probabilities on the set of possible states and the transition functions, defined as the probabilities, starting from any specified state, of reaching another state within a certain amount of time. To define this behavior, the set of transition functions may be replaced by two sets. The first is the set of direct transition probabilities $p_{jj'}$ from any state $j$ to any other state $j'$. The second is the set of sojourn time distributions $F_{|jj'}$, as functions of the current state $j$ and the state $j'$ reached from there at the end of the sojourn (section 2).

The most usual model in this framework is the so-called competing risk model. This model may be viewed as one where, starting in a specific state $j$, all states that may be reached directly from $j$ are in competition: the state $j'$ with the smallest random time $W_{jj'}$ to reach it from $j$ will be the one reached. It is well known that the joint distribution and the marginal distributions of the latent sojourn times $W_{jj'}$ are not identifiable in a general competing risk model [TSI75]. In a semi-Markov model as well as in a competing risk model, only the sub-distribution functions $F_{j'|j} = p_{jj'} F_{|jj'}$ are identifiable, and it is always possible to define an independent competing risk (ICR) model by assuming that the variables $W_{jj'}$, $j' = 1, \ldots, m$, are independent with distributions $F_{|jj'} = F_{j'|j} / F_{j'|j}(\infty)$. Without an assumption about their dependence, their joint distribution is not identifiable, and a test of an ICR model against the alternative of a general competing risk model is not possible. Similarly, any general semi-Markov model can be represented as a competing risk model with possibly dependent $W_{jj'}$, but this representation is not uniquely defined. When the random variables $W_{jj'}$, $j' \in J(j)$, are assumed to be independent, the semi-Markov model simplifies: the transition probabilities can be deduced from the laws of the sojourn times $W_{jj'}$ (section 3). As the term "competing risk" is also used when the $W_{jj'}$ are dependent, we emphasize the independence we always assume in a competing risk model by calling it the ICR model (Independent Competing Risk model).

For a general right-censored semi-Markov process, Lagakos, Sommer and Zelen [LSZ78] proposed a maximum likelihood estimator for the direct transition probabilities and the distribution functions of the sojourn times, under the assumption of a discrete function with a finite number of jumps. In non-parametric models for censored counting processes, Gill [GILL80] and Voelkel and Crowley [VC84] considered estimators of the sub-distribution functions $F_{j'|j} = p_{jj'} F_{|jj'}$ and studied their asymptotic behavior. Here, we consider maximum likelihood estimation for the general semi-parametric model defined by the probabilities $p_{jj'}$ and the hazard functions related to the distribution functions $F_{|jj'}$ (section 4). If the mean number of transitions by an individual tends to infinity, then the maximum likelihood estimators are asymptotically equivalent to those of the uncensored case. In section 5, we present new estimators defined for the case of a right-censored process with a bounded number of transitions [PONS04]. The difficulty comes from the fact that we do not observe the next state after a right-censored duration in a state.

Under the ICR assumption, specific estimators of the distribution functions $F_{|jj'}$ and of the direct transition probabilities $p_{jj'}$ are deduced from Gill's estimator of the transition functions $F_{j'|j}$. A comparison of those estimators to the estimators for a general semi-Markov process leads to tests for an ICR model against the semi-Markov alternative (section 6).

For each individual $i$, $i = 1, \ldots, n$, we observe, during a period of time $t_i$, the successive states $J(i) = (J_0(i), J_1(i), \ldots, J_{K(i)}(i))$, where $J_0(i)$ is the initial state and $J_{K(i)}(i)$ the final state after $K(i)$ transitions. The total number of possible states is assumed to be finite and equal to $m$. The successive observed sojourn times are denoted $X(i) = (X_1(i), X_2(i), \ldots, X_{K(i)}(i))$, where $X_k(i)$ is the sojourn time that $i$ spent in state $J_{k-1}(i)$ after $(k - 1)$ transitions, and the cumulative sojourn times are $T_k(i) = \sum_{\ell=1}^{k} X_\ell(i)$.
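As a small illustration of this notation, the sketch below stores one hypothetical subject as a list of visited states and a list of sojourn times, then computes the cumulative sojourn times and the part of the observation period left after the last transition. The data values and the function names are assumptions made for the example, not taken from the paper.

```python
from itertools import accumulate


def cumulative_sojourn_times(sojourns):
    """Return T_1, ..., T_K, the running sums of the sojourn times X_1, ..., X_K."""
    return list(accumulate(sojourns))


def remaining_observation_time(sojourns, t_obs):
    """Observation time left after the last observed transition, t_i - T_K."""
    total = sum(sojourns)
    if total > t_obs:
        raise ValueError("sojourn times exceed the observation period")
    return t_obs - total


# Hypothetical subject: visits states 1 -> 3 -> 2 and is observed for 10 time units.
states = [1, 3, 2]       # J_0, J_1, J_2, hence K = 2 transitions
sojourns = [2.5, 4.0]    # X_1 (spent in state 1) and X_2 (spent in state 3)
t_obs = 10.0

T = cumulative_sojourn_times(sojourns)
rest = remaining_observation_time(sojourns, t_obs)
```

Here `rest` is the time spent in the last visited state before observation stops, which is the quantity that is generally right censored.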

One must notice that, if subject $i$ changes state $K(i)$ times, the sojourn time spent by $i$ in the last state $J_{K(i)}(i)$ is generally right censored by $t_i - T_{K(i)}(i)$, where $t_i$ is the total period of observation for subject $i$. We simplify the rather heavy notation for this last duration and for the last state $J_{K(i)}(i)$ as

$$X^*(i) = t_i - T_{K(i)}(i), \qquad J^*(i) = J_{K(i)}(i).$$

The subjects are assumed independent and the probability distribution of the sojourn times absolutely continuous. The two models we propose for the process describing the states of the patient are renewal semi-Markov processes. Their behavior is defined through the following quantities:

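1. The initial probabilities on the set of states:

$$p_j = P(J_0 = j), \quad j \in \{1, 2, \ldots, m\}. \qquad (1)$$

2. The transition functions $F_{j'|j}(t)$:

$$F_{j'|j}(t) = P(J_k = j', X_k \le t \mid J_{k-1} = j), \quad j, j' \in \{1, 2, \ldots, m\}. \qquad (2)$$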

Equivalent to the set of transition functions $F_{j'|j}$ is the set of transition probabilities $p = \{p_{jj'},\ j, j' \in \{1, 2, \ldots, m\}\}$, together with the set of distribution functions $F_{|jj'}$ of the sojourn times in each state conditional on the final state, as defined below.

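1. The direct transition probabilities from a state $j$ to another state $j'$:

$$p_{jj'} = P(J_{k+1} = j' \mid J_k = j). \qquad (3)$$

2. The law of the sojourn time between two states $j$ and $j'$, defined by its distribution function:

$$F_{|jj'}(t) = P(X_k \le t \mid J_{k-1} = j, J_k = j'), \qquad (4)$$

where

$$\sum_{j'=1}^{m} p_{jj'} = 1, \quad p_{jj'} \ge 0, \quad j, j' \in \{1, 2, \ldots, m\}. \qquad (5)$$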
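The equivalence between the two parametrizations can be checked numerically. The sketch below is only illustrative: it assumes exponential sojourn laws and made-up values for one row of direct transition probabilities (the names `P_ROW`, `RATES`, `transition_function` and `recover` are hypothetical). It builds $F_{j'|j} = p_{jj'} F_{|jj'}$ and then recovers $p_{jj'}$ as the value of $F_{j'|j}$ at a very large horizon and $F_{|jj'}$ as the ratio of the two.

```python
import math

# Hypothetical values for one starting state j = 1 (not from the paper).
P_ROW = {2: 0.3, 3: 0.7}   # direct transition probabilities p_{1j'}
RATES = {2: 1.5, 3: 0.4}   # assumed exponential rates of the laws F_{|1j'}


def sojourn_cdf(rate, t):
    """F_{|jj'}(t) for an exponential sojourn time (an assumption of this sketch)."""
    return 1.0 - math.exp(-rate * t)


def transition_function(jp, t):
    """Sub-distribution function F_{j'|j}(t) = p_{jj'} * F_{|jj'}(t)."""
    return P_ROW[jp] * sojourn_cdf(RATES[jp], t)


def recover(jp, t, horizon=1e3):
    """Recover (p_{jj'}, F_{|jj'}(t)) from the transition function alone."""
    p = transition_function(jp, horizon)   # numerical stand-in for F_{j'|j}(infinity)
    return p, transition_function(jp, t) / p


p12, f12 = recover(2, 0.8)
```

The pair `(p12, f12)` agrees with the values used to build the transition function, which is the sense in which the two descriptions carry the same information.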

We notice that the distribution functions $F_{|jj'}$ conditional on the states $(j, j')$ do not depend on the value of $k$, the rank of the state reached by the patient along the process, which is a characteristic of a renewal process. We can define the hazard rate conditional on the present state and the next one,

$$\lambda_{|jj'}(t) = \lim_{dt \to 0} \frac{P(t \le X_k \le t + dt \mid X_k \ge t, J_{k-1} = j, J_k = j')}{dt}, \qquad (6)$$

as well as the cumulative conditional hazard

$$\Lambda_{|jj'}(t) = \int_0^t \lambda_{|jj'}(u)\,du. \qquad (7)$$

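Let $W_j$ be a sojourn time in state $j$ when no censoring is involved, $F_j$ its distribution function, and $\bar F_j = 1 - F_j$ its survival function, such that

$$\bar F_j(t) = \sum_{j'=1}^{m} p_{jj'}\,\bar F_{|jj'}(t), \qquad \bar F_{|jj'} = 1 - F_{|jj'}. \qquad (8)$$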

The potential sojourn time in state $j$ may be right censored by a random variable $C_j$ having distribution function $G_j$, density $g_j$ and survival function $\bar G_j$. The observed sojourn time in state $j$ is $W_j \wedge C_j$.

As a general notation, $\bar F$ denotes the survival function corresponding to a distribution function $F$, so that, for example, $\bar F_{|jj'} = 1 - F_{|jj'}$ and similarly, for the transition functions, $\bar F_{j'|j} = p_{jj'} - F_{j'|j}$.
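To make the link between a hazard rate, its cumulative hazard and the corresponding survival function concrete, here is a minimal numerical sketch. It assumes a hypothetical hazard $\lambda(t) = 2t$ (so that the cumulative hazard is $t^2$) and uses the standard relation that the survival function of an absolutely continuous law equals the exponential of minus the cumulative hazard. The function names are illustrative.

```python
import math


def cumulative_hazard(hazard, t, steps=10_000):
    """Trapezoidal approximation of the integral of `hazard` over [0, t]."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for s in range(1, steps):
        total += hazard(s * h)
    return total * h


def survival(hazard, t):
    """Survival function exp(-Lambda(t)) associated with the hazard."""
    return math.exp(-cumulative_hazard(hazard, t))


def linear_hazard(t):
    """Assumed hazard lambda(t) = 2 t, chosen only for the example."""
    return 2.0 * t


Lambda_15 = cumulative_hazard(linear_hazard, 1.5)
S_15 = survival(linear_hazard, 1.5)
```

Because the trapezoidal rule is exact for a linear integrand, `Lambda_15` matches $1.5^2 = 2.25$ up to rounding error.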

We assume now that, starting from a state $j$, the potential sojourn times $W_{jj'}$ until reaching each of the states $j'$ directly reachable from $j$ are independent random variables having distribution functions defined through (4). The final state is the one for which the duration is the smallest. One can thus say that all other durations are right censored by this one. Without loss of generality, we assume that the subject is experiencing the $k$th transition. The competing risks model is defined by

$$J_k = j' \ \text{ such that } \ W_{jj'} < W_{jj''}, \quad j'' \ne j', \qquad (9)$$

where $W_{jj'}$ has the distribution function $F_{|jj'}$.

In this simple case, independence, both of the subjects and of the potential sojourn times in a given state, allows us to write down the likelihood as a product of factors dealing separately with the time elapsed between two specific states $(j, j')$. For the Independent Competing Risk model, one derives from (6), (8) and (9) that

A consequence is that the direct transition probabilities $p_{jj'}$ defined in (3) may be derived from the probability distributions defined in (4):

$$p_{jj'} = P(J_{k+1} = j' \mid J_k = j) = \int_0^{\infty} \lambda_{|jj'}(u)\,e^{-\sum_{j''} \Lambda_{|jj''}(u)}\,du. \qquad (11)$$
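Formula (11) can be checked on a case where the answer is known in closed form. The sketch below assumes constant hazards (exponential latent times), for which the integral reduces to the ratio of the hazard of $j'$ to the sum of all hazards out of $j$. The rates, the truncation point `upper` and the function name are assumptions of the example.

```python
import math


def direct_transition_probs(rates, upper=50.0, steps=50_000):
    """Approximate p_{jj'} = int_0^inf lambda_{jj'} exp(-sum_{j''} Lambda_{jj''}(u)) du
    for constant hazards, with the trapezoidal rule on [0, upper]."""
    total_rate = sum(rates.values())
    h = upper / steps
    probs = {}
    for jp, lam in rates.items():
        def integrand(u, lam=lam):
            # With constant hazards, the sum of cumulative hazards is total_rate * u.
            return lam * math.exp(-total_rate * u)

        acc = 0.5 * (integrand(0.0) + integrand(upper))
        for s in range(1, steps):
            acc += integrand(s * h)
        probs[jp] = acc * h
    return probs


# Hypothetical hazards out of state j = 1 towards states 2 and 3.
probs = direct_transition_probs({2: 0.5, 3: 1.5})
```

With these rates the numerical values are close to 0.5 / 2 and 1.5 / 2, and they sum to one up to the integration error.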

In this special case, the likelihood is fully determined by the initial probabilities $p_j$ and the functions $\lambda_{|jj'}$ defined in (6). The likelihood $L_{rc,n}$ for the independent competing risks is proportional to

$$L_{rc,n} = \prod_{i=1}^{n} p_{J_0(i)} \left[\prod_{k=1}^{K(i)} \lambda_{|J_{k-1}(i),J_k(i)}(X_k(i))\,e^{-\sum_{j'} \Lambda_{|J_{k-1}(i)j'}(X_k(i))}\right] e^{-\sum_{j''} \Lambda_{|J^*(i)j''}(X^*(i))}. \qquad (12)$$
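A minimal sketch of how the logarithm of the likelihood (12) could be evaluated is given below. It assumes constant hazards, so that each cumulative hazard is the rate times the duration, and it stores each subject as a dictionary with hypothetical keys `states`, `sojourns` and `t_obs`. None of these names or values come from the paper.

```python
import math


def icr_log_likelihood(subjects, p_init, rates):
    """Log of the ICR likelihood (12) under constant hazards rates[j][j']."""
    loglik = 0.0
    for s in subjects:
        states, sojourns = s["states"], s["sojourns"]
        loglik += math.log(p_init[states[0]])           # initial state J_0
        for k, x in enumerate(sojourns, start=1):
            j, jp = states[k - 1], states[k]
            # Observed transition j -> j': hazard of j' times survival of all risks.
            loglik += math.log(rates[j][jp]) - sum(rates[j].values()) * x
        # Last sojourn: only survival of all risks up to X* = t_obs - T_K.
        x_star = s["t_obs"] - sum(sojourns)
        loglik -= sum(rates.get(states[-1], {}).values()) * x_star
    return loglik


# Hypothetical data: one subject moving 1 -> 2 after 2 time units, observed for 5.
subjects = [{"states": [1, 2], "sojourns": [2.0], "t_obs": 5.0}]
p_init = {1: 1.0}
rates = {1: {2: 0.5}, 2: {1: 0.2}}
value = icr_log_likelihood(subjects, p_init, rates)
```

A state with no entry in `rates` is treated as absorbing, so its last sojourn contributes nothing to the sum.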

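The likelihood (12) can be decomposed into a product of terms, each of which is relative to an initial state $j$ and a final state $j'$. When gathering the terms in $L_{rc,n}$ that involve the same hazard rate $\lambda_{|jj'}$ or cumulative hazard $\Lambda_{|jj'}$, one observes that the hazard rates appear separately in the likelihood for each pair $(j, j')$:

$$L_{rc,n}(j,j') = \prod_{i=1}^{n} \left\{\prod_{k=1}^{K(i)} \left[\lambda_{|jj'}(X_k(i))\,e^{-\Lambda_{|jj'}(X_k(i))}\right]^{1\{J_{k-1}(i)=j,\,J_k(i)=j'\}} \left[e^{-\Lambda_{|jj'}(X_k(i))}\right]^{1\{J_{k-1}(i)=j,\,J_k(i)\ne j'\}}\right\} \left[e^{-\Lambda_{|jj'}(X^*(i))}\right]^{1\{J^*(i)=j\}}.$$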

This problem may be treated as m parallel and independent problems of right censored survival analysis. The only link between them is the derivation of the direct transition probabilities using (11).
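One way to picture each of these parallel problems is to estimate the cumulative hazard of a given pair $(j, j')$ from all sojourns in state $j$, counting a sojourn as an event when it ends with a transition to $j'$ and as censored otherwise. The sketch below uses the classical Nelson-Aalen estimator for this purpose. This is a standard choice for right-censored data, swapped in here for illustration, and not necessarily the estimator studied in the paper. The sample values are made up.

```python
def nelson_aalen(durations, events):
    """Nelson-Aalen estimate of the cumulative hazard.

    durations: observed sojourn times in state j
    events:    1 if the sojourn ended with a transition to j', 0 if censored
    Returns a list of (time, cumulative hazard) at the distinct event times.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    out, cum, i = [], 0.0, 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:   # group ties at time t
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            cum += deaths / at_risk
            out.append((t, cum))
        at_risk -= leaving
    return out


# Hypothetical sojourns in state j; the second one ended elsewhere (censored).
estimate = nelson_aalen([1.0, 2.0, 2.0, 3.0], [1, 0, 1, 1])
```

Repeating this for every pair $(j, j')$ gives one estimated cumulative hazard per competing transition.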

The patients are assumed to be independent, while the potential times for a given subject are no longer assumed to be independent. We model separately the initial probabilities $p_j$, the direct transition probabilities $p_{jj'}$ and the hazard rates $\lambda_{|jj'}$, defined as in (1), (3) and (6). The direct transition probabilities $p_{jj'}$ can no longer be derived from the hazard rates. They are now free, except for the constraints (5). The distributions of the time elapsed between two successive states $j$ and $j'$ and those of the censoring are assumed to be absolutely continuous. The likelihood $L_n$ is proportional to

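$$\prod_{i=1}^{n} \prod_{j=1}^{m} p_j^{1\{J_0(i)=j\}} \prod_{k=1}^{K(i)} \prod_{j'=1}^{m} \left[p_{jj'}\,\lambda_{|jj'}(X_k(i))\,e^{-\Lambda_{|jj'}(X_k(i))}\right]^{1\{J_{k-1}(i)=j,\,J_k(i)=j'\}} \times \prod_{i=1}^{n} \prod_{j=1}^{m} \left[\sum_{j'=1}^{m} p_{jj'}\,e^{-\Lambda_{|jj'}(X^*(i))}\right]^{(1-\delta_i)\,1\{J^*(i)=j\}},$$

where $\delta_i = 0$ when the last sojourn time $X^*(i)$ of subject $i$ is censored and $\delta_i = 1$ otherwise.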

This likelihood may be written as a product of terms, each of which involves sojourn times exclusively in one specific state $j$: $L_n = \prod_{j=1}^{m} L_n(j)$.

For each subject $i$, and for each $k \in \{1, 2, \ldots, K(i)\}$, we denote by $1 - \delta_k(i)$ the censoring indicator of its sojourn time in the $k$th visited state, $J_{k-1}(i)$, with the convention that $\delta_0(i) = 1$ for every $i$. If $j'$ is an absorbing state and if $J_k(i) = j'$, then $j'$ is the last state observed for subject $i$, $k = K(i)$, and we set $X^*(i) = 0$ and $\delta_{K(i)+1}(i) = 1$.

Another convention is that subject $i$ is censored when the last visited state $J^*(i)$ is not absorbing and the sojourn time $X^*(i)$ in this state is strictly positive, and we denote by $1 - \delta_i$ the censoring indicator. In all other cases, in particular if the last visited state is absorbing or if the sojourn time there is equal to 0, we say that the subject is not censored and we thus have $\delta_i = 1$. We can then write

For each state $j$ of $\{1, 2, \ldots, m\}$, we define the following counts, where $k$ varies, for each subject $i$, between 1 and $K(i)$, $i \in \{1, 2, \ldots, n\}$, and $x \ge 0$:

$$N_{i,k}(x,j,j') = 1\{J_{k-1}(i) = j,\ J_k(i) = j'\}\,1\{X_k(i) \le x\},$$
$$Y_{i,k}(x,j,j') = 1\{J_{k-1}(i) = j,\ J_k(i) = j'\}\,1\{X_k(i) \ge x\},$$
$$N_i^c(x,j) = (1 - \delta_i)\,1\{J^*(i) = j\}\,1\{X^*(i) \le x\},$$
$$Y_i^c(x,j) = (1 - \delta_i)\,1\{J^*(i) = j\}\,1\{X^*(i) \ge x\}.$$

By summation of the counts thus defined on the indices j', i, or k, we get

By taking for $x$ the limiting value $\infty$, we define $N_{i,k}(j,j') = N_{i,k}(\infty,j,j')$, $N_i^c(j) = N_i^c(\infty,j)$, $N(j,j',n) = N(\infty,j,j',n)$ and $N^{nc}(j,n) = N^{nc}(\infty,j,n)$, so that $N(j,j',n)$ is the number of direct transitions from $j$ to $j'$ that are fully observed, and $N(j,n)$ is the number of sojourn times in state $j$, of which $N^{nc}(j,n)$ (nc for not censored) are fully observed and $N^c(j,n)$ (c for censored) are censored. For $x = 0$, we denote $Y_i^c(j) = Y_i^c(0,j)$. The number of individuals initially in state $j$ is $N_0(j,n) = \sum_{i=1}^{n} 1\{J_0(i) = j\}$.
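The aggregated counts described in words above can be tabulated directly from the observed paths. The sketch below is one possible way to do it, assuming each subject is stored as a dictionary with hypothetical keys `states`, `sojourns` and `t_obs`, and that the set of absorbing states is supplied by the user. It follows the stated convention that a subject is censored when the last visited state is not absorbing and the time spent there is strictly positive.

```python
from collections import Counter


def aggregate_counts(subjects, absorbing=frozenset()):
    """Tabulate N_0(j,n), N(j,j',n), N^nc(j,n), N^c(j,n) and N(j,n)."""
    n_init = Counter()    # subjects starting in state j
    n_trans = Counter()   # fully observed direct transitions j -> j'
    n_cens = Counter()    # censored sojourn times in state j
    for s in subjects:
        states, sojourns = s["states"], s["sojourns"]
        n_init[states[0]] += 1
        for j, jp in zip(states, states[1:]):
            n_trans[(j, jp)] += 1
        x_star = s["t_obs"] - sum(sojourns)      # time spent in the last state
        last = states[-1]
        if last not in absorbing and x_star > 0:
            n_cens[last] += 1
    n_nc = Counter()      # fully observed sojourn times in state j
    for (j, _), count in n_trans.items():
        n_nc[j] += count
    n_total = n_nc + n_cens
    return n_init, n_trans, n_nc, n_cens, n_total


# Two hypothetical subjects; state 3 is treated as absorbing.
subjects = [
    {"states": [1, 2, 1], "sojourns": [1.0, 2.0], "t_obs": 4.0},
    {"states": [1, 3], "sojourns": [2.0], "t_obs": 5.0},
]
n_init, n_trans, n_nc, n_cens, n_total = aggregate_counts(subjects, absorbing={3})
```

In this toy data set state 1 has three sojourn times, two fully observed and one censored.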

The true parameter values are denoted $p_j^0$ and $p_{jj'}^0$, and the true functions of the model are $F^0_{j'|j}$, $F^0_{|jj'}$, $F^0_j$, $G^0_j$ and $\Lambda^0_{|jj'}$.

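Let $l_n = \log(L_n)$ and $l_n(j) = \log(L_n(j))$. Up to an additive constant, the log-likelihood relative to state $j$ is

$$l_n(j) = N_0(j,n)\log p_j + \sum_{j'=1}^{m} N(j,j',n)\log(p_{jj'}) + \sum_{i=1}^{n}\sum_{k=1}^{K(i)}\sum_{j'=1}^{m} N_{i,k}(j,j')\left[\log(\lambda_{|jj'}(X_k(i))) - \Lambda_{|jj'}(X_k(i))\right] + \sum_{i=1}^{n} N_i^c(j)\log\left(\sum_{j'=1}^{m} p_{jj'}\,e^{-\Lambda_{|jj'}(X^*(i))}\right). \qquad (15)$$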

Among the four terms whose sum gives (15), let $l_n^0(j)$ be the first term, relative to the initial state, $l_n^{nc}(j)$ (nc for non censored) the sum of the second and third terms, which involve exclusively fully observed sojourn times in state $j$, and finally $l_n^c(j)$ (c for censored) the last term, which deals with censored sojourn times in state $j$.

We denote by $K_n = \max_{i=1,2,\ldots,n} K(i)$ and $n\bar K_n = \sum_{i=1}^{n} K(i)$ respectively the maximum number of transitions and the total number of transitions for the $n$ subjects. We consider two different designs of experiments, according to whether or not observations are stopped after a fixed number $K$ of direct transitions.

If the densities $f_j$ of the sojourn times, without censoring, are strictly positive on $]0, t_0[$ for some $t_0 > 0$ for every state $j$, and if the distribution functions $G_j$ of the censoring times are such that $G_j(t) < 1$ for all $t > 0$, then the maximal number $K_n = \max_i K(i)$ of transitions experienced by a subject tends to infinity as $n$ grows. If moreover the mean number of transitions $\bar K_n$ also goes to infinity, then the term relative to censored times, $l_n^c(j)$, is a sum of terms of order $n$, while the term $l_n^{nc}(j)$ is a sum of terms of order $n\bar K_n$. Therefore

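Proposition 1. If $\bar K_n \to \infty$ and if $N^{nc}(j,n)(n\bar K_n)^{-1}$ converges to a strictly positive number for every $j \in \{1, 2, \ldots, m\}$, then

$$\lim_{n \to \infty} \frac{l_n(j)}{n\bar K_n} = \lim_{n \to \infty} \frac{l_n^{nc}(j)}{n\bar K_n},$$

and the maximum likelihood estimators of $p_{jj'}$, $\Lambda_{|jj'}$ and $F_j$ are asymptotically equivalent to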

Pjj'
