## Some recent results on joint degradation and failure time modeling

Vincent Couallier

Equipe Statistique Mathématique et ses Applications U.F.R. Sciences et Modelisation, Universite Victor Segalen Bordeaux 2 146 rue Leo Saignat 33076 Bordeaux cedex FRANCE [email protected]

Key words: degradation process, hazard rate, hitting time, regression, correlated errors, generalized least squares estimation, Nelson-Aalen estimate, semiparametric estimation

### 1 Introduction

Analyzing survival data is historically and classically based on a sample of n real and non negative random variables (T1, ...,Tn) each measuring a individual time to event. This event can reflect a time of diagnosis, a death, a failure time, a breakdown or every change in the status of an individual but whatever the field of applications is, biostatistics, industrial statistics or econometrics, modelling or making inference on the distribution of the random variable T often requires additional data and precise definition of the sampling process. Some well known fields of research are devoted to failure time regression data, frailty models or competing risk models. Usually, one attempts to conditionally define the reliability characteristics such that the survival function, the hazard rate or the cumulative hazard rate given some random covariate describing either the frailty of the item, or the environmental conditions.

For instance, conditionally on the random vector A, the famous Proportional Hazard rate model by Cox specifies that the hazard rate verifies XT(tIA) = Xo(t) x A where Xo is a baseline hazard function and f3 is a vector of parameters. This model was first defined for constant in time covariate but it is possible to allow a time-varying effect of the environment on the survival of the item. [WT97] study the model XT(tIA) = Xo(t) x e^(A1+A21) where A = (A1A2) is some random but fixed in time vector of coefficients. Also, this unit-to-unit variability in the definition of the hazard rate can be interpreted as an individual frailty, modelled by a stochastic process and reflecting an internal accumulation of wear called aging or degradation process.

In fact an increasing hazard rate usually describes degradation in engineering applications [FIN03]. [BN04] define the conditional hazard rate given the random vector A as A(tIA) = Ao(t) x A(g(t, A)) where g is a given non decreasing function. Is is strongly related to the model of Wulfsohn and Tsiatis but the assumptions made for estimation and inference are completely different. If we consider that g(., A) is the degradation function with random coefficient A, we obtain a model where the degradation is varying unit-to-unit and influences the hazard rate of a so called traumatic failure time.

2 Joint models for degradation and failure time modeling

In this section, we are interested in defining the failure time due to wear or aging. This point has been an interesting field of research for the last twenty years. The most important probabilistic issue being to jointly model the degradation process and the associated failure times. Some recent results use the fact that when longitudinal data of degradation or markers of it can be measured before the failure occurs, it is possible to estimate reliability characteristics of the item. Numerous models exist now and we will present some of the most important ones.

Two main joint models exist. The first one considers a failure time which is directly defined by the degradation process, the second one considers that the degradation process influences the hazard rate. Obviously some joints models include both by considering multiple failure modes.

Let us assume that the degradation of an item is given by a real-valued right continuous and left hand limited stochastic process Z(t), t G I. Some well known reliability characteristics will be defined conditionally on Z but even without any information on Z, we assume that the life time To is in fact the first time of crossing a ultimate threshold zo (which can be random) for

if the degradation process tends to increase with time (the definition can obviously be reversed if Z models an anti-aging process). We shall define as well the failure time T with the conditional survival function given the past degradation process as

P(T > t\Z(s), 0 < s < t)= exp XT(s\Z(u), 0 < s < u)ds^j . (2)

The failure time To is sometimes called soft failure (or failure directly due to wear) because in most of industrial applications, zo is fixed and the experiment is voluntarily ceased at the time the degradation process reaches the level zo. Even if the degradation process is completely unknown, it can be useful and meaningful to analyze the links between the assumed distribution function of To and the definition of the underlying degradation process [AG01]. This issue is strongly related to the theory of hitting time of stochastic processes.

In the following section, we recall some well known results dealing with hitting times such that To. The analysis of the traumatic failure time T is related to accelerated life models and is postponed to Sect. 2.2.

2.1 Failure time as hitting times of stochastic processes stochastic degradation defined as diffusion

The degradation process is often assumed to be defined as the solution of a stochastic diffusion, for instance dX (t) = n(X (t))dt + aX(t)di(t), where 7 is often a Wiener process. The degradation path can also been given directly as the sample path of a given stochastic process. The gaussian process with positive drift

X (t) = a + bt + W (t), can be easily defined in both manners. In these cases, it is possible to link the distribution of the failure time T0 to the diffusion process and most of the parametric choices for survival distributions have a stochastic justification in terms of an unknown degradation process suitably defined. For instance, [SIN95] recalls that if Z0 is fixed then a Wiener process which starts out at a deterministic X0 will have a time to absorption with an inverse gaussian distribution. According to [COX99] this fact has sometimes been used to suggest the Inverse-Gaussian distribution as a good one for parametric analysis of failure data but this simple case has deeper generalizations and we refer to [SIN95] and [AG01] for further details.

Estimation procedures for degradation data based on Wiener processes (eventually measured with errors) have extensively been studied by [WHI95], [DN95], [WS97], [WCL98] and [PT04] among others. The mathematical property of independency of the increments of degradation data is obviously used to derive the likelihood functions needed for the estimation in such parametric models.

Stochastic processes such that general Levy processes, gamma processes or shock processes have also been studied as accumulative degradation processes. All of these processes have non-decreasing sample paths what is meaningful for the accumulation of wear.

A gamma process as degradation process

Following [BN01], the family of Gamma processes with right-continuous trajectories ensures the growth of the paths and the fact that, denoting

Z(t) = a2Y(t) with Y(t) - G(1, v(t)) = G(1, m(t)), a2

where G(l,v(t)) is a Gamma distribution with parameters 1 and v(t), the increment from t to t + At for At > 0 lies in the same family of distributions since Z(t + At) - Z(t) = a2A^(t) and

Then

A useful property for estimation procedures is the independency of increments. Parametric models include functions m(t) = m(t, 9) where 9 is a random coefficient but semiparametric methods (where m is nonparametric) have also been developed in [BN01]. Note that covariates describing some environmental stress are also considered and influence the degradation function as in an accelerated failure time model. [COU04] has studied the numerical properties in both models for constant in time stresses. See Also [LC04] who have recently used these methods to degradation data in a stressed environment with random effects.

### A marked point process as degradation

When the degradation consists in an accumulation of shocks, each of them leading to a random increment of degradation, it is possible to model it by a marked point process & = (Tn, Xn)n>1 where Tn is the time of the n-th shock and Xn is the size of the corresponding increment of degradation. Thus

is the sample path of the degradation and &(t) = Y1 1 I(tn < t) is the counting process associated to the marked point process.

In these models also, a failure time can be defined as first crossing time of a level zo. Due to the piecewise constant shape of the degradation, the failure time T0 is necessarily one of the times of shock. For a fixed level zo, some parametric estimations and large sample results have recently been provided in [WK04] under the multiplicative intensity assumption of Aalen. It assumes that the stochastic intensity function of the point process (Tn)n>1 is the product of a random variable Y and a deterministic function n(t), t e R+). Hence, given Y = y, the point process & is a Poisson process with deterministic intensity function y.r)(.). The process & is called a doubly stochastic Poisson process. It includes for different choices of the distribution function of Y and n the mixed Poisson process and the non homogeneous Poisson process.

In order to describe the distribution function of the soft failure T0, it is necessary to jointly define the intensity function of the point process and the distribution function of the marks. The simpler hypothesis is the independency between Tn and Xn for all n > 1. A dependent position distribution is easily defined if we let Xn = UneSTn where (Un)n>1 is a i.i.d. sequence of random variables. The condition S > 0 allows increasing effects of shocks on the degradation level and S < 0 leads to absorbed shocks. [WK04] and [KW04] discuss estimation procedures for inverse gaussian and gamma-type distribution of Y and different shapes of the function n (see also [BN02] chap.13).

A mixed regression as degradation process : the general path model

The real degradation is defined by the sample path of the process

where g is a deterministic non decreasing continuous function of the time and of the random coefficient 0 G Mp. The distribution function Fq of 0 is unknown. In this case, the hitting time of a given threshold zo is easily found by inverting g in its first parameter. There exists h such that

Thus, the distribution function of the failure time T0 can be written in terms of zo, h and of the distribution function Fq of 0.

For instance, [OC04] study linear degradation g(t,31,32) = ¡31 + [32t with fixed ¡31 and random ¡2 following Weibull(a, y) or log-normal(^, a2) distributions. Hence To follows a reciprocal Weibull or log-normal distribution respectively (see also [ME98]). In the case of noised degradation values, random coefficients are estimated unit-to-unit and a pseudo-failure time To is deduced from the degradation data. Finally classical survival analysis is applied to these pseudo failure times to analyze and estimate the real distribution function of the failure time To.

The observed degradation of the i-th item is often a partially and erroneous version of the real degradation. It is defined by the the vector of the mi measurements of the real degradation at given times ti1 < .. < tim.

where the ei are zero mean random variables. [BN04] have also assumed multiplicative errors in (4). In this case, we assume

and using Yj = ln Zij = ln Z(tij) = lng(tj, 0i) + ln ej = g(tij, 0i) + e j, as transformed degradation data, estimation procedures with additive noise are available (with suitable assumptions on eij).

Numerous models are based on (3) and (4) or (5) and differ only by assumptions made on the function g (or g) and the distributions of 9 and e (or e). [LM93] and [ME98] for instance use non linear mixed effect models in which g has a specified shape, 9 follows a multivariate gaussian distribution and unknown parameters are estimated by maximizing the global likelihood function. The distribution function of T0 is numerically estimated with Monte-Carlo method. The method is illustrated on the famous Fatigue Crack Growth Data by [BK85]. In the same framework and on the same data set, [RC00] use bayesian approach to estimate the unknown reliability characteristics such the failure time distribution of T0 and [GLJ03] propose an estimation procedure based on artificial neural network.

A slightly different approach consists in supposing that the unit-to-unit variability of the degradation paths has a part explained by the environmental conditions, that is 9 is a random parameter whose distribution depends on a external covariate describing the stress. In a purely parametric model, [YU03] and [YU04] study optimal designs of accelerated degradation experiments where a transformed degradation function is g(t) = -¡ta, a is fixed and known, ¡3 follows a Log-Normal(^,a2) or a reciprocal-Weibull distribution (with parameters varying with the stress) and the errors are gaussian.

Also [BBK04] study a linear degradation model with multiple failure modes. The degradation process is Z(t) = t/9 and is observed without error. This is a very simple path model but the complexity relies on the fact that no parametric assumption is made on the distribution of 9 and some competing failure times are censoring the degradation process.

2.2 Failure times with degradation-dependent hazard rate

The failure time T0 defined above and named soft failure is directly due to the evolution of the degradation because it is a crossing time. Another way to model a link between the degradation and a failure time is to consider that the degradation level is an internal covariate influencing the survival function of a traumatic failure T, through the conditional definition

P(T > tIZ(s), 0 < s < t)= exp (- XT(sIZ(u), 0 < s < u)ds

[COX99] notes that the essential point is that given the degradation history {Z (s), 0 < s <t}, the hazard rate Xt does not depend on time and considers as an example that XT(tIZ(u), 0 < u < t) = a + 3Z(t).

Remark 1 : This definition is equivalent to assuming that T is the first time a doubly stochastic Poisson process with intensity XT(sIZ(u), 0 < s < u) jumps (see [BN02] chap.3).

Remark 2 : This definition has an analogy with the hitting-time definition of lifetime seen in Sect. 2.1. In fact for T0 defined as the first time the degradation process reaches the threshold zo, we have

P(T0 > t\Z(s), 0 < s < t) = P(z0 > Z(t)\Z(s), 0 < s < t)

where Go is the survival function of zo. Hence, if zo is fixed then

because Go is the survival function of the dirac random variable whose realizations give zo almost surely.

Definition (2) is related to some well known accelerated failure time model with external time-varying covariate X(t),t > 0 satisfying

P (T > t\X (s), 0 <s <t) = G^J X (s),P)ds^

where G is a survival function and ^ is a positive function in the space of covariate values sometimes called transfer function or ideal time scale ([BN97], [DL00], [DL02]). Such conditional survival functions have also been studied by [YM97].

2.3 The joint model : a mixed regression model with traumatic censoring

Two failure mode are considered here. The failure time To is the first time the unknown real degradation Z(t) = g(t, 0) reaches a given threshold zo. As in Sect. 2.2, the traumatic failure time T is defined through its conditional survival function given the past degradation

P(T > t\Z(s), 0 < s < t) = exp xt(s\Z(u), 0 < s < u)ds^j

Noting that Z is a parameterized degradation function with random parameter 0, we restrict ourself here to the following assumption for conditional hazard rate xt .

XT(t\Z(u), 0 < u < t) = XT(t\0) = X(g(t, 0)) (6)

where X is a nonparametric function in the degradation domain.

3 Some recent results in semiparametric estimation in the general path model

We assume here that the degradation evolves in the same manner an unknown function g growths with time. The function g is a continuous parametrized function with unit-to-unit dependent parameters. Hence the real unobserved degradation of the i-th item is

where 0i G Kp is random vector of parameters with unknown distribution.

In order to get explicit formulae, we first restrict ourself to functions g(., 0) in (7) leading to a linearized problem of estimation. Nonlinear regression models are obviously useful in some applications but need numerical optimizations which do not provide estimates in closed form.

### 3.1 Linear estimation

For each item i, i = 1..n, we assumed that we have at our disposal the survival time Ui, the indicator Si and the degradation data as described in Sect. 2.3. The model considered here assumes that there exists a function F such that the fi available measurements Zi1, ..Zifi at times ti = (t\, ..,tf)' of the degradation level of the i-th item satisfy

where 0 i G Mp is a function of the random vector 0i G Mp and ti = (t 1, ..,tf)' is a fi x p-design matrix whose j-th raw is the raw-vector tj G Kp function of tj. Finally (tj)j=1..,fi is the vector of additive noises of the i-th item. Thus the function F permits to get a linear expression of the degradation in a transformed scale of time and we denote by Yi the vector of measures for the item i. For instance in [BN04], the real degradation of the i-th item, i=1,..,n, is Z(t) = ea1 (1 + t)a2, t e R+ where (a\,al2) = O* is a random vector with unknown distribution function Fg, and the data consist in Zij = Z(tij)Uij,j = 1,..,fi, i = 1,..,n. In fact, taking Yj = lnZj, we get that Yij = a\ + ai2 ln(1 + tij ) + ln eij and thus in that case Oi = Oi,, tj = (1, ln(1 + tij)) and êj = ln tij.

Three main assumptions are made for the correlation structure of the noises namely

H1 (ej)j=i..fi are i.i.d. random variables with

Eêj- = 0, covj j] = a2l{j1=j2} H2 (tj)j=i..fi are identically distributed random variables with E tj =0, cov [ej j ]= a2 (c(tji ) A cj )) where c is a positive nondecreasing function.

H3 (ej)j=i..fi are identically distributed random variables with

If the errors are gaussian, under H3, tj = a2W(c(tj)) where W is a brownian motion. Under H2, if the times (tj )j are equidistant, the ej is a AR(1) sequence of noises.

For item i, denote by a2Si the fi x fi-variance matrix of (ej )j). Hence a predictor for is found by generalized least square estimation

denoting X' for the transposition of matrix X. We recall the following well known results of generalized least square estimation

Proposition 1 : The predictor

6i = (f S-1f)-1 f S-lYi is also the ordinary least square estimate of the transformed data

9i = argminaeRp\\RiYi - Ritia\\, where Ri = (U* )-1 and Ui satisfies S = U* Ui

Proposition 2 :

under H1 we have R =Id, under H2 diag(R) =(1,1/^(4) - c(ti), •••^c(tif. ) - c(tf. -i)), Ri+ii = —Ri+ii+i and Rij = 0 elsewhere, under H3 with equally spaced measures, diag(R) = (— 4>2, 1.., 1), Ri+i,i = and Ri j = 0 elsewhere.

These results use the Cholesky decomposition of matrices and are valid if the nuisance parameter <P and c are known. If ^ is unknown, it can be estimated with two-step procedures or iterative estimation ([SW03], p. 279). Interestingly, under H2 the estimation procedure involves only the standardized increments (Yj — Y—-^ / ^J c(tj) — c(tj_i) and we have

Proposition 3 : In the model

Yij = &[ + ^2 f (tj ) + ?j, where f is any increasing function, under H2 we obtain

where, if we denote dl(tj) = l(tj+i) — l(tj) for any function l with dl(t0) = dl(tf ) = 0 , * fi-i

j=o and ( ) ^ = (4f (tj-i)/c(tj-i) — df (tj )/c(tj ))