## Phfnf e P sup fj exp cme2

6.2 Hellinger and Kullback-Leibler distances.

Let $P$ and $Q$ be two measures, both dominated by a $\sigma$-finite measure $\mu$, and let $H^2(P,Q)$ be the Hellinger distance between $P$ and $Q$,

$$H^2(P,Q) = \frac{1}{2}\int \left(\sqrt{\frac{dP}{d\mu}} - \sqrt{\frac{dQ}{d\mu}}\right)^2 d\mu.$$

Consider the Kullback-Leibler distance

Here $P$ is the probability distribution with density $f$ with respect to the measure $\mu$.

Let $X_1, \ldots, X_n$ be i.i.d. random variables with common distribution $P \in \mathcal{P}$ and density $f \in \mathcal{F}$, and let $P_n$ be the empirical distribution

Suppose we have to estimate the unknown density $f \in \mathcal{F}$ from the observations $X_1, \ldots, X_n, \ldots$ with common distribution $P \in \mathcal{P}$. R. Fisher suggested minimizing the functional $K(\cdot, f)$ on the empirical data to choose the estimator $\hat f_n$. Namely, we put

Here $f$ is the true density. Let $\hat f_n$ be a point of $\mathcal{F}$ which minimizes the functional $K_n(\cdot, f)$:

$$\int \ln\frac{f}{\hat f_n}\,dP_n \le \int \ln\frac{f}{g}\,dP_n \quad \text{for all } g \in \mathcal{F}. \tag{15}$$

The estimator $\hat f_n$ of $f$ defined in (15) is called the maximum likelihood estimator, since $\hat f_n$ is a point of maximum on $\mathcal{F}$ of the likelihood function $L$ (as a function of $g$), $L(g) = \prod_{i=1}^{n} g(X_i)$.
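The equivalence between (15) and likelihood maximization can be checked numerically. The following is a minimal sketch under illustrative assumptions that are not part of the text: a hypothetical one-parameter class of exponential densities on a finite grid of rates, and a simulated sample. The empirical functional $\int \ln(f/g)\,dP_n$ and the log-likelihood differ only by a term that does not depend on $g$, so both criteria should select the same candidate.

```python
import math
import random


def log_density(rate, x):
    """Log of the exponential density rate * exp(-rate * x), x >= 0."""
    return math.log(rate) - rate * x


def empirical_kl(true_rate, rate, sample):
    """Empirical functional (1/n) * sum of ln(f(X_i) / g(X_i))."""
    total = sum(log_density(true_rate, x) - log_density(rate, x) for x in sample)
    return total / len(sample)


def log_likelihood(rate, sample):
    """Log-likelihood ln L(g) = sum of ln g(X_i)."""
    return sum(log_density(rate, x) for x in sample)


random.seed(0)
true_rate = 2.0                                # assumed true parameter
sample = [random.expovariate(true_rate) for _ in range(500)]
grid = [0.5 + 0.1 * k for k in range(40)]      # hypothetical candidate class

# Minimizer of the empirical Kullback-Leibler functional over the grid.
kl_choice = min(grid, key=lambda r: empirical_kl(true_rate, r, sample))
# Maximizer of the log-likelihood over the same grid.
ml_choice = max(grid, key=lambda r: log_likelihood(r, sample))
```

Note that the true density enters `empirical_kl` only through an additive constant, which is why the estimator can be computed without knowing $f$.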

Let $\mathcal{S}$ and $\mathcal{G}$ be two classes of nonnegative functions such that for any $s \in \mathcal{S}$ and $g \in \mathcal{G}$

We suppose that

We denote by $P(f)$ the distribution with density $f$.

Now suppose we have to estimate the unknown function $s \in \mathcal{S}$ from the observations $X_1, \ldots, X_n, \ldots$ with common distribution $P \in \mathcal{P}$ and density $f = s g \in \mathcal{F}$. The maximum likelihood estimator $\hat s_n$ of $s$ is defined by the relation

$$\int \ln\frac{s}{\hat s_n}\,dP_n \le \int \ln\frac{s}{g}\,dP_n, \quad g \in \mathcal{S}. \tag{16}$$

Estimation Of Density For Arbitrarily Censored And Truncated Data

Lemma 2. Suppose that $P$ is the true distribution with density $f = s g$; then

Proof. It is clearly sufficient to prove that

$$\int \ln\frac{\hat s_n}{s}\,dP_n \ge 0.$$

This follows from (16) by taking $g = s$.

Lemma 3. Suppose that $g_n$ is a nonnegative random function such that

$P(s g_n)$ is the distribution with density $s g_n$; then

Proof. Lemma 4 can be proved in the same way as Lemma 2. Denote by $\hat g_n$ the maximum likelihood estimator of $g$:

$$\int \ln\frac{g}{\hat g_n}\,dP_n \le \int \ln\frac{g}{h}\,dP_n, \quad h \in \mathcal{G}.$$

Corollary 1. Let $P(s \hat g_n)$ be the distribution with density $s \hat g_n$; then

$$0 \le \int_{f>0} \ln\frac{s}{\hat s_n}\,dP(s \hat g_n) \le \int_{f>0} \ln\frac{\hat s_n}{s}\,d\bigl(P_n - P(s \hat g_n)\bigr).$$

Corollary 2. Let $P(s \hat g_n)$ be the distribution with density $s \hat g_n$; then $\int |s - \hat s_n|\,\hat g_n\,d\mu \le$

Proof. Corollary 1 follows from Lemma 4 if we take $g_n = \hat g_n$.

Lemma 4 (Sara van de Geer). Let $P$ be the true distribution with density $f \in \mathcal{F}$, and let $\hat f_n$ be the maximum likelihood estimator of $f$; then

$$h^2(\hat f_n, f) \le \int \left(\sqrt{\frac{\hat f_n}{f}} - 1\right) d(P_n - P).$$
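The inequality of Lemma 4 can be illustrated on a finite sample space. The sketch below rests on assumptions that are not in the text: the normalization $h^2(p,q) = \frac{1}{2}\sum_k (\sqrt{p_k} - \sqrt{q_k})^2$, a hypothetical true probability mass function on four points, and the class of all probability mass functions on those points, for which the maximum likelihood estimator is the vector of empirical frequencies.

```python
import math
import random


def hellinger_sq(p, q):
    """Squared Hellinger distance with the assumed factor 1/2."""
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def empirical_process_term(f_hat, f, freq):
    """Integral of (sqrt(f_hat / f) - 1) with respect to P_n - P."""
    return sum((math.sqrt(fh / ft) - 1.0) * (pn - ft)
               for fh, ft, pn in zip(f_hat, f, freq))


random.seed(1)
f = [0.1, 0.2, 0.3, 0.4]     # hypothetical true probability mass function
n = 200
draws = random.choices(range(len(f)), weights=f, k=n)

# Empirical distribution P_n as a vector of relative frequencies.
freq = [draws.count(k) / n for k in range(len(f))]
# Over the class of all pmfs on these points the MLE equals P_n.
f_hat = freq

lhs = hellinger_sq(f_hat, f)
rhs = empirical_process_term(f_hat, f, freq)
```

In this setting the bound `lhs <= rhs` holds for every sample, since the difference `rhs - lhs` equals $\int (\sqrt{\hat f_n / f} - 1)\,dP_n$, which is nonnegative by (15) and the inequality $\ln x \le x - 1$.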

Lemma 5. Suppose that $g_n$ is a nonnegative random function such that

$P(s g_n)$ is the distribution with density $s g_n$; then

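$$\frac{1}{2}\int_{f>0} \left(\sqrt{\hat s_n} - \sqrt{s}\right)^2 g_n\,d\mu \le \int_{f>0} \left(\sqrt{\frac{\hat s_n}{s}} - 1\right) d\bigl(P_n - P(s g_n)\bigr). \tag{17}$$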

Proof. We rewrite the proof of Sara van de Geer [VdG93].

$$= \int_{f>0} \left(\sqrt{\frac{\hat s_n}{s}} - 1\right) d\bigl(P_n - P(s g_n)\bigr) + \int_{f>0} \left(\sqrt{\frac{\hat s_n}{s}} - 1\right) dP(s g_n).$$

Since

$$\int_{f>0} \left(1 - \sqrt{\frac{\hat s_n}{s}}\right) dP(s g_n) = \frac{1}{2}\int_{f>0} \left(\sqrt{s} - \sqrt{\hat s_n}\right)^2 g_n\,d\mu,$$

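Corollary 3. Let $P(s \hat g_n)$ be the distribution with density $s \hat g_n$; then

$$\frac{1}{2}\int_{f>0} \left(\sqrt{s} - \sqrt{\hat s_n}\right)^2 \hat g_n\,d\mu \le \int_{f>0} \left(\sqrt{\frac{\hat s_n}{s}} - 1\right) d\bigl(P_n - P(s \hat g_n)\bigr).$$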

6.3 Estimation in the presence of a nuisance parameter

Now we consider the following case

$$\mathcal{F} = \{f : f(x) = f_{s,g}(x) = s(x)\,g(x) \ \text{for some } s \in \mathcal{S} \text{ and } g \in \mathcal{G}\}.$$

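Here $s$ is the parameter of interest and $g$ is the nuisance parameter. Let $\rho$ be a metric on $\mathcal{S}$. Denote

$$\delta(\varepsilon) = \inf_{s, s^*, g} h(f_{s,g}, f_{s^*,g}), \tag{18}$$

where the infimum is taken over all $g \in \mathcal{G}$ and all $s, s^* \in \mathcal{S}$ such that $\rho(s, s^*) \ge \varepsilon$. It is clear that if $h(f_{s,g}, f_{s^*,g}) < \delta(\varepsilon)$, then $\rho(s, s^*) < \varepsilon$.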

follows

But condition (19) is never satisfied. Therefore we need to assume that the infimum in (18) is taken over all $g \in \mathcal{G}$ such that $g \in V(g^*)$.

Here $g^*$ is a known point of $\mathcal{G}$, and $V(g^*)$ is a neighborhood of $g^*$. It is clear that the function $\delta(\varepsilon)$ depends on $V(g^*)$.

We denote by $P_f$ the distribution with density $f$.

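Lemma 6. Let $\hat s_n, \hat g_n$ be estimators of $s, g$, and let $V(\hat g_n)$ be a neighborhood of $\hat g_n$ such that

$$\sup_{f = f_{s,g} \in \mathcal{F}} P_f\bigl\{h(f_{s,g}, f_{\hat s_n, \hat g_n}) > \varepsilon\bigr\} \to 0 \quad \text{as } n \to \infty$$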

Let W,Wi,..., Wn be i.i.d. random vectors, W = (L(X), R(X), L(Z)), with unknown density v f f(x) dx p(u, v, z) = pr' f (u, v, z) = r(u, v, z) x

We use the notation $\hat f_n$ for the maximum likelihood estimator of $f$.

We suppose that the baseline density $r$ and the density $f$ belong to given sets $\mathcal{G}$ and $\mathcal{F}$, respectively, and denote

We assume that the parametric set $\mathcal{P}$ is totally bounded in the Hellinger metric. Moreover, for a constant $C = C_{\mathcal{P}}$ and $\varepsilon > 0$ there exist finite coverings

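$$\mathcal{V}(\varepsilon) = \{V(f_i^L, f_i^R),\ i = 1, \ldots, m\} \quad \text{and} \quad \mathcal{W}(\varepsilon) = \{W(r_j^L, r_j^R),\ j = 1, \ldots, k\},$$

$$\mathcal{F} \subset \bigcup_{i=1}^{m} V(f_i^L, f_i^R), \qquad \mathcal{G} \subset \bigcup_{j=1}^{k} W(r_j^L, r_j^R);$$

and a finite covering

$$\mathcal{U}(\varepsilon) = \{U(p_{ij}^L, p_{ij}^R),\ i = 1, \ldots, m;\ j = 1, \ldots, k\}$$

of the set $\mathcal{P}$, $\mathcal{P} \subset \bigcup_{i,j} U(p_{ij}^L, p_{ij}^R)$, such that

C1 $\{p : p = p_{r,f} \text{ for some } r \in W(r_j^L, r_j^R),\ f \in V(f_i^L, f_i^R)\} \subset U_{ij} = U(p_{ij}^L, p_{ij}^R)$;

C2 $h(p_{ij}^L, p_{ij}^R) \le \varepsilon$, $h(f_i^L, f_i^R) \le \varepsilon$;

C3 $\int p_{ij}^R\,dz \le C_{\mathcal{P}}$, $\int f_i^R\,dx \le C_{\mathcal{P}}$;

C4 for any $\varepsilon > 0$, $z_0 > 0$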

Theorem 2 (consistency of the nonparametric maximum likelihood estimate of $f$). Under conditions C1-C4, for any $\varepsilon > 0$

$$\sup_{p = p_{r,f} \in \mathcal{P}} P\bigl\{h(\hat f_n, f) > \varepsilon\bigr\} \to 0 \quad \text{as } n \to \infty.$$

References

[BiM98] L. Birgé and P. Massart. Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli, 4:329-375, 1998.

[DeL01] L. Devroye and G. Lugosi. Combinatorial methods in density estimation. Springer-Verlag, 2001.

[Fin04] J. P. Fine, M. R. Kosorok, and B. L. Lee. Robust inference for univariate proportional hazards frailty regression models. The Annals of Statistics, 32, 4:1448-1491, 2004.

[FMD93] D. M. Finkelstein, D. F. Moore, and D. A. Schoenfeld. A proportional hazard model for truncated AIDS data. Biometrics, 49:731-740, 1993.

[Tur76] B. W. Turnbull. The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society, 38:290-295, 1976.

[HbV04] C. Huber-Carol and F. Vonta. Semiparametric transformation models for arbitrarily censored and truncated data. In "Parametric and Semiparametric Models with Applications to Reliability, Survival Analysis and Quality of Life", Birkhäuser, 167-176, 2004.

[VdG93] S. van de Geer. Hellinger-consistency of certain nonparametric maximum likelihood estimators. The Annals of Statistics, 21:14-44, 1993.

[WSh95] Wing Hung Wong and Xiaotong Shen. Probability inequalities and convergence rates of sieve MLEs. The Annals of Statistics, 23, 2:339-362, 1995.

[Sh97] Xiaotong Shen. On methods of sieves and penalization. The Annals of Statistics, 6:339-362, 1997.

[ACo96] A. Alioum and D. Commenges. A proportional hazard model for arbitrarily censored and truncated data. Biometrics, 52:512-524, 1996.

[Fry94] H. Frydman. A note on nonparametric estimation of the distribution function from interval-censored and truncated observations. Journal of the Royal Statistical Society, Series B, 56:71-74, 1994.

[NiS02] M. Nikulin and V. Solev. Testing problem for increasing function in a model with infinite dimensional parameter. In: C. Huber-Carol, N. Balakrishnan, M. Nikulin, M. Mesbah (eds), Goodness-of-fit Tests and Model Validity. Birkhäuser: Boston, 477-494, 2002.

[NiS04] M. Nikulin and V. Solev. Problème de l'estimation et ε-entropie de Kolmogorov. In: E. Charpentier, A. Lesne, N. Nikolski (eds), L'Héritage de Kolmogorov en mathématiques. Belin: Paris, 121-150, 2004.

Statistical Analysis of Some Parametric Degradation Models *

Waltraud Kahle and Heide Wendt

Otto-von-Guericke-University, Faculty of Mathematics, D-39016 Magdeburg, Germany

Summary. The applicability of purely lifetime-based statistical analysis is limited for several reasons. If the random event is the result of an underlying observable degradation process, then it is possible to estimate the parameters of the resulting lifetime from observations of this process. In this paper we describe the degradation by a position-dependent marked doubly stochastic Poisson process. The intensity of such a process is the product of a deterministic function and a random variable $Y$, which leads to an individual intensity for each realization. Our main interest is in estimating the parameters of the distribution of $Y$ under the assumption that the realization of $Y$ is not observable.

### 1 Introduction

One of the simplest models for describing degradation is the Wiener process with linear drift. The design of the mathematical model is based on the assumption of an additive accumulation of degradation without any variation in the tendency of the degradation intensity. Some such models and their parameter estimates are described in [KaL98]. A similar model and its application in medicine are described in [DoN96]. Several generalizations of this model have been given. It is possible to include measurement errors [Whi95] or to transform the time scale [WhS97]. Some more general models have been developed in [BaN01] and [BBK02]. The advantages of using the Wiener process and its generalizations for describing the damage process are, first, its simple form (at least for the univariate Wiener process) and, second, that a statistical analysis can be carried out for observations at any discrete time points. But these models also have disadvantages. First, the damage may decrease on any interval, which is difficult to interpret in practical applications. Second, these models become very complicated if a nonlinear drift is assumed. Yet for many products we can expect increasing damage that accelerates over time.

*This research was supported by DFG # Ka 1011/3-1

In this paper we consider a degradation process $(Z_t)$ whose paths are monotone increasing step functions. To model it, we use marked point processes $\Phi = ((T_n, X_n))_{n \ge 1}$, presented in detail e.g. in [LaB95] or [ABG93]. The cumulative process $(Z_t)$ is assumed to be generated by a position-dependent marking of a doubly stochastic Poisson process $(T_n)$. The doubly stochastic Poisson process was introduced by Cox [Cox55]. Cramér [Cra66] applied it in risk theory, and Grandell [Gra91] gave a detailed discussion of these processes and their impact on risk theory. Further applications of the doubly stochastic Poisson process $(T_n)$ may be found in reliability theory, medicine and queuing theory [Bre81], [ABG93], [Gra97]. Our aim is to describe suitable models for degradation accumulation. In Section 2 the model is described. In Sections 3 and 4 maximum likelihood and moment estimates are found for the (in our view) most interesting parameters of the model. Section 5 contains some results of a simulation study.

We consider a shock model which is well known in reliability. The random variable $T_n$ ($n \ge 1$) is the time of the $n$-th shock. We suppose

$$T_n < T_{n+1} \ \text{if } T_n < \infty \quad \text{and} \quad T_n = T_{n+1} = \infty \ \text{otherwise}.$$

Every shock causes a random increment of degradation. The size of the $n$-th increment of the cumulative degradation process $(Z(t))_{t \ge 0}$ is given by a nonnegative random variable $X_n$ ($n \ge 1$). Thus,

$$Z(t) = \sum_{n=1}^{\infty} I(T_n \le t) \cdot X_n$$

describes the total amount of degradation at time $t$, where $I(T_n \le t)$ is an indicator function, equal to 1 if $T_n \le t$ and to 0 otherwise. The sequence $\Phi = ((T_n, X_n))$ is called a marked point process, and $\Phi(t)$ is defined as the random variable representing the number of events that have occurred up to time $t$. Frequently, it is of interest to discuss the first passage problem, in which the process $(Z_t)$ exceeds a pre-specified constant threshold level $h > 0$ for the first time. This first passage time is the random lifetime of the item. It is also possible to include a (random) initial state $X_0$ at time $T_0 := 0$. Then the corresponding first passage time $Z_h$ is given as

Note that $Z_h$ coincides with some $T_m$, $m \in \mathbb{N}$.
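The shock model can be sketched in a few lines of simulation code. Everything specific here is an illustrative assumption rather than the model of this paper: shocks arrive as a homogeneous Poisson process with a hypothetical rate, increments are i.i.d. exponential, there is no initial state $X_0$, and the first passage time is taken to be the first shock time at which the cumulative degradation reaches or exceeds $h$.

```python
import random


def simulate_shocks(rate, horizon, mean_increment, rng):
    """Return shock times T_n <= horizon and the increments X_n.

    Assumptions (not from the text): homogeneous Poisson shocks with
    intensity `rate` and i.i.d. exponential increments.
    """
    times, increments = [], []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        increments.append(rng.expovariate(1.0 / mean_increment))
        t += rng.expovariate(rate)
    return times, increments


def degradation(t, times, increments):
    """Z(t): sum of X_n over all shocks with T_n <= t."""
    return sum(x for s, x in zip(times, increments) if s <= t)


def first_passage(h, times, increments):
    """First shock time at which the cumulative degradation reaches h.

    Returns None if the level h is not reached within the simulated path.
    """
    total = 0.0
    for s, x in zip(times, increments):
        total += x
        if total >= h:
            return s
    return None


rng = random.Random(42)
times, increments = simulate_shocks(rate=1.5, horizon=50.0,
                                    mean_increment=1.0, rng=rng)
z_h = first_passage(10.0, times, increments)
```

Because the paths of $Z(t)$ are step functions that jump only at the shock times, `first_passage` only needs to inspect the shock times, which reflects the remark that $Z_h$ coincides with some $T_m$.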

Now we make some assumptions to specify the degradation model.

2.1 The distribution of $(T_n)$

The cumulative stochastic intensity $\nu(t)$ of $(T_n)$ is assumed to be given by $\nu(t) = Y \cdot \eta(t)$, where $\eta(t)$ is a deterministic function with derivative $\xi(t) > 0$. Hence, given the outcome $Y = y$, the random number $\Phi(t)$ of shocks up to time $t$ is Poisson distributed with mean $y \cdot \eta(t)$. Each realization of the degradation process has its own individual intensity. Consequently, it is possible to model different environmental conditions or different frailties for each individual. The unconditional distribution of $\Phi(t)$ is given by