## N logYi1

If the assumed parametric model is a good description of the underlying population, then parametric estimators and test procedures based on these estimators provide good results. But if the parametric model is not appropriate, such an approach can lead to wrong conclusions. This is demonstrated in the following: Suppose that a mixture of two Weibull distributions is considered. The first group is characterized by parameters $\beta_1, \nu_1$ and the second by $\beta_2, \nu_2$, and let $p$ be the proportion of the second group. Then the survival function is given by

$$S^*(t) = (1 - p)\exp\left(-(t/\beta_1)^{\nu_1}\right) + p\exp\left(-(t/\beta_2)^{\nu_2}\right) \tag{2}$$
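As a minimal sketch (not taken from the source), the mixture quantities can be evaluated in Python. The function names are hypothetical; the density and the hazard rate follow from (2) through $f^* = -dS^*/dt$ and $\lambda^* = f^*/S^*$, and the shape parameters are assumed to be at least 1 so that the density is finite at $t = 0$.

```python
import math


def weibull_survival(t, beta, nu):
    """S(t) = exp(-(t/beta)^nu) for t >= 0."""
    return math.exp(-((t / beta) ** nu))


def weibull_density(t, beta, nu):
    """f(t) = (nu/beta) * (t/beta)^(nu-1) * exp(-(t/beta)^nu)."""
    return (nu / beta) * (t / beta) ** (nu - 1) * weibull_survival(t, beta, nu)


def mixture_survival(t, p, beta1, nu1, beta2, nu2):
    """S*(t) as in (2): weight 1-p on component 1, weight p on component 2."""
    return ((1 - p) * weibull_survival(t, beta1, nu1)
            + p * weibull_survival(t, beta2, nu2))


def mixture_density(t, p, beta1, nu1, beta2, nu2):
    """f*(t): the same weights applied to the component densities."""
    return ((1 - p) * weibull_density(t, beta1, nu1)
            + p * weibull_density(t, beta2, nu2))


def mixture_hazard(t, p, beta1, nu1, beta2, nu2):
    """lambda*(t) = f*(t) / S*(t)."""
    return (mixture_density(t, p, beta1, nu1, beta2, nu2)
            / mixture_survival(t, p, beta1, nu1, beta2, nu2))


if __name__ == "__main__":
    # Parameter values quoted in the text for Figure 1.
    params = dict(p=0.05, beta1=1.0, nu1=2.0, beta2=4.0, nu2=4.0)
    for t in (0.5, 1.0, 2.0, 3.0):
        print(t, mixture_survival(t, **params), mixture_hazard(t, **params))
```

For large $t$ almost all survivors belong to the second component, so the mixture hazard falls far below the hazard $2t$ of the first component; this is the feature a single Weibull hazard, which is monotone, cannot reproduce.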

For $\beta_1 = 1$, $\beta_2 = 4$, $\nu_1 = 2$, $\nu_2 = 4$ and $p = 0.05$, Figure 1 shows $S^*$, the density $f^*$ and the hazard rate $\lambda^*$ of the mixture (solid line). Further, the main part of the mixture, i.e. $\exp(-(t/\beta_1)^{\nu_1})$, is given in (a); in (b) and (c) you see not only this term of the mixture but also the minor one.

Fig. 1. (a) Survival functions, (b) Densities, (c) Hazard rates, for the main component (dashed line), for the mixture (thin solid line); in (c) the hazard rate for the minor component (bold solid line)

In such a case with a small $p$ one can interpret the second Weibull distribution as a disturbance of the first one, and one would hope that the fit with a single Weibull distribution is sufficiently good. Simulated data with 100 observations from the disturbed Weibull model were used to estimate the parameters $\beta$ and $\nu$ in a single Weibull model with

$$S(t) = \exp\left(-(t/\beta)^{\nu}\right) \quad\text{and}\quad \lambda(t) = \frac{\nu\, t^{\nu-1}}{\beta^{\nu}},$$

which was assumed neglecting the inhomogeneity of the population.

The maximum likelihood estimates, computed according to (1), are $\hat\beta = 1.057$ and $\hat\nu = 1.422$. Substituting these estimates into the functions $S$ and $\lambda$ we get Figure 2.
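The experiment can be imitated with a short Python sketch. It is an illustration under assumptions, not the computation used for the figures: the data are taken to be uncensored, the likelihood equations are the standard ones for an uncensored Weibull sample (assumed here to agree with (1)), the shape equation is solved by bisection, and all function names and the seed are hypothetical. The estimates depend on the simulated sample and will not reproduce the values quoted above exactly.

```python
import math
import random


def sample_mixture(rng, n, p, beta1, nu1, beta2, nu2):
    """Draw n lifetimes: component 2 with probability p, otherwise component 1."""
    # random.weibullvariate(alpha, beta) takes the scale first, then the shape.
    return [rng.weibullvariate(beta2, nu2) if rng.random() < p
            else rng.weibullvariate(beta1, nu1) for _ in range(n)]


def weibull_mle(data, tol=1e-10):
    """ML estimates (beta, nu) of a single Weibull model from uncensored data."""
    logs = [math.log(t) for t in data]
    n = len(logs)
    mean_log = sum(logs) / n
    log_max = max(logs)
    if log_max == min(logs):
        raise ValueError("need at least two distinct observations")

    def weights(nu):
        # t_i^nu divided by max(t)^nu, which keeps every weight in (0, 1].
        return [math.exp(nu * (l - log_max)) for l in logs]

    def score(nu):
        # Profile likelihood equation for the shape parameter; increasing in nu.
        w = weights(nu)
        weighted_log = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted_log - 1.0 / nu - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:          # enlarge the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:          # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    # beta^nu equals the sample mean of t_i^nu.
    beta = math.exp(log_max) * (sum(weights(nu)) / n) ** (1.0 / nu)
    return beta, nu


if __name__ == "__main__":
    rng = random.Random(1)        # arbitrary seed
    data = sample_mixture(rng, 100, p=0.05, beta1=1.0, nu1=2.0,
                          beta2=4.0, nu2=4.0)
    beta_hat, nu_hat = weibull_mle(data)
    print("beta_hat =", beta_hat, "nu_hat =", nu_hat)
```

Bisection is chosen because the score function of the shape parameter is monotone, so a sign change brackets the unique root and no derivative is needed.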

We see: The estimators using the single Weibull model are wrong estimators.

Fig. 2. (a) Survival functions, (b) Hazard rates, (c) Densities, for the estimated single Weibull model (bold dashed line), for the mixture (thin solid line)

This model is unable to detect the features of the underlying functions! Such a mixed distribution arises if the underlying population is not homogeneous. A latent factor, which is not observed, divides the population into (for simplicity) two groups. Further, assume that both groups can be characterized by a Weibull distribution: the first with parameters $\beta_1, \nu_1$ and the second with $\beta_2, \nu_2$, and let $p$ be the proportion of the second group. Latent factors can be: an unobserved underlying disease (depression), different litters in an animal experiment or different producers of a technical component.

## 2 Nonparametric Estimators

### 2.1 Model with censoring

Very often in practical applications the lifetimes $Y_i$ are subject to random right censoring, i.e. some individuals may not be observed for the full time to failure. Thus, our observations are values of r.v.'s $T_i$ which are censored or uncensored. Here we assume a random censoring scheme characterized by i.i.d. r.v.'s $C_i$ which are independent of the $Y$-sequence. Thus, we observe $(T_i, \delta_i)$, $i = 1, \dots, n$, with

$$T_i = \min(Y_i, C_i) \quad\text{and}\quad \delta_i = 1(Y_i \le C_i).$$

The distribution of the observations is described by the distribution function and the subdistribution function of the uncensored observations

$$H(t) := P(T_i < t) \quad\text{and}\quad H^U(t) := P(T_i < t,\ \delta_i = 1).$$
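The observation scheme can be sketched in Python as follows. The sketch assumes the usual random right-censoring construction, $T_i = \min(Y_i, C_i)$ with indicator $\delta_i = 1$ for an uncensored lifetime; the exponential censoring distribution, its rate and all function names are hypothetical choices for illustration. The two helper functions return the sample proportions corresponding to $H$ and $H^U$.

```python
import random


def censor(lifetimes, censoring_times):
    """Return pairs (T_i, delta_i) with T_i = min(Y_i, C_i), delta_i = 1 if Y_i <= C_i."""
    return [(min(y, c), 1 if y <= c else 0)
            for y, c in zip(lifetimes, censoring_times)]


def empirical_H(obs, t):
    """Proportion of observations with T_i < t."""
    return sum(1 for time, _ in obs if time < t) / len(obs)


def empirical_HU(obs, t):
    """Proportion of observations with T_i < t that are uncensored."""
    return sum(1 for time, d in obs if time < t and d == 1) / len(obs)


if __name__ == "__main__":
    rng = random.Random(2)                                   # arbitrary seed
    y = [rng.weibullvariate(1.0, 2.0) for _ in range(100)]   # lifetimes
    c = [rng.expovariate(0.3) for _ in range(100)]           # hypothetical censoring
    obs = censor(y, c)
    print("share of uncensored observations:", sum(d for _, d in obs) / len(obs))
    print("H_n(1) =", empirical_H(obs, 1.0), " HU_n(1) =", empirical_HU(obs, 1.0))
```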

### 2.2 The Nelson-Aalen estimator for the cumulative hazard function

The starting point for constructing estimators of the hazard function $\lambda$ and the survival function $S$ is an estimator for $\Lambda$, the cumulative hazard function defined by

$$\Lambda(t) = \int_0^t \lambda(s)\,ds.$$

Using standard transformations we can write this function in the following form

$$\Lambda(t) = \int_0^t \frac{dH^U(s)}{1 - H(s-)}. \tag{3}$$

The idea for the estimation of $\Lambda$ goes back to [B81]. He proposed to replace the functions $H$ and $H^U$ in (3) by their empirical versions

$$H_n^U(t) = \frac{1}{n}\sum_{i=1}^{n} 1(T_i < t,\ \delta_i = 1), \qquad H_n(t) = \frac{1}{n}\sum_{i=1}^{n} 1(T_i < t). \tag{4}$$

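The resulting estimator is the so-called Nelson-Aalen type estimator

$$\Lambda_n(t) = \int_0^t \frac{dH_n^U(s)}{1 - H_n(s-)}.$$

The explicit formula of $\Lambda_n$ is given by

$$\Lambda_n(t) = \sum_{i:\, T_{(i)} < t} \frac{\delta_{[i]}}{n - i + 1}.$$

Here $T_{(1)} < \cdots < T_{(n)}$ is the order statistic, and $\delta_{[i]} = \delta_j$ if $T_j = T_{(i)}$.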

From this estimator we get the well-known Kaplan-Meier product limit estimator by the transformation

$$S_n(t) = \prod_{i:\, T_{(i)} < t} \left(1 - \frac{\delta_{[i]}}{n - i + 1}\right).$$

Asymptotic properties of these estimators were investigated by several authors, for example by [H81], [LS86] and [MR88].
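Both estimators of this subsection are simple to compute. The Python sketch below is an illustration under assumptions: it takes the data as pairs of observed time and censoring indicator, assumes there are no ties, and uses the strict inequality of (4), which for continuous data differs from the convention with "less than or equal" only at the observation points. Inserting (4) into the integral gives jumps of size (indicator)/(n - i + 1) at the ordered observation times, and the product limit estimator multiplies the factors one minus these jumps. The function names are hypothetical.

```python
def nelson_aalen(obs, t):
    """Sum of delta_[i] / (n - i + 1) over the ordered times T_(i) < t."""
    ordered = sorted(obs)                 # sort by observed time
    n = len(ordered)
    # enumerate is zero-based, so n - i here equals n - (rank) + 1.
    return sum(d / (n - i) for i, (time, d) in enumerate(ordered) if time < t)


def kaplan_meier(obs, t):
    """Product of (1 - delta_[i] / (n - i + 1)) over the ordered times T_(i) < t."""
    ordered = sorted(obs)
    n = len(ordered)
    survival = 1.0
    for i, (time, d) in enumerate(ordered):
        if time < t:
            survival *= 1.0 - d / (n - i)
    return survival


if __name__ == "__main__":
    # Toy data: (observed time, indicator), indicator 0 marks a censored value.
    obs = [(2.0, 1), (1.0, 1), (3.0, 0), (4.5, 1), (0.7, 0)]
    for t in (1.5, 2.5, 5.0):
        print(t, nelson_aalen(obs, t), kaplan_meier(obs, t))
```

Without censoring the product limit estimator reduces to the empirical survival function, which gives a quick sanity check of an implementation.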

### 2.3 A kernel estimator for the hazard function

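The hazard function $\lambda$ is the derivative of the cumulative hazard $\Lambda$. But the estimator $\Lambda_n$ is not differentiable. So we proceed as in nonparametric density estimation. Let us estimate $\lambda$ at a point $t$. Consider a small interval $[t - b, t + b)$ of length $2b$ around $t$. We can approximate $\lambda(t)$ in the following way:

$$\lambda(t) \approx \frac{1}{2b}\int_{t-b}^{t+b} \lambda(s)\,ds = \frac{\Lambda(t + b) - \Lambda(t - b)}{2b} \approx \frac{\Lambda_n(t + b) - \Lambda_n(t - b)}{2b}. \tag{5}$$

The last term in (5) can be written in the form

$$\frac{1}{b}\sum_{i=1}^{n} K^*\!\left(\frac{t - T_{(i)}}{b}\right)\frac{\delta_{[i]}}{n - i + 1},$$

where $K^*(u) = \tfrac{1}{2}\, 1(-1 < u \le 1)$ is the rectangular kernel.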

The first approximation step in (5) yields a systematic error, which becomes small if the length of the interval is small. On the other hand, if $b$ is small, then the second approximation error, the stochastic error, is large, because the interval contains too few observations for stability. To take these tendencies into account, we have to choose $b$ depending on the sample size $n$, $b = b_n$, such that

$$b_n \to 0 \quad\text{and}\quad n b_n \to \infty. \tag{6}$$

Further, it is useful to replace the function $K^*$ by a more general function $K$, one that gives small weights to observations $T_{(i)}$ far away from the point $t$ and large weights to observations very near to the point at which we estimate. This is realized, for example, by taking a symmetric density function for $K$. So, finally we arrive at the following definition:

$$\lambda_n(t) = \frac{1}{b_n}\sum_{i=1}^{n} K\!\left(\frac{t - T_{(i)}}{b_n}\right)\frac{\delta_{[i]}}{n - i + 1}. \tag{7}$$

Here $K : \mathbb{R} \to \mathbb{R}$ is the kernel function and $\{b_n\}$ the sequence of bandwidths satisfying (6). The estimator (7) can be written shortly as

$$\lambda_n(t) = \frac{1}{b_n}\int K\!\left(\frac{t - s}{b_n}\right) d\Lambda_n(s).$$

Several properties of this estimator are known. Let us mention here the papers [SW83], [TW83] and the results in [DS86]. In these papers conditions for consistency are derived and asymptotic expressions for the bias and the variance are given. Diehl and Stute considered an approximation for the difference between the estimator $\lambda_n$ and a smoothed hazard rate by a sum of i.i.d. r.v.'s. On the basis of such a representation limit theorems can be derived.

The following picture shows a nonparametric kernel estimate for the data generated in the simulated model (2). Here the kernel function is the Gaussian kernel, and the bandwidth is $b_n = 0.6$. We see that this estimate reflects the features of the underlying hazard function much better than the parametric estimator.

Fig. 3. True underlying hazard rate (thin) and nonparametric estimate (bold)
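A kernel hazard estimate of this type can be sketched in a few lines of Python. The sketch assumes the estimator has the form of a kernel-weighted sum of the Nelson-Aalen increments, i.e. the sum over $i$ of $K((t - T_{(i)})/b)\,\delta_{[i]}/(n - i + 1)$ divided by $b$, with the Gaussian kernel and the bandwidth 0.6 mentioned above. The simulated sample is taken to be uncensored, and the seed and all function names are hypothetical, so the output is an illustration and not a reproduction of Figure 3.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_hazard(obs, t, b, kernel=gaussian_kernel):
    """(1/b) * sum_i K((t - T_(i)) / b) * delta_[i] / (n - i + 1)."""
    ordered = sorted(obs)                 # pairs (time, indicator), no ties assumed
    n = len(ordered)
    return sum(kernel((t - time) / b) * d / (n - i)
               for i, (time, d) in enumerate(ordered)) / b


def sample_mixture(rng, n, p, beta1, nu1, beta2, nu2):
    """Lifetimes from the two-component Weibull mixture (2)."""
    return [rng.weibullvariate(beta2, nu2) if rng.random() < p
            else rng.weibullvariate(beta1, nu1) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(3)                # arbitrary seed
    lifetimes = sample_mixture(rng, 100, p=0.05, beta1=1.0, nu1=2.0,
                               beta2=4.0, nu2=4.0)
    obs = [(y, 1) for y in lifetimes]     # every lifetime observed (assumption)
    for k in range(0, 9):
        t = 0.5 * k
        print(t, kernel_hazard(obs, t, b=0.6))
```

A kernel with unbounded support such as the Gaussian gives every observation a positive weight, so the estimate is smooth everywhere; the bandwidth alone controls how local the averaging is.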

## 3 Testing the Hazard Rate

Nonparametric estimators of a curve are an appropriate tool in the analysis of data. But sometimes, in practical situations, it is useful to have a parametric model. The advantage of a parametric model is that the parameters have some meaning; very often they can be interpreted. Of course, this holds only if the chosen parametric model is appropriate. Thus, the question arises whether the choice of a certain parametric model can be justified by the data. In this section we propose a test procedure for checking whether a hypothetical model fits the data, that is, we consider the following hypothesis

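$$H : \lambda \in \mathcal{C} \quad\text{vs.}\quad K : \lambda \notin \mathcal{C},$$

where $\mathcal{C}$ is a class of parametric hazard functions $\lambda(\cdot\,; \vartheta)$.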

An example of such a parametric class $\mathcal{C}$ is the set of all Weibull hazards. Further parametric models are given in the book [BN02]. At first sight one would choose as test statistic the deviation of the nonparametric estimator $\lambda_n$, which is a good estimator under the alternative, from a hypothetical hazard with estimated parameter, i.e. from $\lambda(t; \hat\vartheta)$. Here $\hat\vartheta$ is an appropriate estimator of the unknown parameter. But the nonparametric $\lambda_n$ is the result of a smoothing procedure. Recall formula (5): it is an unbiased estimator of

$$\frac{1}{b_n}\int K\!\left(\frac{t - s}{b_n}\right)\lambda(s)\,ds,$$

and not unbiased for the underlying hazard rate. So it seems natural to compare $\lambda_n$, which smooths the data, with a smoothed version of the hypothesis. Thus, we will take the difference between $\lambda_n$ and the smoothed hypothetical hazard defined by