## Qi

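$$
\begin{aligned}
F(t \mid t_1, 2, z) &= \exp\left[-e^{\beta_{24}^T z}\,[A_{24}(t) - A_{24}(t_1)]\right], \\
F(t \mid t_2, 3, z) &= \exp\left[-e^{\beta_{34}^T z + \rho}\,[A_{34}(t) - A_{34}(t_2)]\right], \\
f(t, j \mid t_0, j_0, z) &= \ldots, \\
f(t, j \mid t_1, j_1, z) &= \alpha_{j_1 j}(t)\, e^{\beta_{j_1 j}^T z}\, F(t \mid t_1, j_1, z), \quad (j_1, j) = (1, 3), (2, 4), \\
f(t, j \mid t_2, j_2, z) &= \alpha_{j_2 j}(t)\, e^{\beta_{j_2 j}^T z + \rho}\, F(t \mid t_2, j_2, z), \quad (j_2, j) = (3, 4).
\end{aligned}
$$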
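The conditional survival functions above have the form $\exp\{-e^{\beta^T z + \rho}[A(t) - A(t_{\text{entry}})]\}$, where $A$ is a cumulative baseline hazard. A minimal Python sketch of that computation follows. It assumes a piecewise-constant baseline hazard on a fixed grid, which is an illustrative choice suggested by the interval partition used in the analysis; the function names, the grid and the hazard values are hypothetical.

```python
import numpy as np

def cumulative_hazard(t, cuts, rates):
    """Cumulative baseline hazard A(t) for a piecewise-constant hazard.

    cuts:  increasing interval endpoints c_0 < c_1 < ... < c_P (hypothetical grid)
    rates: hazard value on each interval [c_{p-1}, c_p), length P
    """
    cuts = np.asarray(cuts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    # Time spent in each interval up to t, clipped to the interval width.
    exposure = np.clip(t - cuts[:-1], 0.0, np.diff(cuts))
    return float(np.sum(rates * exposure))

def conditional_survival(t, t_entry, z, beta, cuts, rates, rho=0.0):
    """exp(-exp(beta'z + rho) * [A(t) - A(t_entry)]) for t >= t_entry."""
    risk = np.exp(np.dot(beta, z) + rho)
    increment = (cumulative_hazard(t, cuts, rates)
                 - cumulative_hazard(t_entry, cuts, rates))
    return float(np.exp(-risk * increment))

# Usage: 10 equidistant intervals on (0, 6.8); the hazard values are made up.
cuts = np.linspace(0.0, 6.8, 11)
rates = np.full(10, 0.1)
survival = conditional_survival(3.0, 1.0, z=[1.0, 0.0], beta=[0.5, -0.2],
                                cuts=cuts, rates=rates)
```

Setting `rho=0.0` gives the form without the additional shift; a nonzero `rho` multiplies the hazard by $e^{\rho}$.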

The first three terms of the density $p(V, Z_0, \psi)$ represent likelihood contributions corresponding to subjects who are censored, whereas the last two terms are likelihood contributions for subjects who died of melanoma.

For any measurable function $\phi(V, Z)$, its conditional expectation given the data is

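$$
\begin{cases}
\dfrac{\sum_{m=0,1} \phi(V, l, m)\, e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)}{\sum_{m=0,1} e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)} & \text{if } R = (1, 0),\ Z_{01} = l, \\[2ex]
\dfrac{\sum_{l=0,1} \phi(V, l, m)\, e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)}{\sum_{l=0,1} e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)} & \text{if } R = (0, 1),\ Z_{02} = m, \\[2ex]
\dfrac{\sum_{l,m=0,1} \phi(V, l, m)\, e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)}{\sum_{l,m=0,1} e^{\theta_{lm}^T Z_1}\, p(V, l, m, \psi)} & \text{if } R = (0, 0).
\end{cases}
$$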
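The conditional expectation is a weighted average of $\phi$ over the values of the missing binary covariates, with the observed components held fixed. A minimal Python sketch, under stated assumptions: `weight(l, m)` is a hypothetical callable standing for the product of the exponential term and the density evaluated at the completion $(l, m)$, and `observed` encodes the missingness pattern by using `None` for a missing component.

```python
def conditional_expectation(phi, weight, observed):
    """Weighted average of phi over completions of two binary covariates.

    phi:      callable (l, m) -> float, the function being averaged
    weight:   callable (l, m) -> float, unnormalized weight of completion (l, m)
              (hypothetical stand-in for the exponential term times the density)
    observed: tuple (l, m), with None marking a missing component
    """
    l_values = (0, 1) if observed[0] is None else (observed[0],)
    m_values = (0, 1) if observed[1] is None else (observed[1],)
    numerator = 0.0
    denominator = 0.0
    for l in l_values:
        for m in m_values:
            w = weight(l, m)
            numerator += phi(l, m) * w
            denominator += w
    return numerator / denominator

# Usage with made-up weights: first covariate observed as 1, second missing.
table = [[1.0, 2.0], [3.0, 4.0]]
value = conditional_expectation(lambda l, m: l + m,
                                lambda l, m: table[l][m],
                                observed=(1, None))
```

When both components are observed the sums collapse to a single term and the function returns `phi(l, m)` itself.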

In the EM algorithm, we replace the density $p(V, Z_0, \psi)$ by the function $p(V, Z_0, \psi_q)$, where $\psi_q$ is the estimate of the parameter $\psi$ at the $q$-th step of the algorithm. In particular, in the case of completely observable covariates, the score function for estimation of the parameter $\theta$ is given by

for (i, m) = 0. In the case of the missing covariates, the corresponding score function is

V^Q2(4\4) = £ Zhk [g(i, m\Zik, 0) - g(i, m\Zik, 0)]

k=i where

The conditional expectations entering the score function for the regression coefficients $\beta$ are also simple to evaluate.

Numerous studies have shown that females have better survival rates than males. The primary reason for the better outcome among women is that their melanomas tend to occur more frequently on the extremities, which is a more favorable location. Patients with melanomas located on the extremities generally have a better survival rate than patients whose primary lesion is located on the trunk or the head and neck. This is also shown by our results in Table 3.1. We found that a primary tumor located on the extremities decreased the risk of all transitions. The positive site × gender interaction for all transitions, except 0 → 1, indicates that the primary site (extremities) decreases the rate of transitions among the various states of the model, but this effect is less marked among men.

Further, age was associated with an increased risk of lymph node and distant metastases. Males experienced an increased risk of transition from state 0 to state 2; however, the negative age × gender interaction suggests that this increased risk is less pronounced among older men. Among pathological factors, Clark's level of invasion > III and Breslow's depth > 1.5 mm are associated with an increased risk of transition from state 0 to both states 1 and 2.

Table 3.1 compares results obtained from two regression analyses corresponding to the MAR model and the "case deletion" model (parenthesized regression coefficients and standard errors). In both cases we partitioned the observed range of the transition times into 10 equidistant intervals. The range of the observed transition times was (0, 6.8) years for transition 0 → 1, (0.93, 9.75) years for transition 0 → 2, (0.69, 8.34) years for transition 2 → 3, and (1.39, 9.08) years for transition 3 → 4. For transitions 0 → 2, 2 → 3 and 3 → 4, the results from both analyses are quite similar, though the standard errors of the estimates based on the MAR model are uniformly smaller as a result of the increased sample size. On the other hand, for the transition from state 0 to state 1, the results differ. The MAR model suggests that the location of the primary tumor and the site × gender interaction are important risk factors for progression from state 0 into state 1, whereas the "case deletion" model does not identify these factors as significant.
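The contrast between the two analyses for transition 0 → 1 can be illustrated with the ratio of each coefficient to its standard error. The sketch below uses the estimates reported in Table 3.1 for site and site × gender; the use of a two-sided Wald test with a normal approximation and a 0.05 threshold is an assumption made for illustration, since the text does not state which significance criterion was applied.

```python
import math

def wald_p_value(beta, se):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# (coefficient, standard error) pairs for transition 0 -> 1, copied from Table 3.1.
estimates = {
    ("site", "MAR"): (-0.47, 0.14),
    ("site", "case deletion"): (-0.25, 0.19),
    ("site x gender", "MAR"): (0.42, 0.20),
    ("site x gender", "case deletion"): (0.41, 0.24),
}

p_values = {key: wald_p_value(b, s) for key, (b, s) in estimates.items()}
for (factor, model), p in p_values.items():
    print(f"{factor:14s} {model:14s} p = {p:.3f}")
```

Under these assumptions both factors fall below 0.05 in the MAR analysis and above it in the case deletion analysis, mainly because the case deletion standard errors are larger.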

### Appendix 1

Let $\dot g_\theta$ and $\ddot g_\theta$ denote the first and second derivatives of the density $g_\theta$ with respect to $\theta$, and for $k = 1, \dots, n$, let $M_{hkk}(t, \cdot) = N_{hkk}(t) - Z_{hk} A_{h,k}(t)$. We use the formula of Louis (1982) to obtain the observed information,

^n(6) = Sri[£iri(^) - 12n(6)] , where the first term is an estimate of the complete information and the second is an estimate of the expected conditional covariance of the score function given the data. The matrix Sin(6) is the negative Hessian of the Qn(6\6) function with respect to the first argument. Similarly, E2n(6) is the negative derivative of ^Qn(6\6) with respect to the second argument. For q = 1, 2, we have

Sin3(6) = diag[E n=iNh'k (Ip) : p =1,..., £(n), h G Eo] , aph
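The structure of the Louis (1982) identity used here, complete-data information minus the conditional covariance of the complete-data score, can be sketched numerically for a single subject whose missing data take finitely many values. This is a generic illustration rather than the specific matrices of this appendix; the function name and its inputs are hypothetical.

```python
import numpy as np

def louis_observed_information(complete_info, scores, weights):
    """Observed information via Louis's identity for one subject.

    complete_info: (d, d) conditional expectation of the complete-data information
    scores:        (K, d) complete-data score at each of K completions of the
                   missing data
    weights:       (K,) conditional probabilities of the completions (sum to 1)
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean_score = weights @ scores
    centered = scores - mean_score
    # Conditional covariance of the complete-data score given the observed data.
    covariance = (centered * weights[:, None]).T @ centered
    return np.asarray(complete_info, dtype=float) - covariance

# Usage with made-up numbers: two equally likely completions.
info = louis_observed_information(
    complete_info=2.0 * np.eye(2),
    scores=[[1.0, 0.0], [-1.0, 0.0]],
    weights=[0.5, 0.5],
)
```

Summing such per-subject contributions over $k = 1, \dots, n$ would give the sample-level quantity; with no missing data the covariance term vanishes and the observed and complete information coincide.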

Estimates from the "case deletion" model are given in parentheses.

| Factor | 0 → 1: $\beta$ | 0 → 1: se | 0 → 2: $\beta$ | 0 → 2: se |
|---|---|---|---|---|
| Age | 1.46 (1.31) | 0.38 (0.46) | 1.39 (1.34) | 0.34 (0.41) |
| Gender (male vs female) | | | 0.31 (0.42) | 0.14 (0.18) |
| Clark (> III vs < III) | 0.52 (0.42) | 0.11 (0.13) | 0.49 (0.42) | 0.12 (0.15) |
| Depth (> 1.5 mm vs < 1.5 mm) | 1.50 (1.60) | 0.41 (0.54) | 1.63 (1.67) | 0.53 (0.60) |
| Site (extremities vs other) | -0.47 (-0.25) | 0.14 (0.19) | -1.20 (-1.08) | 0.17 (0.19) |
| Age × gender interaction | | | -0.56 (-0.52) | 0.16 (0.18) |
| Site × gender | 0.42 (0.41) | 0.20 (0.24) | 0.87 (0.74) | 0.20 (0.25) |

| Factor | 2 → 3: $\beta$ | 2 → 3: se | 3 → 4: $\beta$ | 3 → 4: se |
|---|---|---|---|---|
| Age | 1.53 (1.70) | 0.50 (0.59) | -0.43 (-0.31) | 0.34 (0.42) |
| Clark (> III vs < III) | 0.58 (0.66) | 0.96 (1.09) | 0.35 (0.39) | 0.43 (0.50) |
| Depth (> 1.5 mm vs < 1.5 mm) | 0.6 (0.69) | 0.73 (0.82) | 0.8 (0.89) | 0.91 (1.12) |
| Site (extremities vs other) | -0.76 (-0.82) | 0.20 (0.25) | -0.17 (-0.27) | 0.10 (0.13) |
| Age × gender | -0.34 (-0.45) | 0.18 (0.23) | 0.29 (0.32) | 0.10 (0.14) |
| Site × gender | 1.08 (1.25) | 0.25 (0.30) | | |
| Prior lymph node metastasis | NA (NA) | NA (NA) | 0.25 (0.30) | 0.09 (0.20) |

Estimation in a Markov chain regression model with missing covariates

and £r/W = . For q = 2,