## Estimation in a Markov chain regression model with missing covariates

Dorota M. Dabrowska¹, Robert M. Elashoff¹ and Donald L. Morton²

¹ Department of Biostatistics, University of California, Los Angeles, CA 90095-1772

² John Wayne Cancer Institute, Santa Monica, CA 90404

Summary. The Markov chain proportional hazard regression model provides a powerful tool for the analysis of multiple event times. We discuss estimation in absorbing Markov chains with missing covariates. We consider a MAR model assuming that the missing data mechanism depends on the observed covariates, as well as on the number of events observed in a given time period, their types and the times of their occurrence. For estimation purposes we use a piecewise constant intensity regression model.

### 1 Introduction

Missing covariate measurements arise frequently in regression analyses of survival time data. The most common approach to handling such measurements corresponds to the case deletion method. It consists of exclusion of subjects with missing covariates and analysis of the data based on information collected on the remaining subjects. This method can be highly inefficient, and can lead to biased estimates, if complete cases do not form a random sample of the original data (Little and Rubin, 1987).

Several authors have proposed methods for analysis of the proportional hazard model with missing covariates. In particular, Zhou and Pepe (1995) and Lin and Ying (1993) suggested methods for regression analysis in the case of covariates missing completely at random (MCAR). This model assumes that the distribution of the missing data mechanism does not depend on the outcome variables. The approach taken by Zhou and Pepe and by Lin and Ying estimates the regression coefficients from modified partial likelihoods. These are obtained by approximating the conditional expectation of a covariate $Z(t)$ given the risk process, using subjects who have complete measurements and remain at risk for failure at time $t$. Martinussen (1999) and Chen and Little (1999) considered the more parsimonious model assuming that covariates are missing at random (MAR). Under the assumptions of this model, the missing data mechanism may depend on the observed data, but not on the values of the missing covariates. Several methods for handling missing covariates in both the proportional hazard model and parametric survival analysis models were also proposed by Lipsitz and Ibrahim (1996, 1998) and Chen and Ibrahim (2001).
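The practical difference between the two mechanisms can be seen in a small simulation. The sketch below is a toy illustration only: the normal covariate `z`, the outcome `y` and the logistic missingness rule are assumptions made for this example and are not the model studied in this paper. It compares the complete-case mean of a covariate when missingness ignores the data (MCAR) and when it depends on a fully observed outcome (MAR).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)           # covariate subject to missingness (true mean 0)
y = z + rng.normal(size=n)       # fully observed outcome, correlated with z

# MCAR: the chance of observing z ignores all data.
observed_mcar = rng.random(n) < 0.5

# MAR: the chance of observing z depends only on the observed outcome y.
observed_mar = rng.random(n) < 1.0 / (1.0 + np.exp(2.0 * y))

mean_mcar = z[observed_mcar].mean()   # close to the true mean 0
mean_mar = z[observed_mar].mean()     # shifted: complete cases are not a random sample

print(f"complete-case mean under MCAR: {mean_mcar:.3f}")
print(f"complete-case mean under MAR:  {mean_mar:.3f}")
```

Under the MAR rule the complete cases over-represent subjects with small outcomes, so case deletion gives a visibly shifted mean even though the mechanism never looks at the missing values themselves.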

In this paper we consider estimation in a multivariate counting process corresponding to a finite state proportional hazard Markov chain model (Andersen et al., 1993; Andersen, Hansen and Keiding, 1992). As opposed to single endpoint models, here inferences refer to a stochastic process $\{J(t), t \in [0,\tau]\}$ such that at time $t$, $J(t)$ takes on values in a finite set $E = \{1,\dots,k\}$ representing possible events in the evolution of a disease. Along with a possibly censored realization of the process $J(t)$, we also observe a vector of time independent covariates $Z$. The model assumes that conditionally on $Z$, the process $J(t)$ forms an inhomogeneous Markov chain and intensities of transitions among adjacent states have proportional hazard form. In Section 2 we allow some components of the vector $Z$ to be missing. We define a MAR model, assuming that the missing data mechanism depends on the observed covariates as well as the number of events observed during a specified period of time, their types and the times of their occurrence. We consider censoring corresponding to the termination of the study at a fixed time point $\tau$, and random censoring representing an absorbing state of the observed model. For purposes of estimation of the parameters of the Markov chain we use a modification of Freedman's (1982) approach to analysis of the proportional hazard model with piecewise constant hazard rates. The method uses a histogram approximation to the hazard rates and a piecewise linear approximation to the cumulative intensities.
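To make the histogram approximation concrete, the following minimal Python sketch evaluates the piecewise linear cumulative hazard induced by a piecewise constant hazard rate, for a single transition. The grid `cuts`, the rates `rates` and the linear predictor `beta_z` are hypothetical values chosen for illustration; they are not estimates from this paper.

```python
import numpy as np

def cumulative_hazard(t, cuts, rates):
    """Piecewise linear cumulative hazard of a piecewise constant rate.

    cuts  : increasing endpoints 0 = c_0 < c_1 < ... < c_K
    rates : hazard value on each interval (c_{k-1}, c_k], length K
    """
    cuts = np.asarray(cuts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    # Time spent in each interval up to t (zero for intervals starting after t).
    exposure = np.clip(t - cuts[:-1], 0.0, np.diff(cuts))
    return float(np.sum(rates * exposure))

cuts = [0.0, 1.0, 3.0, 6.0]     # hypothetical partition of [0, tau]
rates = [0.10, 0.25, 0.05]      # hypothetical baseline hazard on each piece
beta_z = 0.7                    # hypothetical value of the linear predictor

H = cumulative_hazard(2.0, cuts, rates)    # 0.10 * 1 + 0.25 * 1
survival = np.exp(-np.exp(beta_z) * H)     # sojourn survival under proportional hazards
print(H, survival)
```

Because the rate is constant on each interval, the cumulative hazard is just a weighted sum of exposure times, which is what keeps the likelihood for such models simple.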

In their analysis of MCAR and MAR models, Little and Rubin (1987, p. 90) showed that the likelihoods for estimation of the parameters of interest are the same under the assumptions of both models. More precisely, the likelihoods differ only in the proportionality factors depending on the parameters describing the missing data mechanism at hand. In the present setting, the MCAR model allows for estimation of the unknown parameters as well as estimation of a modified matrix of transition probabilities (Section 2.1). On the other hand, the MAR condition depends on the sequence of states visited and the times of entrance into these states. Similarly to Little and Rubin (1987, p. 90), the likelihood for estimation of the regression coefficients is the same under the MAR and the MCAR model (up to proportionality factors); however, we show that the MAR model does not in general allow for estimation of the transition probabilities among different states of the model.

For illustrative purposes, in Section 3 we use data on 4141 patients diagnosed with malignant melanoma and treated at the John Wayne Cancer Institute, Santa Monica (JWCI) and UCLA. The JWCI/UCLA database was initiated over 30 years ago. This clinico-pathological demographic database was established in the late 1970s and identified as a national resource for melanoma studies. The database has expanded in the number and type of variables included, as well as through the addition of clinical trials data developed by JWCI. Analyses of this database encountered the not uncommon situation where observations on some important variables are missing. Thus, for example, depth of invasion of the primary tumor has not been observed in a moderate number of cases.

### 2 The model and estimation

#### 2.1 The model

Throughout we consider estimation in a finite state Markov chain process. We assume that the chain $\{J(t) : t \in [0,\tau]\}$ is observed over a finite time period ($\tau < \infty$) and its state space $E = \{1,\dots,k\}$ can be partitioned into two disjoint sets, $\mathcal{T} \cup \mathcal{A} = E$, $\mathcal{T} \cap \mathcal{A} = \emptyset$, representing transient ($\mathcal{T}$) and absorbing ($\mathcal{A}$) states. A pair of distinct states $(i,j) \in E \times E$, $i \neq j$, is called adjacent if transition from state $i$ to state $j$ is possible in one step. The collection of all such adjacent pairs is denoted by $E_0$, $E_0 \subset E \times E$.

A Markov chain regression model can be specified in terms of two parameters. They correspond to (i) the joint marginal distribution of the initial state $J_0$ and the covariates $Z = (Z_1,\dots,Z_d)$; and (ii) the conditional cumulative intensity matrix $A(t;z) = [A_{ij}(t;z)]_{i,j \in E}$. The entries of the matrix $A(t;z)$ are given by

$$A_{ij}(t;z) = \int_0^t \alpha_{ij}(u;z)\,du, \quad i \neq j, \qquad A_{ii}(t;z) = -\sum_{j \neq i} A_{ij}(t;z).$$

For $i \neq j$, the functions $\alpha_{ij}(u;z)$ represent conditional hazard rates of one-step transitions among adjacent states of the model. The negatives of the diagonal entries form cumulative hazard functions accounting for the sojourn time in each state of the model. The $(i,j)$ entry of the conditional transition probability matrix

$$P(s,t;z) = [P_{ij}(s,t;z)]_{i,j=1,\dots,k} = [\Pr(J(t) = j \mid J(s) = i, Z = z)]_{i,j=1,\dots,k}$$

provides the conditional probability that the process occupies state $j$ at time $t$, $J(t) = j$, given $Z = z$ and given that at time $s$, $s < t$, the process is in state $i$. The matrix $P(s,t;z)$ solves the Kolmogorov equations

$$P(s,t;z) = I + \int_s^t P(s,u-;z)\,A(du;z) = I + \int_s^t A(du;z)\,P(u,t;z),$$

where $I$ is the identity matrix. Methods for its computation are discussed in Chiang (1968), Aalen and Johansen (1978) and Andersen et al. (1993), among others.
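As a rough numerical illustration of how a transition probability matrix can be obtained from an intensity matrix, the sketch below approximates the product integral of $I + A(du)$ on a fine grid. It assumes piecewise constant intensities for a fixed covariate value, uses 0-based state labels, and works with a hypothetical three-state illness-death chain; none of the rates come from this paper, and the cited references describe more refined methods.

```python
import numpy as np

def generator(k, rates):
    """Intensity matrix with off-diagonal rates {(i, j): rate}; rows sum to zero."""
    Q = np.zeros((k, k))
    for (i, j), a in rates.items():
        Q[i, j] = a
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q

def transition_matrix(s, t, cuts, generators, steps_per_piece=2000):
    """Approximate P(s, t) by multiplying factors I + Q * du in time order."""
    k = generators[0].shape[0]
    P = np.eye(k)
    for lo, hi, Q in zip(cuts[:-1], cuts[1:], generators):
        a, b = max(lo, s), min(hi, t)
        if b <= a:
            continue                      # this piece does not overlap (s, t]
        du = (b - a) / steps_per_piece
        step = np.eye(k) + Q * du         # one-step approximation of I + A(du)
        P = P @ np.linalg.matrix_power(step, steps_per_piece)
    return P

# Hypothetical chain: 0 = disease free, 1 = recurrence, 2 = death (absorbing).
cuts = [0.0, 1.0, 5.0]
generators = [
    generator(3, {(0, 1): 0.2, (0, 2): 0.1, (1, 2): 0.5}),   # rates on (0, 1]
    generator(3, {(0, 1): 0.1, (0, 2): 0.1, (1, 2): 0.3}),   # rates on (1, 5]
]
P = transition_matrix(0.0, 3.0, cuts, generators)
print(np.round(P, 4))
```

Each row of `P` sums to one, and the row of the absorbing state stays equal to the corresponding row of the identity matrix, which gives two quick sanity checks on any such computation.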

Associated with the pair $(Z, \{J(t), t \in [0,\tau]\})$ is a marked point process $\{Z, (T_m, J_m)_{m \ge 0}\}$, where $0 = T_0 < T_1 < \dots < T_m < \dots$ are times of consecutive entrances into the possible states of the model, and $J_0, J_1, \dots, J_m, \dots$ are states visited at these times. Let $W_m = (T_\ell, J_\ell : \ell = 0,\dots,m)$, $m \ge 0$, be the first $m + 1$ pairs in the sequence $(T_m, J_m)_{m \ge 0}$. The assumption that, conditionally on the covariates, the process $J(t)$ forms a Markov chain entails that

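$$\Pr(T_m \le t, J_m = j_m \mid W_{m-1}, Z) = \int_0^t f(u, j_m \mid T_{m-1}, J_{m-1}, Z)\,du,$$

where

$$f(u, j_m \mid t_{m-1}, j_{m-1}, z) = 1(u > t_{m-1})\,F(u \mid t_{m-1}, j_{m-1}, z)\,\alpha_{j_{m-1} j_m}(u;z)$$

and

$$F(u \mid t_{m-1}, j_{m-1}, z) = \exp\Big(-\int_{t_{m-1}}^{u} \sum_{j : (j_{m-1}, j) \in E_0} \alpha_{j_{m-1} j}(v;z)\,dv\Big).$$

The function $F(\cdot \mid t_{m-1}, j_{m-1}, z)$ represents the conditional survival function in state $j_{m-1}$, that is

$$F(u \mid t_{m-1}, j_{m-1}, z) = \Pr(T_m > u \mid T_{m-1} = t_{m-1}, J_{m-1} = j_{m-1}, Z = z), \quad u > t_{m-1}.$$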

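Finally, the probability of one-step transition into state $j$ at time $T_m$ is given by

$$\Pr(J_m = j \mid T_m, W_{m-1}, Z) = \frac{\alpha_{J_{m-1} j}(T_m; Z)}{\sum_{l : (J_{m-1}, l) \in E_0} \alpha_{J_{m-1} l}(T_m; Z)}.$$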

From this it also follows that

$$\lim_{s \downarrow 0} \frac{1}{s} \Pr(T_m \in [t, t+s], J_m = j \mid Z, T_m \ge t, W_{m-1}) = \alpha_{J_{m-1} j}(t; Z).$$
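The sequence of entrance times and states described above can be simulated step by step: draw a sojourn time from the total hazard of leaving the current state, then choose the next state with probability proportional to its transition hazard. The sketch below assumes, for simplicity, hazards that are constant in time for a fixed covariate value, which is a special case of the inhomogeneous model considered here. The state labels and rates are hypothetical.

```python
import numpy as np

def simulate_path(j0, rates, tau, rng):
    """Simulate (T_m, J_m) up to time tau for time-constant hazards.

    rates : dict {i: {j: hazard of the i -> j transition}};
            absorbing states have no entry.
    """
    times, states = [0.0], [j0]
    t, i = 0.0, j0
    while i in rates and rates[i]:
        targets = list(rates[i])
        haz = np.array([rates[i][j] for j in targets])
        t += rng.exponential(1.0 / haz.sum())   # sojourn time in state i
        if t > tau:
            break                               # study ends before the next event
        # Next state chosen with probability proportional to its hazard.
        i = targets[rng.choice(len(targets), p=haz / haz.sum())]
        times.append(t)
        states.append(i)
    return times, states

# Hypothetical chain: 1 = disease free, 2 = recurrence, 3 = death (absorbing).
rates = {1: {2: 0.5, 3: 0.1}, 2: {3: 0.8}}
rng = np.random.default_rng(1)
times, states = simulate_path(1, rates, tau=10.0, rng=rng)
print(list(zip(times, states)))
```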

If we denote by $N_h(t)$ and $Y_h(t)$, $h = (i,j) \in E_0$, the processes

$$N_h(t) = \sum_{m \ge 1} N_{hm}(t), \qquad Y_h(t) = \sum_{m \ge 1} Y_{hm}(t),$$

where

$$N_{hm}(t) = 1(T_{m-1} < T_m \le t, J_{m-1} = i, J_m = j), \qquad Y_{hm}(t) = 1(T_m \ge t > T_{m-1}, J_{m-1} = i),$$

then $N(t) = \{N_h(t) : t \in [0,\tau], h = (i,j) \in E_0\}$ is a multivariate counting process whose components record transitions among adjacent states occurring during the time interval $[0,t]$. Assuming that the process $N(t)$ is defined on a complete probability space $(\Omega, \mathcal{F}, \Pr)$, its compensator $\Lambda(t) = \{\Lambda_h(t) : t \in [0,\tau], h \in E_0\}$, relative to the self-exciting filtration $\{\mathcal{G} \vee \mathcal{F}_t\}_{t \le \tau}$, $\mathcal{G} = \sigma(Z)$, $\mathcal{F}_t = \sigma\{J_0, N_h(s), Y_h(s+) : s \le t, h \in E_0\}$, satisfies

$$\Lambda_h(dt) = E[N_h(dt) \mid \mathcal{G} \vee \mathcal{F}_{t-}] = Y_h(t)\,\alpha_h(t; Z)\,dt.$$
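The counting and at-risk processes are simple functions of an observed path. As a minimal sketch, the following Python function evaluates the number of direct transitions of a given type up to time $t$ and the at-risk indicator at time $t$ for one subject. The argument `end` (the time at which observation of the last visited state stops) and the example path are assumptions introduced for illustration.

```python
def counting_processes(times, states, t, h, end=float("inf")):
    """Return (N_h(t), Y_h(t)) for the adjacent pair h = (i, j).

    times, states : entrance times T_0 < T_1 < ... and visited states J_0, J_1, ...
    end           : time at which observation of the last visited state stops
    """
    i, j = h
    exits = list(times[1:]) + [end]     # exit time from each visited state
    # N_h(t): number of direct i -> j transitions at or before t.
    n = sum(1 for m in range(1, len(times))
            if times[m] <= t and states[m - 1] == i and states[m] == j)
    # Y_h(t): 1 if the subject is in state i just before t, else 0.
    y = sum(1 for m in range(len(times))
            if times[m] < t <= exits[m] and states[m] == i)
    return n, y

# Hypothetical path: disease free (1) -> recurrence (2) at 1.2 -> death (3) at 3.5.
times, states = [0.0, 1.2, 3.5], [1, 2, 3]
print(counting_processes(times, states, 2.0, (1, 2)))   # (1, 0)
print(counting_processes(times, states, 2.0, (2, 3)))   # (0, 1)
```

At $t = 2$ the subject has already made the transition from state 1 to state 2, so the first count equals one, and the subject is at risk only for transitions out of state 2.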

The assumption that the process forms a proportional hazard Markov chain corresponds to the choice

$$\Lambda_h(dt) = Y_h(t)\,e^{\beta^T Z_h}\,\alpha_h(t)\,dt,$$

where $\alpha = [\alpha_h : h \in E_0]$ are unknown baseline hazards, $[Z_h : h \in E_0]$ is a vector of transition specific covariates, and $\beta = [\beta_h : h \in E_0]$ is a conformal vector of regression coefficients. For any pair of adjacent states, $h \in E_0$, the vector $Z_h$ is either equal to the covariate $Z$, or else it represents a function $Z_h = \Phi_h(Z)$ derived from the covariate $Z$.
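One way to read the transition specific covariates is as a block design, in which the covariate is placed in the block of coefficients belonging to a given transition, so that the stacked inner product reduces to the transition's own coefficients applied to $Z$. The sketch below illustrates this reading only; the block construction is an assumption made for this example, one of many possible choices of a function of $Z$, and the pairs, coefficients and baseline value are hypothetical.

```python
import numpy as np

def transition_covariate(z, h, pairs):
    """Stacked covariate for transition h: z in block h, zeros elsewhere."""
    z = np.asarray(z, dtype=float)
    zh = np.zeros(len(pairs) * z.size)
    k = pairs.index(h)
    zh[k * z.size:(k + 1) * z.size] = z
    return zh

def intensity(at_risk, beta, zh, baseline):
    """Proportional hazard intensity: at-risk indicator * exp(beta' z_h) * baseline."""
    return at_risk * np.exp(beta @ zh) * baseline

pairs = [(1, 2), (1, 3), (2, 3)]            # hypothetical adjacent pairs
z = [0.5, -1.0]                             # hypothetical covariate value
beta = np.array([0.2, 0.1,                  # coefficients for 1 -> 2
                 -0.3, 0.4,                 # coefficients for 1 -> 3
                 0.0, 1.0])                 # coefficients for 2 -> 3

zh = transition_covariate(z, (1, 3), pairs)
rate = intensity(at_risk=1, beta=beta, zh=zh, baseline=0.2)
print(zh, rate)
```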

We assume now that the covariate $Z$ can be partitioned into two nonempty blocks, $Z = (Z_0, Z_1)$, such that $Z_0 = (Z_{01},\dots,Z_{0q})$ and $Z_1 = (Z_{11},\dots,Z_{1,d-q})$. We shall use the following regularity conditions.

Condition 2.1

(i) The conditional distribution of $Z_0$ given $(Z_1, J_0)$ has density $g_\theta(z_0 \mid z_1, j_0)$ with respect to a product dominating measure and dependent on a parameter $\theta \in \Theta \subset \mathbb{R}^q$.

(ii) Conditionally on $(Z_0, Z_1, J_0)$, the sequence $(T_m, J_m)_{m \ge 0}$ forms a proportional hazard Markov chain model with parameters $\alpha = [\alpha_h : h \in E_0]$ and $\beta = [\beta_h : h \in E_0]$.

(iii) The parameter $\theta$ is noninformative on $(\alpha, \beta)$.

We denote by $\psi = (\theta, \alpha, \beta)$ the unknown parameters. In Appendix 2, we give a recursive formula for the conditional density of the covariate $Z_0$ given the vector $V$, $V = [N_\cdot(\tau), (T_\ell, J_\ell)_{\ell=0}^{N_\cdot(\tau)}, Z_1]$. We also show that the "marginal" model, obtained by omitting the covariate $Z_0$, forms a non-Markovian counting process.

Here we assume that some components of the vector $Z_0$ may be missing. Let $R = (R_1,\dots,R_q)$ be a binary vector defined by

$$R_j = \begin{cases} 1 & \text{if } Z_{0j} \text{ is observed,} \\ 0 & \text{if } Z_{0j} \text{ is unobserved.} \end{cases}$$
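In data analysis the missing data indicator is typically derived from the way missing values are coded. As a small sketch, assuming missing covariate values are stored as `NaN` in a NumPy array (a coding convention assumed here, with hypothetical values), the indicator and the observed components for each subject can be obtained as follows.

```python
import numpy as np

# Hypothetical covariate block: 3 subjects, q = 3 components, NaN = unobserved.
z0 = np.array([[1.2, np.nan, 0.0],
               [np.nan, np.nan, 3.1],
               [0.4, 2.2, 1.0]])

R = (~np.isnan(z0)).astype(int)        # R[s, j] = 1 if component j is observed
complete_case = R.all(axis=1)          # subjects with no missing component
observed_part = [row[r == 1] for row, r in zip(z0, R)]   # observed components only

print(R)
print(complete_case)
```

Only the third subject would survive case deletion in this example, while the indicator matrix keeps track of the partial information available for the other two.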

Then for a subject whose missing data indicator $R$ is equal to $r = (r_1,\dots,r_q)$, we observe the vector $(V, Z_0(r))$, where $V = [N_\cdot(\tau), (T_\ell, J_\ell)_{\ell=0}^{N_\cdot(\tau)}, Z_1]$ and
