## Hazards Regression

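The two-sample transformation model can be viewed as a special case of the linear transformation model $H(T) = -\beta z + \sigma\varepsilon$, or, after reparameterization, $\sigma H(T) = -\beta z + \varepsilon$. Taking $H(t) = \log\Lambda_0(t)$ and $F_\varepsilon(t) = 1 - \exp(-e^{t})$ results in the '$\sigma$-proportional hazards model' (Hsieh [HSI96c]):

$$\Lambda(t) = \rho\,\{\Lambda_0(t)\}^{\sigma}. \tag{9}$$

When $\rho$ and $\sigma$ are further expressed as $\rho = \exp(\beta^T z)$ and $\sigma = \exp(\gamma^T x)$ for $p$- and $q$-dimensional covariate vectors $z$ and $x$, model (9) evolves into

$$\Lambda(t; z, x) = \{\Lambda_0(t)\}^{\exp(\gamma^T x)} \exp(\beta^T z) \tag{10}$$

in terms of the cumulative hazard; or (when $z$ and $x$ are time-fixed)

$$\lambda(t; z, x) = \lambda_0(t)\,\{\Lambda_0(t)\}^{\exp(\gamma^T x) - 1} \exp(\beta^T z + \gamma^T x) \tag{11}$$

in terms of the hazard function. When $\gamma = 0$, (11) reduces to Cox's proportional hazards (PH) model (Cox [COX72]). The covariates in model (11) can be made time dependent:

$$\lambda(t; z(t), x(t)) = \lambda_0(t)\,\{\Lambda_0(t)\}^{\exp(\gamma^T x(t)) - 1} \exp(\beta^T z(t) + \gamma^T x(t)). \tag{12}$$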

The specific transformation model (12) is termed the heteroscedastic hazards regression model in Hsieh [HSI01], and is hereafter called the Hsieh model. Specific models corresponding to different transformations are listed in Hsieh [HSI95, page 741].
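As a minimal numerical sketch of the heteroscedastic hazards regression model with time-fixed scalar covariates, the following Python code evaluates the cumulative hazard $\{\Lambda_0(t)\}^{\exp(\gamma x)}\exp(\beta z)$ and the corresponding hazard. The Weibull-type baseline $\Lambda_0(t) = t^{1.5}$ and all function names are illustrative assumptions, not part of Hsieh [HSI01].

```python
import math

def baseline_cum_hazard(t, shape=1.5):
    """Assumed baseline cumulative hazard Lambda_0(t) = t**shape (illustrative)."""
    return t ** shape

def baseline_hazard(t, shape=1.5):
    """Derivative of the assumed baseline: lambda_0(t) = shape * t**(shape - 1)."""
    return shape * t ** (shape - 1.0)

def hsieh_cum_hazard(t, z, x, beta, gamma):
    """Cumulative hazard: Lambda_0(t)**exp(gamma*x) * exp(beta*z)."""
    return baseline_cum_hazard(t) ** math.exp(gamma * x) * math.exp(beta * z)

def hsieh_hazard(t, z, x, beta, gamma):
    """Hazard: lambda_0(t) * Lambda_0(t)**(exp(gamma*x) - 1) * exp(beta*z + gamma*x)."""
    sigma = math.exp(gamma * x)
    return (baseline_hazard(t) * baseline_cum_hazard(t) ** (sigma - 1.0)
            * math.exp(beta * z + gamma * x))

def hazard_ratio(t, z1, z0, beta, gamma):
    """Hazard ratio between covariate values z1 and z0 when x = z."""
    return hsieh_hazard(t, z1, z1, beta, gamma) / hsieh_hazard(t, z0, z0, beta, gamma)

if __name__ == "__main__":
    times = (0.5, 1.0, 2.0)
    # With gamma = 0 the ratio equals exp(beta) at every t (proportional hazards).
    print([round(hazard_ratio(t, 1.0, 0.0, beta=1.0, gamma=0.0), 6) for t in times])
    # With gamma != 0 the ratio changes with t.
    print([round(hazard_ratio(t, 1.0, 0.0, beta=1.0, gamma=0.3), 6) for t in times])
```

Setting `gamma=0.0` gives a hazard ratio that does not depend on `t`, which is the PH special case; any nonzero `gamma` makes the ratio vary with the baseline cumulative hazard.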

The heterogeneity property investigated in this paper can be explored through model (11) with $x = z$. Taking the one-dimensional case as an example, the log-relative risk, $\log RR(t)$, between strata $z_j$ and $z_i$ (with $z_j - z_i = 1$) is

$$\log\{RR(t)\} = (e^{\gamma z_j} - e^{\gamma z_i})\log\Lambda_0(t) + (\beta + \gamma). \tag{13}$$
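The log-relative risk (13) can be recovered by taking logarithms of the hazard with $x = z$ and a scalar covariate. The following is a sketch of that step, assuming the hazard has the form $\lambda(t; z) = \lambda_0(t)\{\Lambda_0(t)\}^{e^{\gamma z} - 1} e^{(\beta + \gamma) z}$:

```latex
\begin{align*}
\log\lambda(t; z) &= \log\lambda_0(t) + (e^{\gamma z} - 1)\log\Lambda_0(t) + (\beta + \gamma) z, \\
\log RR(t) &= \log\lambda(t; z_j) - \log\lambda(t; z_i) \\
           &= (e^{\gamma z_j} - e^{\gamma z_i})\log\Lambda_0(t) + (\beta + \gamma)(z_j - z_i).
\end{align*}
```

With $z_j - z_i = 1$ the last term becomes $\beta + \gamma$. The baseline hazard $\lambda_0(t)$ cancels, but $\log\Lambda_0(t)$ does not unless $\gamma = 0$.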

This one-dimensional case illustrates the usual interpretation in a multiple regression setting: the coefficient of $Z$ corresponds to the change in log-hazard per unit change in $Z$ while the other covariates remain fixed. If $\log RR(t)$ is the 'effect' of concern, (13) implies that the effect is not only time-dependent but also depends on the value of $z$. We refer to the time-dependence as nonconstancy and to the dependence on the $z$-value as heterogeneity, in the same sense as in the logistic regression example introduced in Section 1. The coexistence of nonconstancy and heterogeneity can be viewed as an 'interaction' between the heteroscedasticity component and the underlying hazard.

Figure 1 gives examples of (13), in which the nonconstancy and heterogeneity properties of the Hsieh model are explored through plots of $\log RR(t)$ for $z_j = -5, -4, \ldots, 5$ and $\gamma = 0.1$ (Fig. 1(a)), $0.2$ (Fig. 1(b)), $0.3$ (Fig. 1(c)), and $0.5$ (Fig. 1(d)); $\beta = 1$ in all cases. In each panel, only $z = 5$ is plotted with a solid line, to show the trend of the $\log RR$ plot in $z$. When the heteroscedasticity parameter is small ($\gamma = 0.1$, Fig. 1(a)), the log-relative risks are nearly constant in time and nearly coincide over much of $t \in (0, 2)$. For larger $\gamma$, the time-dependence of $\log RR(t)$ is clearer; moreover, for a fixed $t$, the 'effect' differs across values of $z$, which reveals greater heterogeneity. The selection of the covariate vectors $x$ and $z$ is quite flexible: they can share a subset of variables (Wu, Hsieh, and Chen [WHC02]).
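The behaviour described above can be explored numerically. The sketch below evaluates (13) on a small grid, assuming a unit-exponential baseline $\Lambda_0(t) = t$. The baseline used for Figure 1 is not stated here, so this choice and the names `log_rr` and `spread_over_z` are illustrative assumptions.

```python
import math

def log_rr(t, z_j, beta, gamma, cum_hazard=lambda t: t):
    """Log-relative risk (13) between strata z_j and z_i = z_j - 1.

    cum_hazard is the assumed baseline Lambda_0; the default Lambda_0(t) = t
    is an illustrative choice, not taken from the source.
    """
    z_i = z_j - 1
    slope = math.exp(gamma * z_j) - math.exp(gamma * z_i)
    return slope * math.log(cum_hazard(t)) + (beta + gamma)

def spread_over_z(t, beta, gamma, z_values=range(-5, 6)):
    """Range of log RR(t) across strata at a fixed t: a crude heterogeneity measure."""
    values = [log_rr(t, z, beta, gamma) for z in z_values]
    return max(values) - min(values)

if __name__ == "__main__":
    for gamma in (0.1, 0.2, 0.3, 0.5):
        row = [round(log_rr(t, 5, beta=1.0, gamma=gamma), 3) for t in (0.5, 1.0, 2.0)]
        spread = spread_over_z(2.0, beta=1.0, gamma=gamma)
        print(f"gamma={gamma}: logRR at z_j=5, t=0.5/1/2 -> {row}; spread over z at t=2: {spread:.3f}")
```

Under this assumed baseline every curve passes through $\beta + \gamma$ at $t = 1$, where $\log\Lambda_0(t) = 0$; the spread across $z$ at any other fixed $t$ grows with $\gamma$.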

The estimation procedure proposed in Hsieh [HSI01] starts with the construction of estimating equations. Let $(T_i, \delta_i, Z_i, X_i)$, $i = 1, \ldots, n$, be independent samples of failure time, censoring indicator, and covariate vectors, where (without loss of generality) $T_1 < \cdots < T_n$ are the failure or right-censored times.

Fig. 1. Heterogeneous log-relative risks in the Hsieh model. Panels: (a) $\gamma = 0.1$; (b) $\gamma = 0.2$; (c) $\gamma = 0.3$; (d) $\gamma = 0.5$.

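We denote by $N_i(t) = 1\{T_i \le t, \delta_i = 1\}$ and $Y_i(t)$ the counting process and at-risk indicator of individual $i$. Further let $V_i(t) = X_i(t)\{1 + e^{\gamma^T X_i(t)}\log\Lambda_0(t)\}$, $\theta = (\beta^T, \gamma^T)^T$, and

$$S_K(t; \Lambda_0, \theta) = \sum_{i=1}^{n} Y_i(t)\,K_i(t)\,\{\Lambda_0(t)\}^{\exp(\gamma^T X_i(t)) - 1}\, e^{\beta^T Z_i(t) + \gamma^T X_i(t)},$$

where $K_i(t) = 1$, $Z_i(t)$, or $V_i(t)$. The estimating equation processes constructed in Hsieh [HSI01] are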

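$$M_K(t) = \frac{1}{\sqrt{n}} \int_0^t \Big\{ \sum_{i=1}^{n} K_i(u)\,dN_i(u) - S_K(u; \Lambda_0, \theta)\,d\Lambda_0(u) \Big\},$$

where $t \in (0, u_J)$ for a maximal truncation time $u_J$ (defined below), and $M_1$, $M_2$, $M_3$ correspond to $K_i = 1$, $Z_i$, $V_i$, respectively. Setting $M_1(t) = 0$ leads to

$$\hat\Lambda_0(t) = \int_0^t \frac{\sum_{i=1}^{n} dN_i(u)}{S_1(u; \Lambda_0, \theta)}.$$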

Define the elements of the 3 x 3 matrix of covariation process A as

Under several regularity conditions, Hsieh [HSI01] establishes the following properties of $(M_1, M_2, M_3)^T(t)$:

(I) The process $M_1$ is orthogonal to $M_2$ and $M_3$ in the sense that the covariation processes satisfy $\langle M_1, M_2 \rangle_t = \langle M_1, M_3 \rangle_t = 0$.

(II) The system of martingales $M(t) = (M_1, M_2, M_3)^T(t)$ converges weakly to $W(t) = (W_1, W_2, W_3)^T(t)$, a system of Gaussian processes with independent increments. The covariance components of $W(t)$ are $A_{ik}(t)$, $i, k = 1, 2, 3$. Moreover, by (I), $\langle W_1, W_2 \rangle_t = \langle W_1, W_3 \rangle_t = 0$.

Conventional notation for the counting process model can be found in Andersen et al. [ABGK93].

The limiting process $W(t)$ leads to the construction of an approximate likelihood. Let $G(t)$ be a stochastic process and $\Delta^{(J)} G(u) = (G(u_1) - G(u_0), G(u_2) - G(u_1), \ldots, G(u_J) - G(u_{J-1}))^T$, where $G(u_0) = G(0) = 0$ and $G(u_J) = G(t_n)$. Also denote $\Delta_j G = G(u_j) - G(u_{j-1})$. By choosing suitable cutoff points $u = (u_1, u_2, \ldots, u_J)^T$, the independent-increment property of $W(t)$ yields the approximate log-likelihood

$$L^{(J)} = -\frac{1}{2} \sum_{j=1}^{J} (\Delta_j M)^T (\Delta_j A)^{-1} (\Delta_j M).$$

Note that, by the orthogonality of $W_1$ and $(W_2, W_3)^T$, $(\Delta_j A)^{-1} = \mathrm{diag}\big((\Delta_j A_{11})^{-1}, (\Delta_j A_{(11)})^{-1}\big)$, where $A_{(11)}$ is the submatrix of $A$ obtained by deleting the first row and the first column. One can make statistical inference for $\theta$ and $\Lambda_0$ based on the approximate likelihood. Before doing so, Hsieh [HSI01] also introduces an approximation to $\Lambda_0(t)$ with piecewise-constant hazard:

$$\Lambda_0^{(J)}(t) = \int_0^t \sum_{j=1}^{J} \alpha_j\, 1\{u_{j-1} < u \le u_j\}\,du,$$

where $0 < \alpha_j < \infty$ and $J = O(n^{1/3})$. The score functions of the parameters $\theta$ and $\alpha = (\alpha_1, \ldots, \alpha_J)^T$ are $S_\beta = \partial L^{(J)}/\partial\beta$, $S_\gamma = \partial L^{(J)}/\partial\gamma$, and $S_\alpha = \partial L^{(J)}/\partial\alpha$. The estimated parameters of interest have the asymptotics:

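$$\sqrt{n}\,(\hat\theta - \theta) \overset{D}{\longrightarrow} N\big(0, \{\lim[\Sigma_{\theta\theta} - \Sigma_{\theta\alpha}\Sigma_{\alpha\alpha}^{-1}\Sigma_{\alpha\theta}]\}^{-1}\big), \tag{18}$$

where the $\Sigma_{(\cdot)}$'s are the information matrices pertaining to the associated parameters. In addition, $\hat\Lambda_0^{(J)}$ can have a $\sqrt{n}$-weak convergence. If the heteroscedasticity part, $\gamma^T x$, is neglected, the estimate of $\beta$ will be biased (Wu [WU04a]).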
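To make the sieve approximation and the quadratic form concrete, the sketch below evaluates a cumulative hazard with piecewise-constant hazard on cutoff points $0 = u_0 < u_1 < \cdots < u_J$, and the block-diagonal quadratic $-\frac{1}{2}\sum_j (\Delta_j M)^T(\Delta_j A)^{-1}(\Delta_j M)$ for the scalar case $p = q = 1$. All function names and toy inputs are hypothetical, and the increments $\Delta_j M$ and $\Delta_j A$ are taken as given rather than computed from data.

```python
def piecewise_cum_hazard(t, cutoffs, alphas):
    """Integral over (0, t] of sum_j alpha_j * 1{u_{j-1} < u <= u_j}, with u_0 = 0."""
    total, left = 0.0, 0.0
    for right, alpha in zip(cutoffs, alphas):
        if t <= left:
            break
        total += alpha * (min(t, right) - left)
        left = right
    return total

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def approx_loglik(dM, dA11, dA_sub):
    """-1/2 * sum_j [dM1_j**2 / dA11_j + (dM2_j, dM3_j) inv(dA_sub_j) (dM2_j, dM3_j)^T].

    dM:     list of (dM1, dM2, dM3) increments, one triple per interval
    dA11:   list of scalar increments of A_11
    dA_sub: list of 2x2 increments of the submatrix A_(11)
    """
    total = 0.0
    for (m1, m2, m3), a11, sub in zip(dM, dA11, dA_sub):
        (i11, i12), (i21, i22) = inv2(sub)
        quad = m2 * (i11 * m2 + i12 * m3) + m3 * (i21 * m2 + i22 * m3)
        total += m1 * m1 / a11 + quad
    return -0.5 * total

if __name__ == "__main__":
    # Hazard 0.5 on (0, 1], 1.0 on (1, 2], 2.0 on (2, 3].
    print(piecewise_cum_hazard(2.5, [1.0, 2.0, 3.0], [0.5, 1.0, 2.0]))
    # One interval with identity covariance increments.
    print(approx_loglik([(1.0, 2.0, 3.0)], [1.0], [((1.0, 0.0), (0.0, 1.0))]))
```

The block-diagonal form means the $M_1$ term and the $(M_2, M_3)$ term can be accumulated separately, which is what the decomposition of the quadratic exploits.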

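The quadratic $L^{(J)}$ (omitting the factor '$-\frac{1}{2}$') can be further decomposed as $L^{(J)} = L_1^{(J)} + L_2^{(J)}$, where

$$L_1^{(J)} = \sum_{j=1}^{J} \frac{(\Delta_j M_1)^2}{\Delta_j A_{11}}, \qquad L_2^{(J)} = \sum_{j=1}^{J} (\Delta_j M_2, \Delta_j M_3)(\Delta_j A_{(11)})^{-1}(\Delta_j M_2, \Delta_j M_3)^T.$$

The part $L_2^{(J)}$ contains much of the information on the parameters of interest $\beta$ and $\gamma$, and can be viewed as a projection of $L^{(J)}$ onto the space of $(\beta^T z, \gamma^T x)$. Similar to the expression in Section 2.2, let $L_2^{(J)} = Q_g + Q_l$, where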

$$Q_g = \sum_{j=1}^{J} (\Delta_j \hat M_2, \Delta_j \hat M_3)(\Delta_j A_{(11)})^{-1}(\Delta_j \hat M_2, \Delta_j \hat M_3)^T \sim \chi^2_{(p+q)(J-1)}$$

can be used as a test statistic for global model validity (i.e. validity of the Hsieh model), and

$$Q_l = \sum_{j=1}^{J} \{\Delta_j(M_2 - \hat M_2, M_3 - \hat M_3)\}(\Delta_j A_{(11)})^{-1}\{\Delta_j(M_2 - \hat M_2, M_3 - \hat M_3)\}^T \sim \chi^2_{p+q}$$

is used to test a local hypothesis $H_0: \theta = \theta_0$. For example, the proportional hazards (PH) assumption can be checked by testing the PH model ($H_0: \gamma = 0$) as a nested submodel of the Hsieh model ($H_a: \gamma \ne 0$) (Wu et al. [WHC02]); or the equal-distribution null hypothesis ($H_0: \beta = \gamma = 0$) can be tested within the Hsieh model ($H_a: \beta \ne 0$ or $\gamma \ne 0$) (Wu [WU04b]).
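As a check on the degrees of freedom, and assuming the Gaussian independent-increment approximation holds at the true parameter values, each of the $J$ increments of the $(p+q)$-dimensional vector $(M_2, M_3)$ contributes a quadratic form with $p + q$ degrees of freedom. Under that assumption the split between the global and local statistics is consistent with:

```latex
\[
\underbrace{(p+q)\,J}_{\text{quadratic form in } (M_2,\, M_3)}
  \;=\; \underbrace{(p+q)(J-1)}_{Q_g} \;+\; \underbrace{(p+q)}_{Q_l}.
\]
```

For instance, with $p = q = 1$ and $J = 5$ intervals, $Q_g$ would be referred to a $\chi^2$ distribution with 8 degrees of freedom and $Q_l$ to one with 2.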