## Simulation study

We performed simulations to study the type I error rate of the test statistic K and to compare it with those of the statistic developed by Wei and Lachin ([WL84]) and of the Wald test statistic proposed by Wei, Lin and Weissfeld ([WLW89]) in their generalization of the proportional hazards model to multivariate failure time data. We generated 10000 samples as follows. We first created a dummy variable z taking the value 1 for group B and 0 otherwise; the group sizes were either equal (ua = ub = 50) or unequal (ua = 25, ub = 75). The marginal distributions of the failure times were assumed to have the same exponential survival function in both groups. Dependence between the two failure times was governed by a Clayton-Oakes model for strong correlation and by a Gumbel model for slight correlation. In the first case, the joint survival distribution was

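$$\Pr\{T_1 > t_1, T_2 > t_2\} = \left\{\bar F_1(t_1)^{1-a} + \bar F_2(t_2)^{1-a} - 1\right\}^{1/(1-a)},$$

with $a > 1$, where $\bar F_j$ denotes the marginal survival function of $T_j$ and $a \to 1$ leads to independence between the two events. We chose $a = 5$.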
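One way to draw such pairs can be sketched as follows. The text does not say how the failure times were generated, so the gamma-frailty construction below is an assumption, as are the function names, the `rate` parameter of the exponential margins, and the use of Python's standard `random` module.

```python
import math
import random


def clayton_oakes_survival(t1, t2, a, rate):
    """Joint survival {S(t1)^(1-a) + S(t2)^(1-a) - 1}^(1/(1-a)), S(t) = exp(-rate*t)."""
    s1 = math.exp(-rate * t1)
    s2 = math.exp(-rate * t2)
    return (s1 ** (1.0 - a) + s2 ** (1.0 - a) - 1.0) ** (1.0 / (1.0 - a))


def clayton_oakes_pair(a, rate, rng):
    """Draw (T1, T2) with exponential margins and Clayton-Oakes dependence.

    Assumed sampling method (gamma frailty): with theta = a - 1 > 0, draw
    V ~ Gamma(1/theta, 1) and independent E1, E2 ~ Exp(1). Then
    U_j = (1 + E_j / V) ** (-1 / theta) are uniform on (0, 1) and play the
    role of the marginal survival probabilities S(T_j).
    """
    theta = a - 1.0
    v = rng.gammavariate(1.0 / theta, 1.0)
    times = []
    for _ in range(2):
        e = rng.expovariate(1.0)
        u = (1.0 + e / v) ** (-1.0 / theta)  # survival probability S(T_j)
        times.append(-math.log(u) / rate)    # invert exp(-rate * t) = u
    return times[0], times[1]


if __name__ == "__main__":
    rng = random.Random(2024)
    pairs = [clayton_oakes_pair(5.0, 1.0, rng) for _ in range(50000)]
    empirical = sum(1 for t1, t2 in pairs if t1 > 0.5 and t2 > 0.5) / len(pairs)
    print("empirical:", empirical)
    print("formula:  ", clayton_oakes_survival(0.5, 0.5, 5.0, 1.0))
```

For this family Kendall's tau equals $(a-1)/(a+1)$, that is $2/3$ for $a = 5$, which is consistent with the description of a strong positive dependence.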

In the Gumbel model, the joint survival distribution was

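$$\Pr\{T_1 > t_1, T_2 > t_2\} = \bar F_1(t_1)\,\bar F_2(t_2)\left\{1 + a\,F_1(t_1)\,F_2(t_2)\right\},$$

for $a \in [-1, 1]$, where $\bar F_j$ is the marginal survival function of $T_j$ and $F_j = 1 - \bar F_j$ its distribution function; we let $a = -1$.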
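A pair from the Gumbel model can be drawn by conditional inversion, as in the minimal sketch below. The sampling method is an assumption (the text does not describe it), and the function names and the `rate` parameter are hypothetical. Writing $u = \bar F_1(T_1)$ and $v = \bar F_2(T_2)$, the conditional distribution of $v$ given $u$ is $v\{1 + b(1 - v)\}$ with $b = a(1 - 2u)$, which is a quadratic in $v$ and can be inverted in closed form.

```python
import math
import random


def gumbel_survival(t1, t2, a, rate):
    """Joint survival S(t1) S(t2) {1 + a F(t1) F(t2)}, with S(t) = exp(-rate*t)."""
    s1 = math.exp(-rate * t1)
    s2 = math.exp(-rate * t2)
    return s1 * s2 * (1.0 + a * (1.0 - s1) * (1.0 - s2))


def gumbel_pair(a, rate, rng):
    """Draw (T1, T2) with exponential margins and Gumbel-type dependence."""
    u = 1.0 - rng.random()  # in (0, 1]: survival probability of T1
    w = 1.0 - rng.random()  # in (0, 1]: level of the conditional distribution
    b = a * (1.0 - 2.0 * u)
    # Solve v * (1 + b * (1 - v)) = w for v in (0, 1]; this form of the
    # quadratic root stays stable when b is close to 0.
    disc = max((1.0 + b) ** 2 - 4.0 * b * w, 0.0)
    v = 2.0 * w / (1.0 + b + math.sqrt(disc))
    return -math.log(u) / rate, -math.log(v) / rate


if __name__ == "__main__":
    rng = random.Random(2024)
    pairs = [gumbel_pair(-1.0, 1.0, rng) for _ in range(50000)]
    empirical = sum(1 for t1, t2 in pairs if t1 > 0.5 and t2 > 0.5) / len(pairs)
    print("empirical:", empirical)
    print("formula:  ", gumbel_survival(0.5, 0.5, -1.0, 1.0))
```

For this family Kendall's tau equals $2a/9$, so $a = -1$ gives about $-0.22$, in line with a slight negative dependence.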

We then generated a single censoring time for both events, that is $C_1 = C_2$, from a uniform distribution on $[0, \tau]$, with $\tau$ chosen to obtain the desired censoring proportion.
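The upper bound of the uniform censoring distribution (called `tau` in the code) can be found numerically, as in the sketch below. It assumes that the censoring proportion refers to the marginal probability that one event is censored, which the text does not state explicitly, and that the margins are exponential with a hypothetical parameter `rate`. Under these assumptions the censoring probability is $\{1 - \exp(-\lambda\tau)\}/(\lambda\tau)$ with $\lambda$ the rate, and it decreases in $\tau$, so bisection applies.

```python
import math


def censoring_probability(tau, rate):
    """P(C < T) for T ~ Exp(rate) and C ~ Uniform[0, tau]."""
    x = rate * tau
    return -math.expm1(-x) / x


def solve_tau(target, rate, tol=1e-10):
    """Bisection for tau such that censoring_probability(tau, rate) = target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    lo, hi = 1e-12, 1.0
    # The probability decreases in tau: enlarge hi until it drops below target.
    while censoring_probability(hi, rate) > target:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if censoring_probability(mid, rate) > target:
            lo = mid  # still too much censoring: tau must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for target in (0.25, 0.50, 0.75):
        tau = solve_tau(target, rate=1.0)
        print(target, tau, censoring_probability(tau, 1.0))
```

The setting without censoring corresponds to skipping the censoring step altogether, since no finite bound gives a censoring probability of zero.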

For a weak dependence between events (Table 1), the sum of the marginal logrank statistics has type I error rates close to 5%. Wei and Lachin's test generally shows the highest type I error rate, especially with unequal group sizes; the Wald test proposed by Wei, Lin and Weissfeld and the proposed test (9) give similar results for equal group sizes, but the test statistic (9) is more robust to imbalance between group sizes.

For a strong dependence between events (Table 2), the sum of the marginal logrank statistics is no longer suitable, as expected. With equal group sizes, the type I error rate of Wei and Lachin's test is higher than those of Wei, Lin and Weissfeld's test and of the proposed test, which give similar results, with type I error rates close to 5%. With unequal group sizes, Wei and Lachin's test is generally the one whose type I error rate is furthest from 5%; the type I error rates of Wei, Lin and Weissfeld's test and of test (9) are similar under heavy censoring, but the proposed test (9) is more appropriate with little or no censoring.
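When reading the tables, it may help to keep the Monte Carlo error of the estimated rates in mind. The sketch below assumes that each table entry is based on 10000 independent samples and uses a normal approximation; the function name is hypothetical. Under a true rate of 5%, the standard error is about 0.22 percentage points.

```python
import math


def monte_carlo_band(nominal=0.05, n_rep=10000, z=1.96):
    """Approximate 95% range of an estimated rejection rate whose true value is `nominal`."""
    se = math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - z * se, nominal + z * se


if __name__ == "__main__":
    low, high = monte_carlo_band()
    print("band (in %):", round(100 * low, 2), "to", round(100 * high, 2))
    # Two entries of Table 2 (no censoring, equal group sizes).
    for label, rate_percent in [("K", 5.04), ("LR", 8.02)]:
        inside = low <= rate_percent / 100.0 <= high
        print(label, rate_percent, "within band" if inside else "outside band")
```

Entries inside this range are compatible with the nominal level up to simulation noise; entries well outside it point to a genuine departure from 5%.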

Table 1. Type I error rate (in %) for the two-group comparison at nominal level 5% in a Gumbel model with slightly negative dependence; the sample is composed of two groups A and B with either equal or unequal sizes. LR: sum of the marginal logrank test statistics; WL: Wei and Lachin's test statistic; WLW: Wei, Lin and Weissfeld's Wald test statistic; K: the proposed test statistic (9).

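| Censoring | Equal sizes: LR | Equal sizes: WL | Equal sizes: WLW | Equal sizes: K | Unequal sizes: LR | Unequal sizes: WL | Unequal sizes: WLW | Unequal sizes: K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No censoring | 5.61 | 6.27 | 5.82 | 5.94 | 6.09 | 7.51 | 7.63 | 6.40 |
| 25% censoring | 5.53 | 6.16 | 5.74 | 5.69 | 5.60 | 7.20 | 6.94 | 5.78 |
| 50% censoring | 5.41 | 5.88 | 5.61 | 5.70 | 5.47 | 7.37 | 6.79 | 5.80 |
| 75% censoring | 5.33 | 5.40 | 5.03 | 5.11 | 5.75 | 7.19 | 6.06 | 5.65 |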

Table 2. Type I error rate (in %) for the two-group comparison at nominal level 5% in a Clayton-Oakes model with strong positive dependence; the sample is composed of two groups A and B with either equal or unequal sizes. LR: sum of the marginal logrank test statistics; WL: Wei and Lachin's test statistic; WLW: Wei, Lin and Weissfeld's Wald test statistic; K: the proposed test statistic (9).

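| Censoring | Equal sizes: LR | Equal sizes: WL | Equal sizes: WLW | Equal sizes: K | Unequal sizes: LR | Unequal sizes: WL | Unequal sizes: WLW | Unequal sizes: K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No censoring | 8.02 | 5.69 | 4.94 | 5.04 | 8.53 | 6.19 | 6.28 | 5.83 |
| 25% censoring | 7.22 | 4.77 | 4.52 | 4.52 | 8.16 | 5.89 | 5.78 | 5.35 |
| 50% censoring | 7.01 | 5.35 | 5.11 | 5.10 | 7.60 | 6.30 | 5.49 | 5.57 |
| 75% censoring | 6.50 | 4.81 | 4.38 | 4.78 | 6.62 | 7.08 | 4.97 | 5.40 |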