Simulation of Gene Family Evolution under BDIMs of Different Degrees

In the previous section, we determined the mean time of family formation for BDIMs of different degrees and found that even the shortest mean time, obtained with the optimal model degree, was substantially greater than the time available for genome evolution. However, for assessing the feasibility of the formation of the largest families during the evolution of real genomes, the more relevant value is not the mean but the minimum time of family formation over the entire ensemble of genes. Given the large variance of the family formation time estimates, this minimum value is likely to be much less than the mean. Determining this value analytically is hard, so we resorted to Monte Carlo simulation. Model parameters estimated for human genome evolution were employed for this analysis.
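The gap between the ensemble minimum and the single-family mean can be illustrated with a standard order-statistic identity. This is only a rough sketch under an assumption the text does not make: that the N families have independent, identically distributed formation times T_1, ..., T_N with a common cumulative distribution function F(t). The symbols N, T_i and F are introduced here for illustration.

```latex
\Pr\Bigl(\min_{1 \le i \le N} T_i \le t\Bigr)
  = 1 - \bigl(1 - F(t)\bigr)^{N}
  \approx 1 - e^{-N F(t)} \qquad \text{for } F(t) \ll 1 .
```

Under this simplification, the first large family is likely to have appeared once F(t) is of order 1/N, so only the far left tail of the single-family distribution matters. In the simulated process the families are neither identical nor strictly simultaneous (new families keep arriving through innovation), which is one reason the minimum is estimated by simulation rather than from this formula.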

The simulated evolution started from 3000 families of size one and continued until the largest family reached 1024 members (a convenient number to approximate the size of the largest family in eukaryotic genomes). The time scale was adjusted such that r_du = 2 × 10⁻⁸ duplications/gene/year [26]. A series of simulations was performed for nonlinear rational BDIMs of different degrees.

At each discrete time step, for each family, a birth or death of a domain belonging to the family was simulated by (respectively) increasing or decreasing the family size counter; additionally, a new family of size 1 was created with a probability proportional to the innovation rate ν (the resulting process is analogous to the classical model of Karlin and McGregor [43]). The probabilities of a birth or death for a given family of size k were, respectively, proportional to λ_k and δ_k.
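The update rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the reported results. The function name `simulate_bdim`, its parameters, the removal of families that reach size zero, and the power-law rates in the usage example are all assumptions made here; the rational rate functions of the model are not reproduced.

```python
import random


def simulate_bdim(birth_prob, death_prob, innovation_prob,
                  n_start, target_size, max_steps, seed=0):
    """Discrete-time birth-death-innovation simulation (illustrative sketch).

    birth_prob(k) and death_prob(k) give the per-step probabilities of a
    birth or a death in a family of size k; their sum should not exceed 1.
    Returns the number of steps until the largest family reaches
    target_size, or None if that does not happen within max_steps.
    """
    rng = random.Random(seed)
    sizes = [1] * n_start                 # start from families of size one
    for step in range(1, max_steps + 1):
        survivors = []
        for k in sizes:
            u = rng.random()
            p_birth = birth_prob(k)
            p_death = death_prob(k)
            if u < p_birth:
                k += 1                    # birth: increment the size counter
            elif u < p_birth + p_death:
                k -= 1                    # death: decrement the size counter
            if k > 0:                     # assumption: empty families are dropped
                survivors.append(k)
        if rng.random() < innovation_prob:
            survivors.append(1)           # innovation: new family of size 1
        sizes = survivors
        if sizes and max(sizes) >= target_size:
            return step
    return None


if __name__ == "__main__":
    degree = 2.0                          # hypothetical model degree
    dt = 1e-4                             # hypothetical time-step scaling

    def rate(k):
        # Hypothetical power-law rate, capped so birth + death <= 1.
        return min(0.5, dt * k ** degree)

    steps = simulate_bdim(rate, rate, innovation_prob=0.01,
                          n_start=300, target_size=32, max_steps=20_000)
    print("steps to reach target:", steps)
```

Repeating the run with different seeds gives a sample of first-passage times, from which a median and a mean ± standard deviation can be computed. Converting steps to years requires a time-step scaling tied to the duplication rate, which is left out of this sketch.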

A series of simulations with rational BDIMs of different degrees was run until the largest family reached 1024 members. For the linear BDIM, the median time required to produce the first family of this size was 49.5 Ga and the mean (± standard deviation) was 52.6 ± 21.1 Ga. The quadratic BDIM reached this level much faster, with a median time of 2.52 Ga and a mean of 2.64 ± 0.78 Ga. Not unexpectedly, these values are orders of magnitude smaller than the mean single-family formation times estimated in the previous section.

As shown in Figure 14, the time at which the largest family in a genome reaches 1024 members depends on d in a similar fashion as the mean time for a single family, i.e., there is a clear minimum at a specific value of d. At the optimal value of d ≈ 2.2, the model reaches this family size in 2.2 ± 0.5 Ga, which is compatible with the timescale of evolution of eukaryotes [45]. Compared to the minimal evolution time predicted for a single family, the genome-size ensemble of gene families reached the threshold size much faster (by 1.5-2.5 orders of magnitude), and the optimum value of d was lower by about 0.5 (Fig. 14).
