## Linear Stochastic BDIM and Its Applications

The probability that a family of size i grows to size n before going extinct can be computed using the known formulas for the probability that a birth-and-death process reaches state n before reaching state 0.

For the linear 2nd order balanced BDIM, the probability that a singleton expands to a family of size $n$ before dying, $P(1,n)$, is

$$P(1,n) \approx \frac{\gamma\,\Gamma(1+b)}{\Gamma(1+a)}\, n^{-\gamma},$$

where $\gamma = 1 + b - a$, with the same power $\gamma$ as the equilibrium frequencies of the families.37 The values of the probabilities $P(1,n)$ for different species are shown in Figure 2. These probabilities are rather small.
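The classical hitting-probability result used here can be sketched numerically. For a birth-and-death process with birth rates $\lambda_k$ and death rates $\delta_k$, the probability of reaching $n$ before 0 from state $i$ is $\sum_{k=0}^{i-1}\rho_k / \sum_{k=0}^{n-1}\rho_k$, where $\rho_0 = 1$ and $\rho_k = \prod_{j=1}^{k}\delta_j/\lambda_j$. The snippet below assumes linear rates $\lambda_k = \lambda(k+a)$ and $\delta_k = \lambda(k+b)$ for the balanced case. The function name and the values a = 1, b = 2 are illustrative only and are not parameters fitted in the chapter.

```python
def hit_probability(i, n, a, b):
    """P(reach n before 0 | start at i), with 1 <= i <= n.

    Assumed rates (hypothetical parametrization of the balanced linear model):
    birth_k = lam*(k + a), death_k = lam*(k + b). The common factor lam
    cancels in every ratio death_k / birth_k, so it is not an argument.
    """
    rho = 1.0      # rho_0 = 1
    total = 0.0    # running sum rho_0 + ... + rho_k
    head = 0.0     # rho_0 + ... + rho_{i-1}
    for k in range(n):
        if k > 0:
            rho *= (k + b) / (k + a)   # rho_k = rho_{k-1} * death_k / birth_k
        total += rho
        if k == i - 1:
            head = total
    return head / total


if __name__ == "__main__":
    a, b = 1.0, 2.0                    # illustrative values only
    gamma = 1.0 + b - a
    for n in (10, 100, 1000):
        p = hit_probability(1, n, a, b)
        # p * n**gamma levels off as n grows, i.e. p decays like n**(-gamma)
        print(n, p, p * n**gamma)
```

With a = b the process is a symmetric walk and the function reduces to the familiar gambler's-ruin value i/n, which is a convenient sanity check.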

The random birth-and-death process (2) certainly visits the state 0 in the course of time; this means that any domain family will eventually go extinct (and can then formally be "reborn", returning from the 0 class). The mean time of extinction of the largest family is an important characteristic of the evolutionary process described by these models. The plot of $E_n$, the mean time of extinction of a family of initial size $n$ for the linear 2nd order balanced BDIM (measured in internal time units), versus $n$ for different species is shown in Figure 3.
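A mean extinction time of this kind can be sketched with a first-passage recursion. If $d_k$ is the mean time to step from state $k$ down to $k-1$, conditioning on the first jump gives $d_k = 1/\delta_k + (\lambda_k/\delta_k)\,d_{k+1}$, and the mean extinction time from $n$ is $d_1 + \dots + d_n$. The code assumes a largest state $N$ from which no birth occurs, and linear rates $\lambda_k = \lambda(k+a)$, $\delta_k = \lambda(k+b)$. The function names and parameter values are hypothetical, not those used for the figure.

```python
def mean_extinction_times(N, birth, death):
    """Return [E_1, ..., E_N], where E_n is the mean time to reach state 0 from n.

    birth(k) and death(k) are the rates in state k. State N is treated as the
    top state, so no birth occurs from it (an assumption of this sketch).
    """
    # d[k] = mean first-passage time k -> k-1, filled from the top state down.
    d = [0.0] * (N + 2)              # d[N+1] = 0 closes the recursion
    for k in range(N, 0, -1):
        up = birth(k) if k < N else 0.0
        d[k] = 1.0 / death(k) + (up / death(k)) * d[k + 1]
    # E_n is the sum of the successive one-step-down passage times.
    times, running = [], 0.0
    for k in range(1, N + 1):
        running += d[k]
        times.append(running)
    return times


def linear_rates(lam, a, b):
    """Hypothetical balanced linear rates: birth_k = lam*(k+a), death_k = lam*(k+b)."""
    return (lambda k: lam * (k + a)), (lambda k: lam * (k + b))


if __name__ == "__main__":
    birth, death = linear_rates(lam=1.0, a=1.0, b=2.0)   # illustrative values
    E = mean_extinction_times(50, birth, death)
    print(E[0], E[-1])   # mean extinction times from n = 1 and n = 50, in 1/lam units
```

Because every rate is proportional to `lam`, doubling `lam` halves every entry of the result, which is why the times are naturally reported in internal units of 1/λ.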

Figure 3. Mean time of extinction ($E_n$) depending on family size ($n$) for the linear BDIM. Time is in $1/\lambda$ units. The model parameters are for D. melanogaster (blue), H. sapiens (red), A. thaliana (green), C. elegans (purple). A color version of this figure is available online at http://www.Eurekah.com.

The formation time of a family of a given size was computed for the version of BDIM (2) that describes the evolution of an essential gene (no 0-state). For the linear BDIM, plots of the mean time of formation $M_{1,n}$ (in internal time units) are shown in Figure 4. The times of formation and extinction of a family of a given size under the stochastic BDIM are random variables, so the question remains how well the mean values represent them. To address this issue, we calculated the variances and coefficients of variation of the extinction and formation times for the linear BDIM. Note that the coefficient of variation does not depend on the model parameter $\lambda$ and is therefore an informative characteristic of the process. We found that both coefficients are very large: for D. melanogaster (n = 335), 194.11 for the extinction time and 81.79 for the formation time; for H. sapiens (n = 1151), 649.8 and 308.4, respectively.
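The independence of the coefficient of variation from the rate parameter follows from a time-rescaling argument, sketched here under the assumption that every birth and death rate of the linear BDIM is proportional to the parameter that sets the internal time unit, written $\lambda$. The symbols $T$ (an extinction or formation time) and $\tau$ (the same time for the process with $\lambda = 1$) are introduced only for this illustration. Since $T$ has the same distribution as $\tau/\lambda$,

$$
\mathrm{E}[T] = \frac{\mathrm{E}[\tau]}{\lambda}, \qquad
\sqrt{\mathrm{Var}[T]} = \frac{\sqrt{\mathrm{Var}[\tau]}}{\lambda}, \qquad
\frac{\sqrt{\mathrm{Var}[T]}}{\mathrm{E}[T]} = \frac{\sqrt{\mathrm{Var}[\tau]}}{\mathrm{E}[\tau]} .
$$

The factor $1/\lambda$ cancels in the ratio, so the coefficient of variation is determined by the remaining model parameters alone.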

Summarizing the results obtained for the stochastic characteristics of the linear BDIM, we found that, firstly, the probability of formation of a large family from a singleton is quite small (~$10^{-6}$ for large genomes), and, secondly, the ratio of the mean times of formation and extinction of the largest families is very large (~(0.5 to 1) × $10^{3}$). Thirdly, the coefficient $c_{du}$ is in the range of 1 to 3 for the linear model and all considered species (e.g., $c_{du}$ = 1.8 for Dme and $c_{du}$ = 2.7 for Hsa). Using the values of this coefficient and the available estimates of gene duplication rates26 to estimate the internal time unit, $1/\lambda$, with formula (8), gives the mean time of formation of the largest families $M_{1,N} \sim 10^{13}$ to $10^{14}$ yrs, which is three to four orders of magnitude greater than the current estimate for the age of the Universe.42 Thus, the mean family formation times given by the linear BDIM would become realistic only if the recent analyses underestimated the gene duplication rate by a factor of ~$10^{4}$, which does not seem plausible. Accordingly, the linear BDIM cannot provide an adequate description of genome evolution, at least when only the mean time of family formation is considered.

As mentioned above, the coefficient of variation of the family formation time is extremely large (~100), so large deviations from the mean time, up to 2 orders of magnitude, are not improbable. At the end of this chapter (see Conclusions and Perspective), this issue is addressed with an alternative approach, namely computer simulations, which exploit the large number of families in evolving genomes and the substantial variance of the times of their formation. First, however, we consider nonlinear, higher order models that have the potential to yield faster evolution, allowing for the formation of the large families observed in complex genomes.
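The orders-of-magnitude comparison in this summary can be checked with a few lines of arithmetic. The age of the Universe is taken as about 1.38 × 10^10 yr, a commonly quoted value that is assumed here rather than taken from the cited reference. The function names are illustrative.

```python
from math import log10

AGE_OF_UNIVERSE_YR = 1.38e10   # assumed reference value, in years


def orders_above(time_yr, reference_yr=AGE_OF_UNIVERSE_YR):
    """Number of decimal orders of magnitude by which time_yr exceeds reference_yr."""
    return log10(time_yr / reference_yr)


def rescaled_time(time_yr, rate_factor):
    """Formation time if the duplication rate were rate_factor times higher.

    Times scale as 1/rate, so a higher rate shortens them proportionally.
    """
    return time_yr / rate_factor


if __name__ == "__main__":
    for t in (1e13, 1e14):
        # excess over the age of the Universe, and the time after a 10^4-fold rate increase
        print(t, orders_above(t), rescaled_time(t, 1e4))
```

A 10^4-fold higher duplication rate brings the quoted formation times down to 10^9 to 10^10 yr, which is the sense in which the text calls the required correction implausibly large.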

Figure 4. Mean time of formation $M_{1,n}$ (in $1/\lambda$ units) depending on family size ($n$) for the linear BDIM (double logarithmic scale). The model parameters are for D. melanogaster (blue), H. sapiens (red), A. thaliana (green), C. elegans (purple). A color version of this figure is available online at http://www.Eurekah.com.
