Figure 7. Distribution of the number of promoters versus the number of different TF binding sites in Saccharomyces cerevisiae. This is a log-log plot of data taken directly from Lee et al.36 The red line represents a least-squares fit to a power law. The resulting power-law exponent is 2.44, as indicated. A color version of this figure is available online at

the cell, because while other information is fed into this network by signaling from outside the cell, by coupling to the metabolic networks and by other proteins interacting with the transcription complex, the transcription factor network provides the sequence recognition mechanisms that process all external information. The manner in which binding data imply a network is illustrated in Figure 8A. If we restrict our attention to the transcription factors that form a single connected graph (no isolated nodes, and no pieces of two or three factors unconnected to the net), we get a better view of the interlinking of the functional categories. This connected graph is shown in the figure. The directed graph representing this network depicts the information derived from the experiments well, but it does not represent the network in detail. For example, it does not show whether a given factor represses or activates a gene when it binds a given region, or how strong such effects are. Nonetheless, the basic structure of the graph, sometimes referred to as its "topological" structure, is well represented. Bear in mind that the graph in the figure is inherently a directed graph, because the factors bind the regulatory regions of one another's genes, giving a direction to each link.
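The step from binding data to a connected core graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the binding pairs, the factor names (`TF_A` and so on) and the helper names `build_tf_graph` and `largest_connected_piece` are hypothetical, and "connected" is taken to mean connected when link direction is ignored.

```python
from collections import defaultdict

def build_tf_graph(bindings, tf_names):
    """Keep only directed links factor -> gene where the bound gene is itself a TF."""
    tfs = set(tf_names)
    return {(f, g) for f, g in bindings if f in tfs and g in tfs}

def largest_connected_piece(edges):
    """Nodes of the largest piece of the graph, ignoring link direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, best = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:  # depth-first walk over undirected neighbors
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical binding data: (factor, gene whose regulatory region it binds).
bindings = [
    ("TF_A", "TF_B"), ("TF_B", "TF_C"), ("TF_C", "TF_A"),
    ("TF_D", "TF_E"),    # a piece of two factors unconnected to the net
    ("TF_A", "GENE_X"),  # non-TF target, not part of the TF-only graph
]
tf_names = ["TF_A", "TF_B", "TF_C", "TF_D", "TF_E", "TF_F"]  # TF_F is isolated

edges = build_tf_graph(bindings, tf_names)
core = largest_connected_piece(edges)
core_edges = {(s, d) for s, d in edges if s in core and d in core}
```

Each link keeps its direction (from the binding factor to the gene it binds), while connectivity is decided without regard to direction, which is one reasonable reading of "a single connected graph".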

Since this network includes a large number of transcription factors and is the most extensive such network known to date, we will examine it further, recognizing that it is incomplete in several different ways. The most important shortcoming is that genetic regulatory interactions can occur in ways other than by transcription factor binding to cis-acting regulatory regions. The processing and translation of transcripts are often regulated; the binding of the TF proteins themselves may be regulated by modifications; and protein cofactors that bind transcription factors, but do not bind DNA sites themselves, also play important roles in regulation. The TF network by itself is only a part of the active regulatory network, albeit an important and well-defined part.

This network illustrates the strong degree of interlinkage of different functional categories of TFs. While the Environmental Response category (green nodes) has some significant links within the category (see the tight coupling of Yap6, Rox1 and Cin5, for example), the factors in this category are also directly linked to all other categories except the Development Processes one. The Cell Cycle category is coupled in multiple ways to all of the others. The Yap6 couplings just mentioned appear to be the only examples of mutual linkage between pairs of factor genes in this network. Each factor in such a pair (Yap6-Cin5 and Yap6-Rox1) binds the regulatory region of the other's gene, thereby creating a direct feedback loop of some kind. There are also several instances of self-regulation (Rap1, Rcs1, Nrg1, Yap6, Smp1 and Swi4). A glance at the graph in the figure reveals a major feature seen in most biological networks studied to date: the existence of a number of highly linked "hubs". For example, Sfl1, Abf1, Swi4, Swi5, Fkh2, Phd1 and Rap1 all have 5 or more links to other factors. The existence of hubs reflects a more general mathematical property of many large networks, as described in a previous section.
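The three structural features just described (self-regulation, mutual linkage and hubs) can each be read off a directed edge list. The sketch below uses only the links named in the text, so it is a small fragment and not the full network. The function names are hypothetical, and counting a "link" as a distinct partner factor in either direction is an assumption, since the text does not say how links are counted.

```python
def self_regulators(edges):
    """Factors that bind the regulatory region of their own gene."""
    return sorted({s for s, d in edges if s == d})

def mutual_pairs(edges):
    """Pairs of factors in which each binds the other's gene."""
    edge_set = set(edges)
    return sorted({tuple(sorted((s, d)))
                   for s, d in edge_set
                   if s != d and (d, s) in edge_set})

def hubs(edges, min_links=5):
    """Factors linked to at least min_links distinct other factors.

    Assumption: links are counted as distinct partners, direction ignored.
    """
    partners = {}
    for s, d in edges:
        if s != d:
            partners.setdefault(s, set()).add(d)
            partners.setdefault(d, set()).add(s)
    return sorted(n for n, p in partners.items() if len(p) >= min_links)

# Only the links explicitly named in the text (a fragment of the network).
edges = [
    ("Yap6", "Cin5"), ("Cin5", "Yap6"),   # mutual pair
    ("Yap6", "Rox1"), ("Rox1", "Yap6"),   # mutual pair
    ("Rap1", "Rap1"), ("Rcs1", "Rcs1"), ("Nrg1", "Nrg1"),
    ("Yap6", "Yap6"), ("Smp1", "Smp1"), ("Swi4", "Swi4"),  # self-regulation
]

print(self_regulators(edges))
print(mutual_pairs(edges))
print(hubs(edges, min_links=2))  # threshold lowered for this tiny fragment
```

With the full edge list and the default threshold of 5, `hubs` would be expected to return the highly linked factors listed above, provided links are counted the same way as in the figure.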

If we examine the distribution of connections of the TF network, that is, the number of nodes with k connections as a function of k, we must consider both the "in" and the "out" connections of these nodes. These form two distinct distributions, albeit with similar exponents. If we compare these distributions with the overall "in" distribution for the whole network (the non-transcription-factor genes clearly have only "in" connections), we see that they are also power laws, but with a different exponent (Fig. 8B). A smaller exponent indicates more highly connected nodes in the network than a larger exponent does, because the distribution then falls off more slowly with k. One qualitative conclusion, then, is simply that the transcription factor network (the "core" network, as we are calling it) is more highly connected, in this statistical sense, than the other, peripheral genes that it regulates. The conclusion has some intuitive appeal: the regulatory circuit, the computer, is more highly connected than the downstream output network of linkages to the "effectors" (Fig. 8C). It is also interesting that the "in" and "out" distributions of the network appear to be the same.
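As a rough illustration of how such exponents can be obtained, the sketch below tabulates the "in" and "out" distributions from a directed edge list and fits a straight line to log N(k) against log k by ordinary least squares, the same kind of fit named in the caption of Figure 7. The function names, the toy edge list and the synthetic counts are hypothetical. The convention N(k) ~ c·k^(-gamma), with gamma reported as a positive number, is an assumption about how the quoted exponents are defined.

```python
import math
from collections import Counter

def degree_distributions(edges):
    """Return (N_in, N_out): each maps a degree k to the number of nodes
    with that many "in" (or "out") connections. Nodes with zero
    connections of a given kind are not counted."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return Counter(in_deg.values()), Counter(out_deg.values())

def fit_power_law(counts):
    """Least-squares line through (log k, log N(k)).

    Returns (gamma, log_c) for the assumed form N(k) ~ c * k**(-gamma).
    """
    points = [(math.log(k), math.log(n))
              for k, n in counts.items() if k > 0 and n > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return -slope, mean_y - slope * mean_x

# Toy directed edges (hypothetical) to show the tabulation step.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c")]
n_in, n_out = degree_distributions(toy_edges)

# Synthetic counts that follow N(k) = 4096 * k**-2 exactly (not real data).
synthetic = {1: 4096, 2: 1024, 4: 256, 8: 64}
gamma, log_c = fit_power_law(synthetic)
```

Fitting a line to a log-log histogram is simple and matches the approach described for the figure, but it is sensitive to binning and to sparse counts at large k, so exponents from small networks should be read as rough estimates.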

Other genomes have now been sequenced and it is possible to make some preliminary comparisons between the yeast network and some others. While the transcription circuitry has

[Figure 8 residue. Legend categories: DNA/RNA/Protein Biosynthesis, Environmental Response, Metabolism, Cell Cycle, Developmental Processes. Axis label: Number of Regulators (log scale: 1, 10, 100).]
