The complex network representation of different systems as networks has revealed surprising similarities, many of which are intimately tied to power laws. The simplest network measure is the average number of nearest neighbors of a node, or the average degree. However, this is a rather crude property, and to gain further insight into the topological organization of real networks, we need to determine the variation in the nearest neighbors, given by the degree distribution. For a surprisingly large number of networks, this degree distribution is best characterized by the power law functional form6 (Fig. 1A), m - k a (i)
Important examples include the metabolic network of 43 organisms,43 the protein interaction network of S. cerevisiae42 and various food webs.51 If the degree distribution instead was single-peaked (e.g., Poisson or Gaussian) as in Figure IB, the majority of the nodes would be well described by the average degree, and hence the notion of a "typical" node. In contrast for networks with a power-law degree distribution, the majority of the nodes have only one or two neighbors while coexisting with many nodes with hundreds and some even with thousands of neighbors. For these networks there exists no typical node, and they are therefore often referred to as "scale-free".
The clustering of a node, the degree to which the neighborhood of a node resembles a complete subgraph, is another measure which sheds light on the structural organization of a network.66 For a node i with degree k, the clustering is defined as, representing the ratio of the number of actual connections between the neighbors of node i to the number of possible connections. For a node which is part of a fully interlinked cluster
Figure 2. Properties of the metabolic network of Escherichia coli. A) The degree distribution displays a power law in both the in- and the out degrees.43 B) The clustering coefficient varies with k as a power law. The solid line corresponds to k'x. C) Three dimensional representation of the reduced metabolic network.56
Figure 2. Properties of the metabolic network of Escherichia coli. A) The degree distribution displays a power law in both the in- and the out degrees.43 B) The clustering coefficient varies with k as a power law. The solid line corresponds to k'x. C) Three dimensional representation of the reduced metabolic network.56
Cj= 1, while Q = 0 for a node which acts as a bridge between different clusters. Accordingly, the overall clustering coefficient of a network with N nodes is given by (C) = X Q/N, and represents a measure of a networks potential modularity. By studying the clustering of nodes with a given degree k, information about the actual modular organization of a network can be gleaned:16'56'57,63 For all metabolic networks available, this behaves like the power law,
suggesting the existence of a hierarchy of nodes with different degrees of modularity (as measured by the clustering coefficient) overlapping in an iterative manner.56 In Figure 2, we show the degree distribution (Fig. 2A) and the clustering as function of k (Fig. 2B) for the bacterium Escherichia coli. They both clearly adhere to a power-law behavior, suggesting that biological networks are both scale-free and hierarchical. Panel 2C is a three dimensional representation of a cleaned up version of the metabolic network,56 demonstrating that modules are not clearly separated. Furthermore, the likelihood that a node appears in the shortest paths between other nodes on the network, the so-called betweenness-centralityg,26'29 is also characterized by a power law distribution following P(g) - for both biological and nonbiological networks,31 suggesting that a few nodes act as bridges or linkers between the different parts of the network. In summary, we have seen strong evidence that biological networks are both scale-free42'43 and hierarchical.56
An important question now arisesâ€”we can characterize networks using the above mentioned quantities, but why is the power law behavior so pervasive? Several models building on very different principles are able to explain these observed features.
Figure 3. Graphical representation of three network models: A,D) The ER (random) model, B,E) the BA (scale-free) model and (C) and (F) the hierarchical model. The random network model is constructed by starting from ./V nodes before the possible node-pairs are connected with probability p. Panel (A) shows a particular realization of the ER model with 10 nodes and connection probability p = 0.2. In panel (B) we show the scale-free model at time t (green links) and at time (t + 1) when we have added a new node (red links) using the preferential attachment probability (see Eq. (4)). Panel (C) demonstrates the iterative construction of a hierarchical network, starting from a fully connected cluster of four nodes (blue). This cluster is then copied three times (green) while connecting the peripheral nodes of the replicas to the central node of the starting cluster. By once more repeating this replication and connection process (red nodes), we end up with a 64-node scale-free hierarchical network. In panel (D) we display a larger version of the random network, and it is evident that most nodes have approximately the same number of links. For the scale-free model, (E) the network is clearly inhomogeneous: while the majority of nodes has one or two links, a few nodes have a large number of links. We emphasize this by coloring the five nodes with the highest number of links red and their first neighbors green. While in the random network only 27% of the nodes are reached by the five most connected nodes, we reach more than 60% of the nodes in the scale-free network, demonstrating the key role played by the hubs. Note that the networks in (D) and (E) consist of the same number of nodes and links. Panel (F) demonstrates that the standard clustering algorithms are not that successful in uncovering the modular structure of a scale-free hierarchical network. A color version of this figure is available online at http://www.Eurekah.com.
While graph theory initially focused on regular graphs, since the 1950s large networks with no apparent design principles were described as random graphs,8 proposed as the simplest and most straightforward realization of a complex network. According to the Erdos-Renyi (ER) model of random networks,22 we start with TV nodes and connect every pair of nodes with probability p, creating a graph with approximatelypN(N-l)/2 randomly distributed edges (Fig. 3A,D). For this model the degrees follow a Poisson distribution (Fig. 4A), and as a consequence, the average degree (k) of the network describes the typical node. Furthermore, for this "democratic" network
Figure 4. Properties of the three network models. A) The ER model sports a Poisson degree distribution P(k) (the probability that a randomly selected node has exactly k links) which is strongly peaked at the average degree (k ) and decays exponentially for large k. The degree distributions for the scale-free (B) and the hierarchical (C) network models do not have a peak, they instead decay according to the power-law P(k) - k "1. The average clustering coefficient for nodes with exactly k neighbors, C(k), is independent of k for both the ER (D) and the scale-free (E) network model. F) In contrast, C(k) - k~l for the hierarchical network model (cf. Fig. 2).
Figure 4. Properties of the three network models. A) The ER model sports a Poisson degree distribution P(k) (the probability that a randomly selected node has exactly k links) which is strongly peaked at the average degree (k ) and decays exponentially for large k. The degree distributions for the scale-free (B) and the hierarchical (C) network models do not have a peak, they instead decay according to the power-law P(k) - k "1. The average clustering coefficient for nodes with exactly k neighbors, C(k), is independent of k for both the ER (D) and the scale-free (E) network model. F) In contrast, C(k) - k~l for the hierarchical network model (cf. Fig. 2).
model, the clustering is independent of the node degree k (Fig. 4D). As we have just seen in Figure 2, the ER model does not capture the properties of biological networks.
In the network model of Barabasi and Albert (BA), two crucial mechanisms, which both are absent from the classical random network model, are responsible for the emergence of a power-law degree distribution.6 First, networks grow through the addition of new nodes linking to nodes already present in the system. Second, there is a higher probability to link to a node with a large number of connections in most real networks, a property called preferential attachment. These two principles are implemented as follows: starting from a small core graph consisting of mo nodes, a new node with m links is added at each time step and connected to the already existing nodes (Fig. 3B,E). Each of the m new links are then preferentially attached to a node i (with k, neighbors) which is chosen according to the probability n,-=v5>> (4)
The simultaneous combination of these two network growth rules gives rise to the observed power-law degree distribution (Fig. 4B). In panel 3B, we illustrate the growth process of the scale-free model by displaying a network at time t (green links) and then at time (t + 1), when we have added a new node (red links) using the preferential attachment probability. Compared to random networks, the probability that a node is highly connected is statistically significant in scale-free networks. Consequently, many network properties are determined by a relatively small number of highly connected nodes, often called "hubs". To make the effect of the hubs on the network structure visible, we have colored the five nodes with largest degrees red in Figure 3D,E and their nearest neighbors green. While in the ER network only 27% of the nodes are reached by the five most connected ones, we reach more than 60% of the nodes in the scale-free network, demonstrating the key role played by the hubs. Another consequence of the hub's dominance of the network topology is that scale-free networks are highly tolerant of random failures (perturbations) while being extremely sensitive to targeted attacks.3 Comparing the properties of the BA network model with those of the ER model, we note that the clustering of the BA network is larger, however C(k)is approximately constant (Fig. 4E), indicating the absence of a hierarchical structure.
Many real networks are expected to be fundamentally modular, meaning that the network can be seamlessly partitioned into a collection of modules where each module performs an identifiable task, separable from the function(s) of other modules.33'34'37,46'55,60 Therefore, we must reconcile the scale-free property with potential modularity. In order to account for the modularity as reflected in the power-law behavior of Figure 2B and a simultaneous scale-free degree distribution Figure 2A, we have to assume that clusters combine in an iterative manner, generating a hierarchical network.56,63 Such a network emerges from a repeated duplication and integration process of clustered nodes,56 which in principle can be repeated indefinitely. This process is depicted in panel 3c, where we start from a small cluster of four densely linked nodes (blue). We next generate three replicas of this hypothetical initial module (green) and connect the three external nodes of the replicated clusters to the central node of the old cluster, thus obtaining a large 16-node module. Subsequently, we again generate three replicas of this 16-node module (red), and connect the 16 peripheral nodes to the central node of the old module, obtaining a new module of 64 nodes. This hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure by generating a network that has a power law degree distribution (Fig. 4C) with degree exponent y= 1 + In4/ln3 = 2.26 and a clustering coefficient C(k) which proves to be dependent on k'1 (Fig. 4F). However, note that modularity does not imply clear-cut sub-networks linked in well-defined ways.36,56 In fact, the boundaries of modules are often blurred (see Fig. 3F), bridged by highly connected nodes which interconnect modules.
Was this article helpful?