Betweenness centrality is a measure of a node's centrality in a network. It is equal to the number of shortest paths, from all vertices to all others, that pass through that node. Betweenness centrality is a more useful measure than connectivity alone, both of the load placed on a given node and of the node's importance to the network: connectivity is only a local property, while betweenness reflects the global structure of the network. The development of betweenness centrality is generally attributed to the sociologist Linton Freeman, who also developed a number of other centrality measures.[1] (The same idea was proposed earlier by the mathematician J. Anthonisse, but his work was never published.[1])
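The betweenness centrality of a node $v$ is given by the expression:
$$g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$ and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.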
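The definition can be evaluated directly on a small unweighted graph. The following is a minimal sketch, not an efficient implementation: it assumes the graph is stored as an adjacency-list dictionary, and the function names `bfs_counts` and `betweenness_by_definition` are illustrative. It uses the fact that a node v lies on a shortest path from s to t exactly when dist(s, v) + dist(v, t) = dist(s, t), in which case the number of such paths through v is the product of the path counts from s to v and from v to t.

```python
from collections import deque
from itertools import permutations


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:           # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness_by_definition(adj):
    """Sum over ordered pairs (s, t) of the fraction of shortest s-t paths through v."""
    info = {s: bfs_counts(adj, s) for s in adj}
    g = {v: 0.0 for v in adj}
    for s, t in permutations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # t is unreachable from s, so the pair contributes nothing
        for v in adj:
            if v == s or v == t or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                g[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return g


# Path graph a - b - c - d, stored as an undirected adjacency list.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = betweenness_by_definition(path)
print(scores)
```

Because the sum here runs over ordered pairs, each path in an undirected graph is counted once in each direction; the two interior nodes of the path graph therefore score 4 rather than 2.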
Note that the betweenness centrality of a node scales with the number of pairs of nodes, as implied by the summation indices. The calculation may therefore be rescaled by dividing by the number of pairs of nodes not including $v$, so that $g \in [0, 1]$. The division is by $(N-1)(N-2)$ for directed graphs and by $(N-1)(N-2)/2$ for undirected graphs, where $N$ is the number of nodes in the giant component. This rescaling is relative to the highest possible value, which is reached when a single node lies on every shortest path. That is often not the case, and a normalization can instead be performed without a loss of precision:
$$\operatorname{normal}(g(v)) = \frac{g(v) - \min(g)}{\max(g) - \min(g)}$$
which results in:
$$\max(\operatorname{normal}) = 1, \qquad \min(\operatorname{normal}) = 0$$
Note that this always maps a smaller range onto a larger one, so no precision is lost.
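The two rescalings can be compared on a small example. The sketch below assumes raw scores are held in a dictionary and that, for the undirected case, each unordered pair of endpoints has been counted once; the function names and the sample values are illustrative only.

```python
def rescale_by_pairs(g, n, directed=False):
    """Divide raw scores by the number of node pairs that exclude the node itself."""
    pairs = (n - 1) * (n - 2)
    if not directed:
        pairs /= 2
    return {v: score / pairs for v, score in g.items()}


def min_max_normalize(g):
    """Map scores linearly onto [0, 1] using the observed minimum and maximum."""
    lo, hi = min(g.values()), max(g.values())
    if hi == lo:
        return {v: 0.0 for v in g}  # all scores equal: there is no spread to rescale
    return {v: (score - lo) / (hi - lo) for v, score in g.items()}


# Raw scores for the undirected path a - b - c - d (each unordered pair counted once).
raw = {"a": 0.0, "b": 2.0, "c": 2.0, "d": 0.0}
rescaled = rescale_by_pairs(raw, n=len(raw))
normalized = min_max_normalize(raw)
print(rescaled["b"], normalized["b"])
```

On this path graph no node lies on every shortest path, so dividing by the number of pairs leaves the largest score at 2/3, whereas the min-max normalization stretches the observed range so that the largest score becomes 1.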
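It has been shown that the load distribution of a scale-free network follows a power law with load exponent $\delta$,[2]
$$P(g) \approx g^{-\delta} \qquad (1)$$
This implies a scaling relation between the load and the degree of the node,
$$g(k) \approx k^{\eta}$$
where $g(k)$ is the average load of vertices with degree $k$. The exponents $\delta$ and $\eta$ are not independent, since equation (1) implies [3]
$$P(g) = \int P(k)\,\delta(g - k^{\eta})\,dk$$
where the $\delta(\cdot)$ inside the integral denotes the Dirac delta function, not the load exponent. For large $g$, and therefore large $k$, where the degree distribution behaves as $P(k) \sim k^{-\gamma}$, the expression becomes
$$P(g \gg 1) = \int k^{-\gamma}\,\delta(g - k^{\eta})\,dk \sim g^{-1 - \frac{\gamma - 1}{\eta}}$$
which proves the following equality:
$$\eta = \frac{\gamma - 1}{\delta - 1}$$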
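The intermediate step behind this equality can be written out. The following is a sketch under simplifying assumptions: the degree distribution is taken to be a pure power law with exponent $\gamma$, the degree $k$ is treated as continuous, and the relation between load and degree, with exponent $\eta$, is treated as exact. The Dirac delta function then selects the single degree $k = g^{1/\eta}$, and the integral picks up the inverse of the derivative of $k^{\eta}$ at that point:

```latex
\begin{aligned}
P(g) &= \int k^{-\gamma}\,\delta\!\left(g - k^{\eta}\right)dk
      = \left.\frac{k^{-\gamma}}{\eta\,k^{\eta-1}}\right|_{k = g^{1/\eta}} \\
     &= \frac{1}{\eta}\,g^{-\frac{\gamma+\eta-1}{\eta}}
      = \frac{1}{\eta}\,g^{-1-\frac{\gamma-1}{\eta}} .
\end{aligned}
```

Comparing this with a power-law load distribution of exponent $\delta$ gives $\delta = 1 + (\gamma - 1)/\eta$, which rearranges to $\eta = (\gamma - 1)/(\delta - 1)$.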
The important exponent appears to be $\eta$, which describes how the betweenness centrality depends on the connectivity. The situation that maximizes the betweenness centrality of a vertex is when all shortest paths go through it, which corresponds to a tree structure (a network with no clustering). In the case of a tree network the maximum value $\eta = 2$ is reached.[3]
$$\eta = 2 \;\rightarrow\; \delta = \frac{\gamma + 1}{2}$$
This maximal value of $\eta$ (and hence minimum of $\delta$) puts bounds on the load exponents for networks with non-vanishing clustering.
$$\eta \le 2 \;\rightarrow\; \delta \ge \frac{\gamma + 1}{2}$$
In this case, the exponents $\delta$ and $\eta$ are not universal and depend on the details of the network (average connectivity, correlations, etc.).
Real-world scale-free networks, such as the Internet, also follow a power-law load distribution.[4] This is an intuitive result: scale-free networks achieve short path lengths across the network through a few hub nodes with much higher connectivity than the majority of the network, and these hubs naturally experience much higher loads because of that added connectivity.
In a weighted network the links connecting the nodes are no longer treated as binary interactions, but are weighted in proportion to their capacity, influence, frequency, etc., which adds another dimension of heterogeneity within the network beyond the topological effects. A node's strength in a weighted network is given by the sum of the weights of its adjacent links,
$$s_i = \sum_{j=1}^{N} a_{ij} w_{ij}$$
with $a_{ij}$ and $w_{ij}$ being the entries of the adjacency and weight matrices between nodes $i$ and $j$, respectively. Analogous to the power-law distribution of degree found in scale-free networks, the strength of a given node follows a power-law distribution as well.
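As a small illustration, node strength can be computed from an adjacency matrix and a weight matrix by summing, for each node, the weights of the links that are actually present. The sketch below assumes both matrices are given as nested lists of equal size; the function names and the three-node network are hypothetical.

```python
def node_strengths(a, w):
    """Return s_i = sum_j a[i][j] * w[i][j] for every node i."""
    n = len(a)
    return [sum(a[i][j] * w[i][j] for j in range(n)) for i in range(n)]


def node_degrees(a):
    """Return k_i = sum_j a[i][j], for comparison with the strengths."""
    return [sum(row) for row in a]


# Hypothetical undirected network: links 0-1 (weight 2.0) and 0-2 (weight 0.5).
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
weights = [
    [0.0, 2.0, 0.5],
    [2.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
]
strengths = node_strengths(adjacency, weights)
degrees = node_degrees(adjacency)
print(strengths, degrees)
```

Nodes 1 and 2 have the same degree but different strengths, which is the extra heterogeneity that the weights introduce.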
A study of the average value $s(b)$ of the strength for vertices with betweenness $b$ shows that the functional behavior can be approximated by a scaling form [5]
$$s(b) \approx b^{\alpha}$$
Calculating the betweenness and closeness centralities of all the vertices in a graph involves calculating the shortest paths between all pairs of vertices in the graph. This takes $\Theta(|V|^3)$ time with the Floyd–Warshall algorithm, modified to count all shortest paths between two nodes rather than find just one. On a sparse graph, Johnson's algorithm may be more efficient, taking $O(|V|^2 \log |V| + |V||E|)$ time. On unweighted graphs, calculating betweenness centrality takes $O(|V||E|)$ time using Brandes' algorithm.[6]
In calculating the betweenness and closeness centralities of all vertices in a graph, it is assumed that graphs are undirected and connected, with loops and multiple edges allowed. Network graphs in particular are often without loops or multiple edges, so as to maintain simple relationships (where an edge represents a connection between two people or vertices). In this case, Brandes' algorithm divides the final centrality scores by 2 to account for each shortest path being counted twice.[6]
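The idea behind Brandes' algorithm for unweighted graphs can be sketched as follows. This is an illustrative implementation written for this article, not code taken from the cited paper: it assumes the graph is an adjacency-list dictionary, and the name `brandes_betweenness` and the `undirected` flag are choices made here. For each source node it runs one breadth-first search that counts shortest paths, then walks back from the farthest nodes to accumulate each node's share of those paths.

```python
from collections import deque


def brandes_betweenness(adj, undirected=True):
    """Betweenness of every node in an unweighted graph given as an adjacency list."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths and recording predecessors.
        order = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}      # -1 marks an unvisited node
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Phase 2: accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    if undirected:
        # Each undirected shortest path was found once from each of its endpoints.
        cb = {v: score / 2.0 for v, score in cb.items()}
    return cb


# Four-cycle a - b - d - c - a: opposite corners are joined by two shortest paths.
cycle = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
result = brandes_betweenness(cycle)
print(result)
```

In the four-cycle every node carries half of the two shortest paths between its two neighbours, so after the division by 2 each node scores 0.5. Each source costs one breadth-first search plus one backward pass, which is where the $O(|V||E|)$ running time on unweighted graphs comes from.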