Louvain Modularity

The Louvain Method for community detection is a method to extract communities from large networks created by Vincent Blondel.[1] The method is a greedy optimization method that appears to run in time O(n log n).

Modularity Optimization

The inspiration for this method of community detection is the optimization of Modularity as the algorithm progresses. Modularity is a scale value between -1 and 1 that measures the density of edges inside communities to edges outside communities. Optimizing this value theoretically results in the best possible grouping of the nodes of a given network, however going through all possible iterations of the nodes into groups is impractical so algorithms are used. In the Louvain Method of community detection, first small communities are found by optimizing modularity locally on all nodes, then each small community is grouped into one node and the first step is repeated.

Algorithm

The value to be optimized is Modularity, defined as a value between -1 and 1 that measures the density of links inside communities compared to links between communities.[2] Modularity is defined as:

Q = \frac{1}{2m}\Sigma_{ij}\bigg[A_{ij} - \frac{k_i k_j}{2m}\bigg]\delta (c_i,c_j)

A_{ij} represents the edge weight between nodes i and j. k_i and k_j are the degrees of node i and j respectively. m is the total number of edges in the graph. c_i and c_j are the communities of the nodes, and \delta is a simple delta function

In order to maximize this value efficiently, the Louvain Method has two phases that are repeated iteratively.

First, each node in the network is assigned to its own community. Then for each node i, the change in modularity is calculated for removing i from its own community and moving it into the community of each neighbor j of i. This value is easily calculated by:[3]

 \Delta Q = \bigg[ \frac{\Sigma_{in} + k_{i,in}}{2m} - \bigg(\frac{\Sigma_{tot} + k_i}{2m}\bigg)^2 \bigg]-\bigg[\frac{\Sigma_{in}}{2m} - \bigg(\frac{\Sigma_{tot}}{2m}\bigg)^2-\bigg(\frac{k_i}{2m}\bigg)^2\bigg]

Where \Sigma_{in} is sum of all the weights of the links inside the community i is moving into, \Sigma_{tot} is the sum of all the weights of the links to nodes in the community, k_i is the degree of i, k_{i,in} is the sum of the weights of the links between i and other nodes in the community, and m is the sum of the weights of all links in the network. Then, once this value is calculated for all communities i is connected to, i is placed into the community that resulted in the greatest modularity increase. If no increase is possible, i remains in its original community. This process is applied repeatedly and sequentially to all nodes until no modularity increase can occur. Once this local maximum of modularity is hit, the first phase has ended.

The second phase of the algorithm, it groups all of the nodes in the same community and builds a new network where nodes are the communities from the previous phase. Any links between nodes of the same community are now represented by self loops on the new community node and links from multiple nodes in the same community to a node in a different community are represented by weighted edges between communities. Once the new networks is created, the second phase has ended and the first phase can be re-applied to the new network.

Previous Uses

Comparison to Other Methods

When comparing modularity optimization methods, the two measures of importance are the speed and the resulting modularity value. A lower speed is good as it shows a method is more efficient than others and a higher modularity value is desirable as it points to having better defined communities. The compared methods are, the algorithm of Clauset, Newman, and Moore,[6] Pons and Latapy,[7] and Watika and Tsurumi.[8]

Moduarlity Optimization Comparison[9]
Karate Arxiv Internet Web nd.edu Phone Web uk-2005 Web WebBase 2001
Nodes/links 34/77 9k/24k 70k/351k 325k/1M 2.6M/6.3M 39M/783M 118M/1B
Clauset, Newman, and Moore .38/0s .772/3.6s .692/799s .927/5034s -/- -/- -/-
Pons and Latapy .42/0s .757/3.3s .729/575s .895/6666s -/- -/- -/-
Watike and Tsurmi .42/0s .761/0.7s .667/62s .898/248s .56/464s -/- -/-
Louvain Method .42/0s .813/0s .781/1s .935/3s .769/134s .979/738s .984/152mn

-/- in the table refers to a method that took over 24hrs to run. This table (from Blondel 2008) shows that the Louvain method outperforms many similar modularity optimization methods in both the modularity and the time categories.

See also

References