Surprise (networks)

Surprise (denoted S) is a global measure of the quality of a partition of a given complex network into communities. The name Surprise derives from the fact that its value evaluates how surprising (unlikely) a given partition is, from a statistical point of view. Using benchmarks with networks with known community structure, it has been shown that Surprise maximization is a very effective way to determine the communities present in the networks.

Definition

Formula

Given a partition into communities, Surprise compares the number of links within and between communities in that partition with the expected number of them in a random network with the same distribution of nodes per communities. Thus, S evaluates, at the same time, both the number of nodes and links.

The formula of Surprise for a given partition is:


S = -\log \sum_{j=p}^{\min(M,n)}\frac{\binom{M}{j}\binom{F - M}{n - j}}{\binom{F}{n}}

where F is the maximum possible number of links in the network for the number of nodes k


F = \frac{k(k-1)}{2}

M is the maximum possible number of intra-community links. Let C be the number of communities, then


M = \sum_{i=1}^{C} \frac{k_i(k_i-1)}{2}

n is the actual number of links in the network and p is the actual number of intra-community links of that partition.

Properties

  • All nodes are joined into a single community
  • Each node has its own community