Affinity propagation

In statistics and data mining, affinity propagation (AP) is a clustering algorithm based on the concept of "message passing" between data points.^[1] Unlike clustering algorithms such as $k$ -means or $k$ -medoids, AP does not require the number of clusters to be determined or estimated before running the algorithm. Like $k$ -medoids, AP finds "exemplars", members of the input set that are representative of clusters.^[1]

Algorithm

Let $x 1$ through $x n$ be a set of data points, with no assumptions made about their internal structure, and let $s$ be a function that quantifies the similarity between any two points, such that $s (x i, x j) > s (x i, x k)$ iff $x i$ is more similar to $x j$ than to $x k$ .

The algorithm proceeds by alternating two message passing steps, to update two matrices:^[1]

The "responsibility" matrix $R$ has values $r (i, k)$ that quantify how well-suited $x k$ is to serve as the exemplar for $x i$ , relative to other candidate exemplars for $x i$ .
The "availability" matrix $A$ contains values $a (i, k)$ represents how "appropriate" it would be for $x i$ to pick $x k$ as its exemplar, taking into account other points' preference for $x k$ as an exemplar.

Both matrices are initialized to all zeroes, and can be viewed as log-probability tables. The algorithm then performs the following updates iteratively:

First, responsibility updates are sent around: $r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \left\{ a(i,k') + s(i,k') \right\}$
Then, availability is updated per

a(i,k) \leftarrow \min \left( 0, r(k,k) + \sum_{i' \not\in \{i,k\}} \max(0, r(i',k)) \right)

for

i \neq k

and

a(k,k) \leftarrow \sum_{i' \neq k} \max(0, r(i',k))

Applications

The inventors of affinity propagation showed it is better for certain computer vision and computational biology tasks, e.g. clustering of pictures of human faces and identifying regulated transcripts, than $k$ -means,^[1] even when $k$ -means was allowed many random restarts and initialized using PCA.^[2] A study comparing AP and Markov clustering on protein interaction graph partitioning found Markov clustering to work better for that problem.^[3] A semi-supervised variant has been proposed for text mining applications.^[4]

Software

A Java implementation is included in the ELKI data mining framework.
Java Apro library implements parallelized Affinity Propagation and Hierarchical AP.
A Julia implementation of affinity propagation is contained in Julia Statistics's Clustering.jl package.^[5]
A Python version is part of the scikit-learn library.^[6]
An R implemenation is available in the "apcluster" package.^[7]

References

↑ 1.0 1.1 1.2 1.3 Brendan J. Frey; Delbert Dueck (2007). "Clustering by passing messages between data points". Science 315 (5814): 972–976. doi:10.1126/science.1136800. PMID 17218491.
↑ Delbert Dueck; Brendan J. Frey (2007). Non-metric affinity propagation for unsupervised image categorization. Int'l Conf. on Computer Vision.
↑ James Vlasblom; Shoshana Wodak (2009). "Markov clustering versus affinity propagation for the partitioning of protein interaction graphs". BMC Bioinformatics 10 (1): 99. doi:10.1186/1471-2105-10-99.
↑ Renchu Guan; Xiaohu Shi; Maurizio Marchese; Chen Yang; Yanchun Liang (2011). "Text Clustering with Seeds Affinity Propagation". IEEE Transactions on Knowledge & Data Engineering 23 (4): 627–637.
↑ Clustering.jl www.github.com
↑ "Clustering — scikit-learn 0.14.1 documentation". Retrieved 15 July 2014.
↑ apcluster cran.r-project.org>