Force-based algorithms
From Wikipedia, the free encyclopedia
Force-based or force-directed algorithms are a class of algorithms for drawing graphs in an aesthetically pleasing way. Their purpose is to position the nodes of a graph in two-dimensional or three-dimensional space so that all the edges are of more or less equal length and there are as few crossing edges as possible.
Force-directed algorithms achieve this by assigning forces among the edges and nodes of the graph; the most straightforward method is to assign forces as if the edges were springs (see Hooke's law) and the nodes were electrically charged particles (see Coulomb's law). The entire graph is then simulated as if it were a physical system. The forces are applied to the nodes, pulling them closer together or pushing them further apart. This is repeated iteratively until the system comes to an equilibrium state, i.e., the relative positions of the nodes no longer change from one iteration to the next. At that moment, the graph is drawn. Physically, this equilibrium state is one in which the forces on every node balance (mechanical equilibrium).
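As a minimal sketch of the two forces described above, the following Python functions compute a Coulomb-style repulsion and a Hooke-style spring force between two nodes in the plane. The function names and the parameters charge, stiffness and rest_length are assumptions made for illustration; the article does not fix particular constants.

```python
import math

def coulomb_repulsion(p, q, charge=1.0):
    """Force on the node at p that pushes it away from the node at q.

    Inverse-square law; `charge` is an illustrative constant.
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("nodes must not share the same position")
    magnitude = charge / dist ** 2
    return (magnitude * dx / dist, magnitude * dy / dist)

def hooke_attraction(p, q, stiffness=1.0, rest_length=1.0):
    """Force on the node at p from a linear spring joining it to q.

    A stretched spring pulls p towards q; a compressed one pushes it away.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    magnitude = stiffness * (dist - rest_length)
    return (magnitude * dx / dist, magnitude * dy / dist)
```

The net force on a node is the sum of the repulsion from every other node and the spring force along each incident edge.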
An alternative model considers a spring-like force for every pair of nodes (i,j), where the ideal length δ_ij of each spring is proportional to the graph-theoretic distance between nodes i and j. In this model there is no need for a separate repulsive force. Note that minimizing the difference (usually the squared difference) between Euclidean and ideal distances between nodes is then equivalent to a metric multidimensional scaling problem. Stress majorization gives a very well-behaved (i.e. monotonically convergent) and mathematically elegant way to minimize these differences and hence find a good layout for the graph.
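The quantity being minimized in this model can be sketched as follows. This is an illustrative Python version that takes the ideal length of each pair to be the hop count between the two nodes and sums the squared differences with unit weights; both choices, and the function names, are assumptions rather than part of the model as stated.

```python
import math
from collections import deque

def graph_distances(n, edges):
    """All-pairs graph-theoretic distances (hop counts) via BFS from each node."""
    adjacency = [[] for _ in range(n)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    dist = [[math.inf] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[source][v] == math.inf:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist

def stress(positions, ideal):
    """Sum of squared differences between Euclidean and ideal distances."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            if math.isinf(ideal[i][j]):
                continue  # skip pairs in different connected components
            actual = math.dist(positions[i], positions[j])
            total += (actual - ideal[i][j]) ** 2
    return total
```

For a path of three nodes drawn on a straight line with unit spacing, every Euclidean distance equals its ideal length, so the stress is zero.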
A force-directed graph can involve forces other than mechanical springs and electrical repulsion; examples include logarithmic springs (as opposed to linear springs) and magnetic or gravitational fields.
The results of this class of algorithm often look very good. In the case of spring-and-charged-particle graphs, the edges tend to have uniform length (because of the spring forces), and nodes that are not connected by an edge tend to be drawn further apart (because of the electrical repulsion).
While graph drawing is a difficult problem, force-directed algorithms, being physical simulations, usually require no special knowledge about graph theory such as planarity.
It is also possible to employ mechanisms that search more directly for energy minima, either instead of or in conjunction with physical simulation. Such mechanisms, which are examples of general global optimization methods, include simulated annealing and genetic algorithms.
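A rough sketch of the simulated-annealing variant is given below. It assumes a spring-plus-charge energy function, Gaussian moves of a single node, and a geometric cooling schedule; the function names, all constants and the schedule itself are illustrative choices, not a prescribed method.

```python
import math
import random

def layout_energy(pos, edges, stiffness=1.0, rest_length=1.0, charge=1.0):
    """Spring energy over the edges plus Coulomb-style energy over all pairs."""
    energy = 0.0
    for a, b in edges:
        energy += 0.5 * stiffness * (math.dist(pos[a], pos[b]) - rest_length) ** 2
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            energy += charge / max(math.dist(pos[i], pos[j]), 1e-9)
    return energy

def anneal_layout(n, edges, steps=5000, start_temp=1.0, cooling=0.999, seed=0):
    """Return the lowest-energy layout seen, together with its energy."""
    rng = random.Random(seed)
    pos = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    energy = layout_energy(pos, edges)
    best_pos, best_energy = list(pos), energy
    temp = start_temp
    for _ in range(steps):
        i = rng.randrange(n)
        candidate = list(pos)
        candidate[i] = (pos[i][0] + rng.gauss(0, temp),
                        pos[i][1] + rng.gauss(0, temp))
        new_energy = layout_energy(candidate, edges)
        delta = new_energy - energy
        # Always accept downhill moves; accept uphill moves with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            pos, energy = candidate, new_energy
            if energy < best_energy:
                best_pos, best_energy = list(pos), energy
        temp *= cooling
    return best_pos, best_energy
```

Because uphill moves are sometimes accepted, the search can leave a poor local minimum that a purely downhill simulation would stay in.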
Advantages
The following are among the most important advantages of force-directed algorithms:
- Good quality results: at least for graphs of medium size (up to 50-100 vertices), the results obtained usually have very good aesthetic properties[citation needed]. In particular, they are good at achieving the following aesthetic criteria: uniform edge length, uniform vertex distribution and showing symmetry. This last criterion is among the most important ones and is hard to achieve with any other type of algorithm.
- Flexibility: force-directed algorithms can be easily adapted and extended to fulfill additional aesthetic criteria. This makes them the most versatile class of graph drawing algorithms. Examples of existing extensions include the ones for directed graphs, 3D graph drawing, cluster graph drawing, constrained graph drawing and dynamic graph drawing.
- Intuitive: since they are based on physical analogies of common objects, like springs, the behavior of the algorithms is relatively easy to predict and understand. This is not the case with other types of graph-drawing algorithms.
- Simplicity: typical force-directed algorithms are simple and can be implemented in a few lines of code. Other classes of graph-drawing algorithms, like the ones for orthogonal layouts, are usually much more involved.
- Interactivity: another advantage of this class of algorithm is the interactive aspect. By drawing the intermediate stages of the graph, the user can follow how the graph evolves, seeing it unfold from a tangled mess into a good-looking configuration. In some interactive graph drawing tools, the user can pull one or more nodes out of their equilibrium state and watch them migrate back into position. This makes them a preferred choice for dynamic and online graph drawing systems.
- Strong theoretical foundations: while simple ad-hoc force-directed algorithms (such as the one given in pseudo-code in this article) often appear in the literature and in practice (because they are relatively easy to understand), more reasoned approaches are starting to gain traction. Statisticians have been solving similar problems in multidimensional scaling (MDS) since the 1930s, and physicists also have a long history of working with related n-body problems, so extremely mature approaches exist. As an example, the stress majorization approach to metric MDS can be applied to graph drawing as described above. This has been proven to converge monotonically[1]. Monotonic convergence, the property that the algorithm will at each iteration decrease the stress or cost of the layout, is important because it guarantees that the layout will eventually reach a local minimum and stop. Damping schedules such as the one used in the pseudo-code below cause the algorithm to stop, but cannot guarantee that a true local minimum is reached.
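The monotonic behaviour of stress majorization can be illustrated with a small sketch. The update below is the standard Guttman transform for the unit-weight case, which is an assumption here; it also assumes every pair of nodes has a finite ideal distance (a connected graph). The function names are illustrative.

```python
import math

def stress(pos, ideal):
    """Unit-weight stress: squared error between actual and ideal distances."""
    n = len(pos)
    return sum((math.dist(pos[i], pos[j]) - ideal[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def majorization_step(pos, ideal):
    """One Guttman transform with unit weights; returns the new positions."""
    n = len(pos)
    new_pos = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            if i == j:
                continue
            d = math.dist(pos[i], pos[j])
            if d > 0.0:  # coincident pairs contribute nothing
                ratio = ideal[i][j] / d
                sx += ratio * (pos[i][0] - pos[j][0])
                sy += ratio * (pos[i][1] - pos[j][1])
        new_pos.append((sx / n, sy / n))
    return new_pos

# Usage: a path of four nodes, starting from an arbitrary layout.
ideal = [[abs(i - j) for j in range(4)] for i in range(4)]
layout = [(0.0, 0.0), (0.3, 1.0), (1.5, 0.2), (0.1, 0.4)]
history = [stress(layout, ideal)]
for _ in range(50):
    layout = majorization_step(layout, ideal)
    history.append(stress(layout, ideal))
```

In this sketch the recorded stress values in history never increase from one step to the next, which is the property that makes a simple "stop when the improvement is negligible" rule safe.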
Disadvantages
The main disadvantages of force-directed algorithms include the following:
- High running time: the typical force-directed algorithms are generally considered to have a running time of O(V³), where V is the number of nodes of the input graph. This is because the number of iterations is estimated to be O(V), and in every iteration all pairs of nodes need to be visited and their mutual repulsive forces computed, which costs O(V²). This is related to the N-body problem in physics. Since repulsive forces are local in nature, the graph can be partitioned such that only neighboring vertices are visited. This can improve the running time to n*log(n) per iteration, where n is the number of nodes. Using the paper "FADE: Graph Drawing, Clustering, and Visual Abstraction" as a rough guide, in a few seconds you can expect to draw at most 1,000 nodes with a standard n² per iteration technique, and 100,000 with a n*log(n) per iteration technique.
- Poor local minima: force-directed algorithms produce a layout whose total energy is a local minimum, not necessarily a global one. The local minimum found can be, in many cases, considerably worse than a global minimum, which translates into a low-quality drawing. For many algorithms, especially the ones that allow only down-hill moves of the vertices, the final result can be strongly influenced by the initial layout, which in most cases is randomly generated. The problem of poor local minima becomes more important as the number of vertices of the graph increases.
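One simple way to visit only neighboring vertices is to bucket the nodes into a uniform grid and ignore repulsion beyond a cutoff distance. The sketch below shows that idea only; it is not the FADE method and not a tree-based n*log(n) scheme, and the function name and the cutoff parameter are assumptions. Dropping long-range repulsion is an approximation.

```python
import math
from collections import defaultdict

def nearby_pairs(positions, cutoff):
    """Yield each index pair (i, j), i < j, whose nodes lie within `cutoff`.

    Nodes are bucketed into square cells of side `cutoff`, so only the
    3 x 3 block of cells around a node has to be searched.
    """
    grid = defaultdict(list)
    for index, (x, y) in enumerate(positions):
        grid[(math.floor(x / cutoff), math.floor(y / cutoff))].append(index)
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j and math.dist(positions[i], positions[j]) <= cutoff:
                            yield (i, j)

# Usage: only the two close pairs are reported, the far-apart ones are skipped.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1)]
close = sorted(nearby_pairs(points, 1.0))
```

When the nodes are spread fairly evenly, each cell holds only a few of them, so the work per iteration grows roughly in proportion to the number of nodes rather than its square.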
Pseudo Code
Each node has a position (x, y), a velocity (dx, dy) and a mass m. There is usually a spring constant, s, and a damping factor: 0 < damping < 1. The force toward and away from nodes is calculated according to Hooke's law and Coulomb's law or similar, as discussed above.
set up initial node velocities to (0,0)
set up initial node positions randomly // make sure no 2 nodes are in exactly the same position
loop
    total_kinetic_energy := 0 // running sum of total kinetic energy over all particles
    for each node
        net_force := (0, 0) // running sum of total force on this particular node
        for each other node
            net_force := net_force + Coulomb_repulsion( this_node, other_node )
        next node
        for each spring connected to this node
            net_force := net_force + Hooke_attraction( this_node, spring )
        next spring
        // without damping, it moves forever
        this_node.velocity := (this_node.velocity + timestep * net_force) * damping
        this_node.position := this_node.position + timestep * this_node.velocity
        total_kinetic_energy := total_kinetic_energy + (this_node.velocity)^2
    next node
until total_kinetic_energy is less than some small number // the simulation has stopped moving
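The pseudo-code above can be turned into a runnable sketch along the following lines. This Python version assumes unit mass for every node, two-dimensional positions, and illustrative default values for stiffness, rest_length, charge, damping and timestep; none of these names or values come from the pseudo-code itself. As in the pseudo-code, each node is moved as soon as its force has been computed.

```python
import math
import random

def force_directed_layout(n, edges, *, stiffness=0.1, rest_length=1.0,
                          charge=1.0, damping=0.85, timestep=0.1,
                          tolerance=1e-4, max_iterations=10_000, seed=0):
    """Spring-and-charge layout of nodes 0..n-1; returns a list of (x, y)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    vel = [[0.0, 0.0] for _ in range(n)]
    neighbours = [[] for _ in range(n)]
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    for _ in range(max_iterations):
        kinetic = 0.0  # sum of squared speeds (unit mass assumed)
        for i in range(n):
            fx = fy = 0.0
            for j in range(n):  # Coulomb-style repulsion from every other node
                if j == i:
                    continue
                dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
                dist = math.hypot(dx, dy) or 1e-9  # guard against coincident nodes
                push = charge / dist ** 2
                fx += push * dx / dist
                fy += push * dy / dist
            for j in neighbours[i]:  # Hooke-style attraction along each edge
                dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                pull = stiffness * (dist - rest_length)
                fx += pull * dx / dist
                fy += pull * dy / dist
            # Without damping, the system would keep moving forever.
            vel[i][0] = (vel[i][0] + timestep * fx) * damping
            vel[i][1] = (vel[i][1] + timestep * fy) * damping
            pos[i][0] += timestep * vel[i][0]
            pos[i][1] += timestep * vel[i][1]
            kinetic += vel[i][0] ** 2 + vel[i][1] ** 2
        if kinetic < tolerance:
            break  # the simulation has effectively stopped moving
    return [tuple(p) for p in pos]

# Usage: lay out a triangle; its three edges should end up nearly equal in length.
triangle = [(0, 1), (1, 2), (2, 0)]
layout = force_directed_layout(3, triangle, tolerance=1e-10)
```

The max_iterations cap is a safeguard added for this sketch, so that the loop ends even if the kinetic-energy threshold is never reached.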
References
[1] de Leeuw, J. "Convergence of the majorization method for multidimensional scaling", Journal of Classification 5(2), Springer New York, pp. 163-180, 1988.
Giuseppe Di Battista, Peter Eades, Roberto Tamassia, Ioannis G. Tollis. Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall, 1999.
Michael Kaufmann and Dorothea Wagner, editors. Drawing graphs: methods and models, volume 2025 of Lecture Notes in Computer Science. Springer-Verlag, 2001.