Graph cuts in computer vision

As applied in the field of computer vision, graph cuts can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision[1]), such as image smoothing, the stereo correspondence problem, and many other computer vision problems that can be formulated in terms of energy minimization. Such energy minimization problems can be reduced to instances of the maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g., normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms).

"Binary" problems (such as denoising a binary image) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a grayscale image) cannot be solved exactly, but solutions produced are usually near the global optimum.

History

The theory of graph cuts was first applied in computer vision in the seminal paper by Greig, Porteous and Seheult[2] of Durham University. In the Bayesian statistical context of smoothing noisy (or corrupted) images, they showed how the maximum a posteriori estimate of a binary image can be obtained exactly by maximizing the flow through an associated image network, involving the introduction of a source and sink. The problem was therefore shown to be efficiently solvable. Prior to this result, approximate techniques such as simulated annealing (as proposed by the Geman brothers[3]), or iterated conditional modes (a type of greedy algorithm as suggested by Julian Besag)[4] were used to solve such image smoothing problems.

Although the general k-colour problem remains unsolved for k > 2, the approach of Greig, Porteous and Seheult[2] has turned out[5][6] to have wide applicability in general computer vision problems. Greig, Porteous and Seheult approaches are often applied iteratively to a sequence of binary problems, usually yielding near optimal solutions.

Notations

Existing methods

  1. First step optimizes over the color parameters using K-means.
  2. Second step performs the usual graph cuts algorithm.
These 2 steps are repeated recursively until convergence.

Energy function

Pr(x|S) = K(-E) where the energy E is composed of 2 different models (E_{\rm color} and E_{\rm coherence}):

Likelihood / Color model / Regional term

E_{\rm color} — unary term describing the likelihood of each color.

Histogram

GMM (Gaussian Mixture Model)

Texon

  1. Determine a good natural scale for the texture elements.
  2. Compute non-parametric statistics of the model-interior texons, either on intensity or on Gabor filter responses.

Prior / Coherence model / Boundary term

E_{\rm coherence} — binary term describing the coherence between neighborhood pixels.

References

Criticism

Graph cuts methods have become popular alternatives to the level set-based approaches for optimizing the location of a contour (see[7] for an extensive comparison). However, graph cut approaches have been criticized in the literature for several issues:

Algorithm

Implementation

The Wikibook Algorithm Implementation has a page on the topic of: Graphs/Maximum flow/Boykov & Kolmogorov

Boykov & Kolmogorov[15] published an efficient way to compute the max-flow for computer vision related graph.

Software

References

  1. Adelson, Edward H., and James R. Bergen (1991), "The plenoptic function and the elements of early vision", Computational models of visual processing 1.2 (1991).
  2. 2.0 2.1 D.M. Greig, B.T. Porteous and A.H. Seheult (1989), Exact maximum a posteriori estimation for binary images, Journal of the Royal Statistical Society Series B, 51, 271–279.
  3. D. Geman and S. Geman (1984), Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell., 6, 721–741.
  4. J.E. Besag (1986), On the statistical analysis of dirty pictures (with discussion), Journal of the Royal Statistical Society Series B, 48, 259–302
  5. Y. Boykov, O. Veksler and R. Zabih (1998), "Markov Random Fields with Efficient Approximations", International Conference on Computer Vision and Pattern Recognition (CVPR).
  6. 6.0 6.1 Y. Boykov, O. Veksler and R. Zabih (2001), "Fast approximate energy minimisation via graph cuts", IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1222–1239.
  7. Leo Grady and Christopher Alvino (2009), "The Piecewise Smooth Mumford-Shah Functional on an Arbitrary Graph", IEEE Trans. on Image Processing, pp. 2547-2561
  8. Yuri Boykov and Vladimir Kolmogorov (2003), "Computing Geodesics and Minimal Surfaces via Graph Cuts", Proc. of ICCV
  9. Ben Appleton and Hugues Talbot (2006), "Globally Minimal Surfaces by Continuous Maximal Flows", IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 106-118
  10. Ali Kemal Sinop and Leo Grady, "A Seeded Image Segmentation Framework Unifying Graph Cuts and Random Walker Which Yields A New Algorithm", Proc. of ICCV, 2007
  11. Vladimir Kolmogorov and Yuri Boykov (2005), "What Metrics Can Be Approximated by Geo-Cuts, or Global Optimization of Length/Area and Flux", Proc. of ICCV pp. 564-571
  12. Nicolas Lermé, François Malgouyres and Lucas Létocart (2010), "Reducing graphs in graph cut segmentation", Proc. of ICIP, pp. 3045-3048
  13. Herve Lombaert, Yiyong Sun, Leo Grady, Chenyang Xu (2005), "A Multilevel Banded Graph Cuts Method for Fast Image Segmentation", Proc. of ICCV, pp. 259-265
  14. Yin Li, Jian Sun, Chi-Keung Tang, and Heung-Yeung Shum (2004), "Lazy Snapping", ACM Transactions on Graphics, pp. 303-308
  15. Yuri Boykov, Vladimir Kolmogorov: An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. IEEE Trans. Pattern Anal. Mach. Intell. 26(9): 1124-1137 (2004)

Further reading