Connected Component Labeling

From Wikipedia, the free encyclopedia

Connected component labeling (alternatively connected component analysis) is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected component labeling is not to be confused with segmentation.

Connected component labeling is used in computer vision to detect unconnected regions in binary digital images, although color images and data with higher-dimensionality can also be processed.^[1]^[2] When integrated into an image recognition system or human-computer interaction interface, connected component labeling can operate on a variety of information.^[3]^[4]

1 Overview
2 Algorithms
- 2.1 Two-pass
- 2.2 Others
3 See also
4 References

[edit] Overview

4-connectivity

8-connectivity

A graph, containing vertices and connecting edges, is constructed from relevant input data. The vertices contain information required by the comparison heuristic, while the edges indicate connected 'neighbors'. An algorithm traverses the graph, labeling the vertices based on the connectivity and relative values of their neighbors. Connectivity is determined by the medium; image graphs, for example, can be 4-connected or 8-connected.^[5]

Following the labeling stage, the graph may be partitioned into subsets, after which the original information can be recovered and processed.

[edit] Algorithms

The algorithms discussed can be generalized to arbitrary dimensions, albeit with increased time and space complexity.

[edit] Two-pass

Relatively simple to implement and understand, the two-pass algorithm^[6] iterates through 2-dimensional, binary data. The first pass uniquely labels non-background regions, and a second pass merges labels on any regions that are spatially connected.

The input data can be modified in situ (which carries the risk of data corruption), or labeling information can be maintained in an additional data structure.

On the first pass:

Iterate through each element of the data by column, then by row
If the element is not the background
1. Get the neighboring elements with the current element's label
2. If there are no neighbors, uniquely label the current element and continue
3. Otherwise, find the neighbors' smallest label and assign it to the current element
4. Store the equivalence between neighboring labels

On the second pass:

Iterate through each element of the data by column, then by row
If the element is not the background
1. Relabel the element with the lowest equivalent label

Here, the background is a classification, specific to the data, used to distinguish salient elements from the foreground. If the background variable is omitted, then the two-pass algorithm will treat the background as another region.

Sample graphical output from running the two-pass algorithm on a binary image. The first image is unprocessed, while the last one has been recolored with label information. Darker hues indicate the neighbors of the pixel being processed.

The pseudocode is as follows:

algorithm TwoPass(data)
   linked = []
   labels = structure with dimensions of data, initialized with the value of Background
   
   First pass
   
   for row in data:
       for column in row:
           if data[row][col] is not Background
               
               neighbours = connected elements with the current element's label
               
               if neighbours is empty
                   linked[NextLabel] = set containing NextLabel                    
                   labels[row][column] = NextLabel
                   NextLabel += 1
               
               else
                   
                   Find the smallest label
                   
                   L = neighbours' labels
                   labels[row][column] = min(L)
                   for label in L
                       linked[label] = union(linked[label], L)
   
   Second pass
   
   for row in data
       for column in row
           if labels[row][column] is not Background               
               labels[row][col] = find(labels[row][col])     
      
   return labels

The find and union algorithms are implemented as described in union find.

[edit] Others

Some of the steps present in the two-pass algorithm can be merged for efficiency, allowing for a single sweep through the image. Multi-pass algorithms also exist, some of which run in linear time relative to the number of image pixels.^[7]

In the early 1990s, there was considerable interest in parallelizing connected component algorithms in image analysis applications, due to the bottleneck of sequentially processing each pixel.^[8].

[edit] See also

[edit] References

^ H. Samet and M. Tamminen (1988). Efficient Component Labeling of Images of Arbitrary Dimension Represented by Linear Bintrees. TIEEE Trans. Pattern Anal. Mach. Intell..
^ Michael B. Dillencourt and Hannan Samet and Markku Tamminen (1992). A general approach to connected-component labeling for arbitrary image representations}. J. ACM.
^ Weijie Chen, Maryellen L. Giger and Ulrich Bick (2006). A Fuzzy C-Means (FCM)-Based Approach for Computerized Segmentation of Breast Lesions in Dynamic Contrast-Enhanced MR Images. Academic Radiology.
^ Kesheng Wu, Wendy Koegler, Jacqueline Chen and Arie Shoshani (2003). Using Bitmap Index for Interactive Exploration of Large Datasets. SSDBM.
^ R. Fisher, S. Perkins, A. Walker and E. Wolfart (2003). Connected Component Labeling.
^ Shapiro, L., and Stockman, G. (2002). Computer Vision. Prentice Hall, 69-73.
^ Kenji Suzuki and Isao Horiba and Noboru Sugie (2003). Linear-time connected-component labeling based on sequential local operations. Comput. Vis. Image Underst..
^ Yujie Han and Robert A. Wagner (1990). An efficient and fast parallel-connected component algorithm. J. ACM.

Categories: Image analysis | Computer vision

Connected Component Labeling

From Wikipedia, the free encyclopedia

Contents

[edit] Overview

[edit] Algorithms

[edit] Two-pass

[edit] Others

[edit] See also

[edit] References

Views

Navigation

Interaction

Search