Digital image processing: p043 Graph Cuts
In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as image objects). More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics. The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image (see edge detection).
Each pixel in a region is similar to the others with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions differ significantly with respect to the same characteristic(s). Several general-purpose algorithms and techniques have been developed for image segmentation.
To be useful, these techniques must typically be combined with domain-specific knowledge in order to solve the domain's segmentation problems effectively.
The simplest method of image segmentation is thresholding. This method uses a clip level (a threshold value) to turn a grayscale image into a binary image.
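Thresholding can be sketched in a few lines. The example below is a minimal illustration, assuming an 8-bit grayscale image stored as a NumPy `uint8` array; the function names `otsu_threshold` and `threshold_image` are hypothetical. The threshold is chosen by maximizing the between-class variance (Otsu's criterion), which is only one of several possible selection rules.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes between-class variance.

    Assumes `gray` is a uint8 array; class 0 is "<= t", class 1 is "> t".
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                  # weight of class 0 for each t
    w1 = 1.0 - w0                         # weight of class 1 for each t
    cum_mean = np.cumsum(prob * levels)   # unnormalized mean of class 0
    total_mean = cum_mean[-1]
    # Between-class variance; candidates with an empty class are set to 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

def threshold_image(gray, t):
    """Binary image: True where the pixel is brighter than the threshold."""
    return gray > t
```

A fixed, hand-picked value can be passed to `threshold_image` instead when the clip level is known in advance.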
The key to this method is selecting the threshold value (or values, when multiple levels are used). Several popular methods are used in industry, including the maximum entropy method, balanced histogram thresholding, Otsu's method (maximum variance), and k-means clustering. Recently, methods have been developed for thresholding computed tomography (CT) images.
The key idea is that, unlike Otsu's method, the thresholds are derived from the radiographs instead of the reconstructed image. Newer methods propose multi-dimensional, fuzzy rule-based, non-linear thresholds.
In these works, the decision on each pixel's membership in a segment is based on multi-dimensional rules derived from fuzzy logic and evolutionary algorithms, taking into account the image lighting environment and the application.
The K-means algorithm is an iterative technique that is used to partition an image into K clusters. It repeatedly assigns each pixel to the nearest cluster center and recomputes each center as the mean of its assigned pixels. In this case, distance is the squared or absolute difference between a pixel and a cluster center.
The difference is typically based on pixel color, intensity, texture, and location, or a weighted combination of these factors. K can be selected manually, randomly, or by a heuristic. This algorithm is guaranteed to converge, but it may not return the optimal solution.
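The K-means procedure can be sketched as follows. This is a minimal illustration, not a tuned implementation: the function name `kmeans_segment` is hypothetical, the distance is the squared difference of pixel values only (no texture or location terms), and the initial centers are drawn at random from the distinct pixel values, which is one possible heuristic.

```python
import numpy as np

def kmeans_segment(image, k, n_iter=20, seed=0):
    """Cluster pixel values into k groups; returns (label image, centers).

    Works on grayscale (H, W) or color (H, W, C) arrays. The full distance
    matrix is held in memory, so this sketch suits small images only.
    """
    if image.ndim == 2:
        pixels = image.reshape(-1, 1).astype(float)
    else:
        pixels = image.reshape(-1, image.shape[-1]).astype(float)

    # Assumption: initialize from distinct pixel values (requires k <= their count).
    unique = np.unique(pixels, axis=0)
    rng = np.random.default_rng(seed)
    centers = unique[rng.choice(len(unique), size=k, replace=False)]

    for _ in range(n_iter):
        # Squared distance from every pixel to every center.
        dist = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):              # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                         # assignments have stabilized
        centers = new_centers

    return labels.reshape(image.shape[:2]), centers
```

Running the function with different `seed` values shows how the result depends on the initial centers.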
The quality of the solution depends on the initial set of clusters and the value of K.
Motion-based segmentation is a technique that relies on motion in the image to perform segmentation. The idea is simple: look at the differences between a pair of images.
Assuming the object of interest is moving, the difference highlights that object. Improving on this idea, Kenney et al. proposed interactive segmentation: they use a robot to poke objects in order to generate the motion signal necessary for motion-based segmentation. Interactive segmentation follows the interactive perception framework proposed by Dov Katz and Oliver Brock.
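The frame-differencing idea behind motion-based segmentation can be sketched as below. This is a minimal illustration assuming two aligned frames of equal shape from a static camera; the function name `motion_mask` and the `min_change` parameter are hypothetical.

```python
import numpy as np

def motion_mask(frame_a, frame_b, min_change=25):
    """Mark pixels whose value changed by more than `min_change` between frames."""
    # Widen the dtype first so that uint8 subtraction cannot wrap around.
    diff = np.abs(frame_a.astype(np.int32) - frame_b.astype(np.int32))
    if diff.ndim == 3:
        diff = diff.max(axis=2)   # color input: take the largest channel change
    return diff > min_change
```

Note that a plain difference marks both the old and the new position of a moving object, so the mask usually needs further processing before it outlines the object itself.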
Compression-based methods postulate that the optimal segmentation is the one that minimizes, over all possible segmentations, the coding length of the data. The method describes each segment by its texture and boundary shape. Each of these components is modeled by a probability distribution function, from which its coding length is computed. For any given segmentation of an image, this scheme yields the number of bits required to encode that image based on the given segmentation.
Thus, among all possible segmentations of an image, the goal is to find the segmentation which produces the shortest coding length. This can be achieved by a simple agglomerative clustering method. The distortion in the lossy compression determines the coarseness of the segmentation and its optimal value may differ for each image.
This parameter can be estimated heuristically from the contrast of textures in an image. For example, when the textures in an image are similar, such as in camouflage images, stronger sensitivity and thus lower quantization is required.
Histogram-based methods are very efficient compared to other image segmentation methods because they typically require only one pass through the pixels.
In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image. A refinement of this technique is to recursively apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This operation is repeated with smaller and smaller clusters until no more clusters are formed. One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image.
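One way to locate peaks and valleys is sketched below. This is a minimal illustration assuming an 8-bit grayscale `uint8` image; the function names, the box-filter smoothing, and the `min_height_frac` cutoff for significant peaks are all assumptions made for the example rather than part of any particular published method.

```python
import numpy as np

def histogram_valleys(gray, smooth=5, min_height_frac=0.1):
    """Return gray levels of the valleys between significant histogram peaks."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    # Box-filter smoothing to suppress small spurious peaks.
    smoothed = np.convolve(hist, np.ones(smooth) / smooth, mode="same")
    min_height = min_height_frac * smoothed.max()

    inner = smoothed[1:-1]
    rising = inner > smoothed[:-2]        # strictly higher than the left neighbor
    not_falling = inner >= smoothed[2:]   # at least as high as the right neighbor
    peaks = np.flatnonzero(rising & not_falling & (inner >= min_height)) + 1

    # One valley per pair of consecutive peaks: the lowest bin between them.
    return [int(p0 + np.argmin(smoothed[p0:p1 + 1]))
            for p0, p1 in zip(peaks[:-1], peaks[1:])]

def segment_by_valleys(gray, valleys):
    """Label each pixel by the histogram interval its gray level falls into."""
    return np.digitize(gray, np.asarray(valleys, dtype=float))
```

The recursive refinement described above would apply `histogram_valleys` again to the pixels of each resulting cluster.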
Histogram-based approaches can also be quickly adapted to apply to multiple frames, while maintaining their single pass efficiency.
The histogram can be computed in several ways when multiple frames are considered. The same approach that is taken with one frame can be applied to multiple frames, and after the results are merged, peaks and valleys that were previously difficult to identify are more likely to be distinguishable.
The histogram can also be applied on a per-pixel basis where the resulting information is used to determine the most frequent color for the pixel location. This approach segments based on active objects and a static environment, resulting in a different type of segmentation useful in video tracking.
Edge detection is a well-developed field in its own right within image processing.
Region boundaries and edges are closely related, since there is often a sharp change in intensity at the region boundaries. Edge detection techniques have therefore been used as the basis of another segmentation technique.
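The link between edges and sharp intensity changes can be illustrated with a gradient-magnitude detector. The sketch below uses the Sobel operator, a common choice that the text does not prescribe; the function name `sobel_edges` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def sobel_edges(gray, threshold):
    """Mark pixels whose Sobel gradient magnitude exceeds `threshold`."""
    g = np.pad(gray.astype(float), 1, mode="edge")  # replicate border pixels
    # Horizontal derivative: right column minus left column, center row weighted 2.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    # Vertical derivative: bottom row minus top row, center column weighted 2.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    return np.hypot(gx, gy) > threshold
```

The resulting edge map is typically fragmented, so it is a starting point for segmentation rather than a finished set of region boundaries.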
The edges identified by edge detection are often disconnected. To segment an object from an image, however, one needs closed region boundaries. The desired edges are the boundaries between such objects or spatial-taxons. Spatial-taxons are information granules consisting of a crisp pixel region, stationed at abstraction levels within a hierarchical nested scene architecture.
They are similar to the Gestalt psychological designation of figure-ground, but are extended to include foreground, object groups, objects and salient object parts. Edge detection methods can be applied to the spatial-taxon region, in the same manner they would be applied to a silhouette.
This method is particularly useful when the disconnected edge is part of an illusory contour. Segmentation methods can also be applied to edges obtained from edge detectors.
Lindeberg and Li developed an integrated method that segments edges into straight and curved edge segments for parts-based object recognition. It is based on a minimum description length (MDL) criterion optimized by a split-and-merge-like method, with candidate breakpoints obtained from complementary junction cues to obtain more likely points at which to consider partitions into different segments.
Another method combines three characteristics of the image: the partition of the image based on histogram analysis is checked by high compactness of the clusters (objects) and high gradients of their borders.
The first space makes it possible to measure how compactly the brightness of the image is distributed by calculating a minimal clustering kmin. The bitmap b is an object in dual space. On that bitmap, a measure has to be defined that reflects how compactly the black (or white) pixels are distributed. So, the goal is to find objects with good borders. The maximum of MDC defines the segmentation.
Region-growing methods rely mainly on the assumption that the neighboring pixels within one region have similar values.
The common procedure is to compare one pixel with its neighbors. If a similarity criterion is satisfied, the pixel can be set to belong to the same cluster as one or more of its neighbors. The selection of the similarity criterion is significant and the results are influenced by noise in all instances.
The method of Statistical Region Merging (SRM) starts by building the graph of pixels using 4-connectedness, with edges weighted by the absolute value of the intensity difference.
Initially, each pixel forms a single-pixel region. SRM then sorts those edges in a priority queue and decides whether or not to merge the current regions belonging to the edge pixels using a statistical predicate.
One region-growing method is the seeded region growing method. This method takes a set of seeds as input along with the image.
The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparison of all unallocated neighboring pixels to the regions. The difference between a pixel's intensity value and the region's mean is used as a measure of similarity. The pixel with the smallest difference measured in this way is assigned to the respective region. This process continues until all pixels are assigned to a region.
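Seeded region growing can be sketched with a priority queue. This is a minimal illustration under stated assumptions: the function name `seeded_region_growing` is hypothetical, similarity is the absolute difference between a pixel's intensity and the region mean, neighbors are 4-connected, and each priority is computed when the candidate is queued and not refreshed as the region mean changes.

```python
import heapq
import numpy as np

def seeded_region_growing(gray, seeds):
    """Grow labeled regions from seeds.

    `seeds` maps a positive integer label to a list of (row, col) seed pixels.
    Returns an integer label image (0 would mean "never reached").
    """
    img = gray.astype(float)
    h, w = img.shape
    labels = np.zeros((h, w), dtype=int)
    sums, counts, heap = {}, {}, []

    def push_neighbors(r, c, lab):
        # Queue unallocated 4-neighbors, keyed by distance to the region mean.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] == 0:
                delta = abs(img[rr, cc] - sums[lab] / counts[lab])
                heapq.heappush(heap, (float(delta), rr, cc, lab))

    for lab, points in seeds.items():
        for r, c in points:
            labels[r, c] = lab
        sums[lab] = float(sum(img[r, c] for r, c in points))
        counts[lab] = len(points)
    for lab, points in seeds.items():
        for r, c in points:
            push_neighbors(r, c, lab)

    while heap:
        _, r, c, lab = heapq.heappop(heap)   # candidate with the smallest difference
        if labels[r, c] != 0:
            continue                         # already claimed by some region
        labels[r, c] = lab
        sums[lab] += img[r, c]
        counts[lab] += 1
        push_neighbors(r, c, lab)
    return labels
```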
Because seeded region growing requires seeds as additional input, the segmentation results are dependent on the choice of seeds, and noise in the image can cause the seeds to be poorly placed. Another region-growing method is the unseeded region growing method.
It is a modified algorithm that does not require explicit seeds. At each iteration it considers the neighboring pixels in the same way as seeded region growing. One variant of this technique, proposed by Haralick and Shapiro, is based on pixel intensities. The mean and scatter of the region and the intensity of the candidate pixel are used to compute a test statistic. If the test statistic is sufficiently small, the pixel is added to the region, and the region's mean and scatter are recomputed. Otherwise, the pixel is rejected, and is used to form a new region.
Another region-growing variant is based on pixel intensities and neighborhood-linking paths. A degree of connectivity (connectedness) is calculated based on a path that is formed by pixels.
Split-and-merge segmentation is based on a quadtree partition of an image. It is sometimes called quadtree segmentation. This method starts at the root of the tree that represents the whole image. If it is found non-uniform (not homogeneous), then it is split into four child squares (the splitting process), and so on.
If, in contrast, four child squares are homogeneous, they are merged as several connected components (the merging process). The node in the tree is a segmented node. This process continues recursively until no further splits or merges are possible.
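The splitting phase of quadtree segmentation can be sketched recursively. This is a minimal illustration: the function name `quadtree_split` is hypothetical, homogeneity is defined here as "intensity range at most `max_range`" (an assumption, since the text does not fix a criterion), and the merging phase is omitted.

```python
import numpy as np

def quadtree_split(gray, max_range=10, min_size=1):
    """Split the image into homogeneous blocks; returns (row, col, height, width) tuples."""
    blocks = []

    def split(r, c, h, w):
        block = gray[r:r + h, c:c + w]
        # Assumed homogeneity test: small spread between darkest and brightest pixel.
        homogeneous = int(block.max()) - int(block.min()) <= max_range
        if homogeneous or h <= min_size or w <= min_size:
            blocks.append((r, c, h, w))      # leaf node of the quadtree
            return
        h2, w2 = h // 2, w // 2
        split(r, c, h2, w2)                          # top-left child
        split(r, c + w2, h2, w - w2)                 # top-right child
        split(r + h2, c, h - h2, w2)                 # bottom-left child
        split(r + h2, c + w2, h - h2, w - w2)        # bottom-right child

    split(0, 0, *gray.shape)
    return blocks
```

A merging pass would then join adjacent leaves whose union still satisfies the same homogeneity test.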
Using a partial differential equation (PDE)-based method and solving the PDE by a numerical scheme, one can segment the image. The central idea is to evolve an initial curve towards the lowest potential of a cost function, whose definition reflects the task to be addressed. As for most inverse problems, the minimization of the cost functional is non-trivial and imposes certain smoothness constraints on the solution, which in the present case can be expressed as geometrical constraints on the evolving curve.
Lagrangian techniques are based on parameterizing the contour according to some sampling strategy and then evolving each element according to image and internal terms. Such techniques are fast and efficient; however, the original "purely parametric" formulation due to Kass, Witkin and Terzopoulos, known as "snakes", is generally criticized for its limitations regarding the choice of sampling strategy, the internal geometric properties of the curve, topology changes (curve splitting and merging), addressing problems in higher dimensions, etc.
Nowadays, efficient "discretized" formulations have been developed to address these limitations while maintaining high efficiency.