In this chapter, you will learn:
- Marker-based image segmentation using the watershed algorithm
- Function: cv2.watershed()
Any grayscale image can be viewed as a topographic surface in which high-intensity pixels are peaks and low-intensity pixels are valleys. Start filling every isolated valley (local minimum) with water of a different color (a label). As the water level rises, water from different valleys, and therefore of different colors, starts to merge, depending on the nearby peaks (gradients). To prevent this, build barriers at the locations where the water would merge. Keep filling water and building barriers until all the peaks are under water. The barriers you built give the segmentation result. This is the idea behind the watershed algorithm.
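The flooding idea can be sketched on a one-dimensional terrain. The function below is a toy illustration written for this text, not OpenCV's implementation; the name flood_1d is hypothetical, and it assumes a terrain without flat plateaus.

```python
def flood_1d(heights):
    """Label a 1-D terrain by immersion; -1 marks a barrier between basins."""
    n = len(heights)
    labels = [0] * n          # 0 = still dry
    next_label = 1
    # Visit pixels from the lowest to the highest, like a rising water level.
    for i in sorted(range(n), key=lambda k: heights[k]):
        neighbours = {labels[j] for j in (i - 1, i + 1)
                      if 0 <= j < n and labels[j] > 0}
        if not neighbours:            # a new valley starts to fill
            labels[i] = next_label
            next_label += 1
        elif len(neighbours) == 1:    # water from a single basin reaches this pixel
            labels[i] = neighbours.pop()
        else:                         # two basins would merge here: build a barrier
            labels[i] = -1
    return labels


terrain = [5, 2, 4, 1, 6, 3, 7, 2, 5]
print(flood_1d(terrain))   # [2, 2, -1, 1, -1, 4, -1, 3, 3]
```

Every local minimum becomes its own basin, so this short terrain already splits into four regions separated by three barriers.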
However, this approach produces over-segmented results because of noise and other irregularities in the image. OpenCV therefore implements a marker-based watershed algorithm in which you specify which valley points are to be merged and which are not. It is an interactive image segmentation. We give different labels to the objects we know: label the region we are sure is foreground or object with one color (or intensity), label the region we are sure is background or non-object with another color, and label the region we are not sure about with 0. That is our marker. Then the watershed algorithm is applied. It updates the marker with the labels we gave, and the boundaries of the objects get the value -1.
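To see how markers limit the number of regions, here is a toy marker-based flood on a one-dimensional terrain. It is only a sketch of the principle, not the code OpenCV runs; marker_flood_1d is a hypothetical name, and the seeds are chosen by hand.

```python
import heapq


def marker_flood_1d(heights, markers):
    """Grow the given seeds over a 1-D terrain; -1 marks where two labels meet."""
    n = len(heights)
    labels = list(markers)            # >0 = seed label, 0 = unknown
    heap = []
    # Queue every unknown pixel that touches a seed, keyed by its height.
    for i in range(n):
        if labels[i] > 0:
            for j in (i - 1, i + 1):
                if 0 <= j < n and labels[j] == 0:
                    heapq.heappush(heap, (heights[j], j))
    while heap:
        _, i = heapq.heappop(heap)    # lowest queued pixel floods first
        if labels[i] != 0:
            continue                  # already decided
        near = {labels[j] for j in (i - 1, i + 1)
                if 0 <= j < n and labels[j] > 0}
        labels[i] = near.pop() if len(near) == 1 else -1
        if labels[i] > 0:             # keep spreading from newly labelled pixels
            for j in (i - 1, i + 1):
                if 0 <= j < n and labels[j] == 0:
                    heapq.heappush(heap, (heights[j], j))
    return labels


terrain = [5, 2, 4, 1, 6, 3, 7, 2, 5]
seeds = [0, 1, 0, 0, 0, 0, 0, 2, 0]   # two hand-placed markers, 0 = unknown
print(marker_flood_1d(terrain, seeds))  # [1, 1, 1, 1, 1, 1, -1, 2, 2]
```

With only two seeds the terrain ends up with two regions, and the single barrier sits on the highest ridge between them.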
Next, you'll see an example of how to use the distance transform together with watershed to segment objects that touch each other.
Consider the coin image below, in which the coins touch each other. Even after thresholding, they still touch each other.
Start by finding an approximate estimate of the coins. Otsu's binarization can be used for that.
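For intuition about what Otsu's binarization does, the sketch below picks the threshold that maximises the between-class variance of the histogram. It is a plain NumPy illustration on a synthetic image, assuming dark objects on a bright background; otsu_threshold is a hypothetical helper, and cv2.threshold may report a slightly different value on real images.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the level t that maximises between-class variance (<= t vs > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = prob[:t + 1].sum()               # weight of the dark class
        w1 = 1.0 - w0                         # weight of the bright class
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t + 1] * prob[:t + 1]).sum() / w0
        mu1 = (levels[t + 1:] * prob[t + 1:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Synthetic stand-in for the coin photo: a dark square on a bright background.
gray = np.full((50, 50), 200, np.uint8)
gray[10:30, 10:30] = 60

t = otsu_threshold(gray)
# Same polarity as THRESH_BINARY_INV: dark pixels become white (255).
mask = np.where(gray > t, 0, 255).astype(np.uint8)
print(t, mask[15, 15], mask[0, 0])
```

The inverted polarity matters here because the later steps treat white pixels as the objects.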
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# ret is the threshold chosen by Otsu's method and thresh is the binary result
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

cv2.imshow('coins', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
Next, remove the small white noise in the image; morphological opening can be used for that. To remove any small holes inside the objects, use morphological closing. We now know for sure that the region near the center of each object is foreground and the region far away from the objects is background. The only region we are unsure about is the boundary region of the coins.
So we need to extract the area that we are sure is coin. Erosion removes the boundary pixels, so whatever remains is certainly coin. That would work if the objects were not touching each other. Since they are touching, a better option is to compute the distance transform and apply a suitable threshold. Next, we need to find the area that we are sure is not coin. For that, dilate the result. Dilation grows the object boundaries into the background, so whatever is still background after dilation is certainly background, because the boundary region has been removed from it.
The remaining regions are the ones we have no idea about, whether they are coin or background. The watershed algorithm should decide them. These regions normally lie around the coin boundaries, where foreground and background meet (or where two different coins meet). We call this the border. It can be obtained by subtracting the sure_fg area from the sure_bg area.
import cv2
import numpy as np
from matplotlib import pyplot as plt

# noise removal (thresh comes from the Otsu step above)
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)

# sure background area
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, cv2.THRESH_BINARY)

# finding unknown region
sure_fg = np.uint8(sure_fg)
unknow = cv2.subtract(sure_bg, sure_fg)

plt.subplot(121)
plt.imshow(dist_transform, cmap='gray')
plt.title('distance transform')
plt.xticks([])
plt.yticks([])
plt.subplot(122)
# show the thresholded distance transform, i.e. the sure foreground
plt.imshow(sure_fg, cmap='gray')
plt.title('threshold')
plt.xticks([])
plt.yticks([])
plt.show()
Look at the result. In the thresholded image, we get some regions that we are sure are coins, and they are now detached from each other. (In some cases you may only be interested in foreground segmentation, not in separating objects that touch each other. In that case you don't need the distance transform; erosion alone is enough. Erosion is just another way to extract the sure foreground area.)
Now we know for sure which regions are coins and which are background. So we create a marker (an array of the same size as the original image, but with the int32 data type) and label the regions inside it. The regions we know for sure (whether foreground or background) are labelled with positive integers, a different integer for each region, and the region we are unsure about is left as zero. For this we use cv2.connectedComponents(). It labels the background of the image with 0, and the other objects are labelled with integers starting from 1.
But if the background is labelled 0, watershed will treat it as an unknown area. So we want to label it with a different integer, and instead label the unknown region, defined by unknow, with 0.
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)

# Add one to all labels so that sure background is not 0, but 1
markers = markers + 1

# Now, mark the region of unknown with zero
markers[unknow == 255] = 0

plt.imshow(markers, cmap='jet')
plt.xticks([])
plt.yticks([])
plt.show()
See the result shown with the JET colormap. The dark blue region is the unknown region. Each sure coin region gets a different color. The remaining area, which is sure background, is shown in a lighter blue than the unknown region.
The marker is now ready. It is time for the final step: apply watershed. The marker image is then modified, and the boundary region is marked with -1.
void watershed( InputArray image, InputOutputArray markers );
In this C++ declaration, the first parameter, image, must be an 8-bit 3-channel color image and needs no further explanation. The key is the second parameter, markers, a 32-bit single-channel image of the same size as image. The official OpenCV documentation describes it as follows:
Before passing the image to the function, you have to roughly outline the desired regions in the image markers with positive (>0) indices. So, every region is represented as one or more connected components with the pixel values 1, 2, 3, and so on. Such markers can be retrieved from a binary mask using findContours() and drawContours(). The markers are "seeds" of the future image regions. All the other pixels in markers , whose relation to the outlined regions is not known and should be defined by the algorithm, should be set to 0's. In the function output, each pixel in markers is set to a value of the "seed" components or to -1 at boundaries between the regions.
Before watershed is called, the second parameter, markers, must therefore be prepared. It has to contain the outlines of the different regions, each with its own unique number. The outlines can be located with OpenCV's findContours method. This preparation is a prerequisite for running watershed.
What happens when watershed runs? The algorithm takes the outlines passed in through markers as seeds (the so-called water injection points), evaluates the other pixels of the image according to the watershed rules, and assigns each pixel to a region until every pixel of the image has been processed. Pixels on the boundary between regions are set to -1 to tell them apart.
To sum up, the second argument, markers, must carry the seed information. In the official OpenCV sample, marking with the mouse is in fact defining the seeds, but that requires manual work, whereas findContours can mark the seed points automatically. watershed does not produce a segmented image directly when it finishes; further processing is needed for display. So although watershed takes only two parameters, using it is not trivial.
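markers = cv2.watershed(img, markers)
img[markers == -1] = [255, 0, 0]   # boundary pixels; blue in OpenCV's BGR order

plt.subplot(121)
plt.imshow(markers)
plt.title('marker image after segmentation')
plt.xticks([])
plt.yticks([])
plt.subplot(122)
# convert BGR to RGB so that matplotlib displays the original colors
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.title('result')
plt.xticks([])
plt.yticks([])
plt.show()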
The result shows that the touching region is segmented correctly for some coins and incorrectly for others.
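Because watershed only returns a label map, one common way to display it is to give every label its own color. The helper below is a hypothetical NumPy sketch (colorize_labels is not an OpenCV function) that paints boundaries white and every other label with a pseudo-random BGR color.

```python
import numpy as np


def colorize_labels(markers, seed=0):
    """Map a watershed label image to BGR colors; boundaries (-1) become white."""
    rng = np.random.default_rng(seed)
    n_labels = int(markers.max()) + 1
    palette = rng.integers(0, 256, size=(n_labels, 3), dtype=np.uint8)
    # Clip so that -1 indexes a valid palette row; it is overwritten just below.
    out = palette[np.clip(markers, 0, None)]
    out[markers == -1] = (255, 255, 255)
    return out


# A tiny label map in the format watershed produces (-1 = boundary).
markers = np.array([[1, 1, -1, 2],
                    [1, -1, 2, 2],
                    [1, -1, 3, 3]], dtype=np.int32)
colored = colorize_labels(markers)
print(colored.shape, colored.dtype)
```

The result is an ordinary 8-bit 3-channel image, so it can be shown with cv2.imshow or blended with the original photo.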
import cv2
import numpy as np

img = cv2.imread("coins.jpg")
cv2.imshow("img", img)

# 1. Image binarization
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

# 2. Noise removal
kernel = np.ones((3, 3), dtype=np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)

# 3. Determine the sure background area
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# 4. Find the sure foreground area (cv2.DIST_L1 is the distance type with value 1)
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L1, 5)
ret, sure_fg = cv2.threshold(dist_transform, 0.5 * dist_transform.max(), 255, cv2.THRESH_BINARY)

# 5. Find the unknown area
sure_fg = np.uint8(sure_fg)
unknow = cv2.subtract(sure_bg, sure_fg)

# 6. Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add 1 to all labels so that the sure background is 1 instead of 0
markers = markers + 1
# Now set all unknown areas to 0
markers[unknow == 255] = 0

# 7. Watershed algorithm
markers = cv2.watershed(img, markers)
img[markers == -1] = (0, 0, 255)   # draw the boundaries in red (BGR order)

cv2.imshow("gray", gray)
cv2.imshow("thresh", thresh)
cv2.imshow("open", opening)
cv2.imshow("sure_bg", sure_bg)
cv2.imshow("sure_fg", sure_fg)
cv2.imshow("unknow", unknow)
cv2.imshow("img_watershed", img)
cv2.waitKey(0)
cv2.destroyAllWindows()