 The main contents of this chapter include the following:
 Geometric transformation of images
 Morphological transformation of images
 Image smoothing methods
 Histogram methods
 Edge detection methods
 Applications of template matching and the Hough transform
1 Geometric transformation
 Learning Objectives
 Master image scaling, translation, rotation, etc.
 Understand affine and perspective transformations of digital images
1.1 Image Scaling
Scaling adjusts the size of an image, i.e. enlarges or shrinks it.
API
cv2.resize(src,dsize,fx=0,fy=0,interpolation=cv2.INTER_LINEAR)
Parameters:
src: input image
dsize: absolute size of the output image, as (width, height)
fx, fy: relative scale factors along the horizontal and vertical axes, used when dsize is None
interpolation: interpolation method, cv2.INTER_LINEAR by default
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img1 = cv.imread("./image/dog.jpeg")

# 2 Image scaling
# 2.1 Absolute size
rows, cols = img1.shape[:2]
res = cv.resize(img1, (2 * cols, 2 * rows), interpolation=cv.INTER_CUBIC)
# 2.2 Relative size
res1 = cv.resize(img1, None, fx=0.5, fy=0.5)

# 3 Image display
# 3.1 Display with OpenCV (not recommended)
cv.imshow("original", img1)
cv.imshow("enlarge", res)
cv.imshow("shrink", res1)
cv.waitKey(0)

# 3.2 Display with matplotlib (BGR -> RGB via [:, :, ::-1])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 8), dpi=100)
axes[0].imshow(res[:, :, ::-1])
axes[0].set_title("Absolute scale (enlarged)")
axes[1].imshow(img1[:, :, ::-1])
axes[1].set_title("Original image")
axes[2].imshow(res1[:, :, ::-1])
axes[2].set_title("Relative scale (reduced)")
plt.show()
1.2 Image Shift
Image translation moves the image by a specified distance in a specified direction to the appropriate position.
 API
cv.warpAffine(img,M,dsize)
Parameters:
img: image to process
M: 2 × 3 transformation matrix
dsize: size of the output image, as (width, height)
Example
The requirement is to shift the pixels of the image by 100 pixels in the x direction and 50 pixels in the y direction:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img1 = cv.imread("./image/image2.jpg")

# 2 Image translation
rows, cols = img1.shape[:2]
M = np.float32([[1, 0, 100], [0, 1, 50]])  # translation matrix
dst = cv.warpAffine(img1, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img1[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after translation")
plt.show()
1.3 Image Rotation
Image rotation refers to rotating an image by a certain angle about a certain position while keeping its size unchanged. After the image is rotated, its horizontal axis of symmetry, vertical axis of symmetry and coordinate origin may all be transformed, so the coordinates used in image rotation need to be transformed accordingly.
How does the image rotate? As shown in the following figure:
Suppose the image is rotated counterclockwise by an angle θ about the origin. For a point (x0, y0) at distance r from the origin, with α the angle between the radius and the x axis:

x0 = r·cos α,  y0 = r·sin α

After rotating by θ, the point moves to:

x = r·cos(α − θ) = x0·cos θ + y0·sin θ
y = r·sin(α − θ) = −x0·sin θ + y0·cos θ

It can also be written in matrix form as:

[x, y, 1] = [x0, y0, 1] · [[cos θ, −sin θ, 0],
                           [sin θ,  cos θ, 0],
                           [0,      0,     1]]

At the same time, we need to correct the position of the origin: the coordinate origin of the original image is in the upper left corner of the image, and the size of the image changes after rotation, so the origin needs to be corrected as well. Assuming the rotation center is taken as the coordinate origin during the rotation, the origin must afterwards be translated back to the upper left corner of the rotated image; that is, one more translation transformation is needed.
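To make the origin correction concrete, the 2 × 3 matrix that rotates about an arbitrary center can be assembled by hand in NumPy. This is only a sketch: the helper name `rotation_matrix_2d` is ours, and the formula is the one OpenCV documents for cv2.getRotationMatrix2D.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale=1.0):
    # Same formula OpenCV documents for cv2.getRotationMatrix2D:
    # alpha = scale*cos(angle), beta = scale*sin(angle)
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

M = rotation_matrix_2d((100, 50), 90)
# The rotation center maps to itself -- this is the origin correction at work.
print(M @ np.array([100, 50, 1]))
```

The translation column is exactly the correction term: it moves the rotation center back to where it started, so only the rest of the image rotates around it.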
In OpenCV, image rotation first obtains a rotation matrix from the rotation angle and rotation center, and then applies the transformation described by that matrix, achieving rotation by any angle about any center.
 API
cv2.getRotationMatrix2D(center, angle, scale)
Parameters:
center: center of rotation
angle: rotation angle
scale: scale factor
Return:
M: rotation matrix. Call cv.warpAffine with M to complete the image rotation.
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Image rotation
rows, cols = img.shape[:2]
# 2.1 Generate the rotation matrix
M = cv.getRotationMatrix2D((cols / 2, rows / 2), 90, 1)
# 2.2 Apply the rotation
dst = cv.warpAffine(img, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after rotation")
plt.show()
1.4 Affine Transformation
Affine transformation of an image involves changes to the shape, position and angle of the image. It is a common preprocessing step in deep learning. An affine transformation is mainly a combination of scaling, rotation, flipping and translation of the image.
So what is an affine transformation of an image? As shown in the following figure, points 1, 2 and 3 in Fig. 1 map one by one to the three points in Fig. 2, still forming a triangle, but with a very different shape. If we determine the affine transformation from these two sets of three points (points of interest), we can then apply it to all points of the image and complete the affine transformation of the image.
In OpenCV, the affine transformation matrix is a 2 × 3 matrix

M = [A  B]

where the left 2 × 2 submatrix A is the linear transformation matrix and the right 2 × 1 submatrix B is the translation term:

A = [[a00, a01],
     [a10, a11]],   B = [[b0],
                         [b1]]

For any position (x, y) on the image, the affine transformation performs the following operation:

x' = a00·x + a01·y + b0
y' = a10·x + a11·y + b1
It is important to note that, for an image, the width direction is x and the height direction is y, and the order of the coordinates matches the order of the corresponding pixel subscripts. So the origin is not in the lower left corner but in the upper left corner, and the y direction points not upward but downward.
In an affine transformation, all lines that are parallel in the original image remain parallel in the result image. To create the matrix, we need three points from the original image and their positions in the output image. Then cv2.getAffineTransform creates a 2 × 3 matrix, which is passed to the function cv2.warpAffine.
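The operation above can be sketched directly in NumPy. The 2 × 3 matrix here is made up for illustration (it scales y by 0.5 and shifts by (100, 50)); the helper `apply_affine` is ours, not an OpenCV function:

```python
import numpy as np

# Illustrative 2 x 3 affine matrix: A is the left 2x2 block, B the right column.
M = np.array([[1.0, 0.0, 100.0],
              [0.0, 0.5,  50.0]])

def apply_affine(M, x, y):
    # x' = a00*x + a01*y + b0,  y' = a10*x + a11*y + b1
    return M[:, :2] @ np.array([x, y]) + M[:, 2]

print(apply_affine(M, 10, 20))  # x' = 110, y' = 60
```

cv.warpAffine performs this same point mapping for every pixel of the image (plus interpolation).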
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Affine transformation
rows, cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[100, 100], [200, 50], [100, 250]])
M = cv.getAffineTransform(pts1, pts2)
# 2.2 Apply the affine transformation
dst = cv.warpAffine(img, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Affine result")
plt.show()
1.5 Perspective Transformation
The perspective transformation is the result of a change of viewing angle. It refers to the transformation that uses the condition that the perspective center, the image point and the target point are collinear to rotate the image plane (the perspective plane) around the trace line (the perspective axis) according to the law of perspective rotation, destroying the original projecting beam while keeping the projected geometry on the image plane unchanged.
It essentially projects an image onto a new view plane. Its general transformation formula is:

[x', y', z'] = [u, v, w] · T,   T = [[a00, a01, a02],
                                     [a10, a11, a12],
                                     [a20, a21, a22]]

where (u, v) are the pixel coordinates of the original image, w is 1, and (x = x'/z', y = y'/z') is the result of the perspective transformation. T is called the perspective transformation matrix and is generally divided into three parts: T1 = [[a00, a01], [a10, a11]] represents a linear transformation of the image, T2 = [a20, a21] represents a translation, T3 = [a02, a12]ᵀ produces the projective effect, and a22 is generally set to 1.
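The division by z' is what distinguishes a perspective transformation from an affine one; a minimal NumPy sketch of the formula above, using a made-up matrix T with a22 = 1 (the helper `apply_perspective` is ours):

```python
import numpy as np

# Illustrative 3 x 3 perspective matrix with a22 = 1 (values made up):
T = np.array([[1.0,   0.0, 0.001],
              [0.0,   1.0, 0.0],
              [10.0, 20.0, 1.0]])

def apply_perspective(T, u, v):
    # [x', y', z'] = [u, v, 1] . T, then divide by z' to get image coordinates
    xp, yp, zp = np.array([u, v, 1.0]) @ T
    return xp / zp, yp / zp

x, y = apply_perspective(T, 100, 50)
print(x, y)  # here z' = 100*0.001 + 1 = 1.1, so both coordinates are divided by 1.1
```

Because z' depends on (u, v), different points are scaled by different amounts, which is why parallel lines need not stay parallel under a perspective transformation.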
In OpenCV, we need to find four points, of which any three are not collinear, obtain the transformation matrix T from them, and then apply the perspective transformation. The transformation matrix is found with the function cv.getPerspectiveTransform, and cv.warpPerspective applies this 3 × 3 transformation matrix.
 Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Perspective transformation
rows, cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
pts2 = np.float32([[100, 145], [300, 100], [80, 290], [310, 300]])
T = cv.getPerspectiveTransform(pts1, pts2)
# 2.2 Apply the transformation
dst = cv.warpPerspective(img, T, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after perspective transformation")
plt.show()
1.6 Image Pyramids
Image pyramids are a kind of multi-scale representation of images, mostly used for image segmentation; they are an effective but conceptually simple structure for interpreting images at multiple resolutions.
Image pyramids are used in machine vision and image compression. The pyramid of an image is a collection of images, arranged in a pyramid shape, whose resolution decreases step by step, all derived from the same original image. It is obtained by downsampling step by step, stopping when some termination condition is reached.
The bottom of the pyramid is a high-resolution representation of the image to be processed, while the top is a low-resolution approximation. The higher the level, the smaller the image and the lower the resolution.
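The downsampling step can be sketched without OpenCV. This is a naive sketch only: it smooths with a 3 × 3 box filter where cv.pyrDown uses a 5 × 5 Gaussian, and the helper name `naive_pyr_down` is ours:

```python
import numpy as np

def naive_pyr_down(img):
    # Smooth with a 3 x 3 box filter, then keep every second row/column.
    # (cv.pyrDown smooths with a 5 x 5 Gaussian kernel instead.)
    h, w = img.shape
    smoothed = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            smoothed[i, j] = img[i - 1:i + 2, j - 1:j + 2].mean()
    return smoothed[::2, ::2]

level0 = np.arange(64, dtype=float).reshape(8, 8)
level1 = naive_pyr_down(level0)   # 8x8 -> 4x4
level2 = naive_pyr_down(level1)   # 4x4 -> 2x2
print(level0.shape, level1.shape, level2.shape)
```

Each level halves both dimensions, which is exactly the pyramid shape described above.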
 API
cv.pyrUp(img)    # upsample the image
cv.pyrDown(img)  # downsample the image
 Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Sample the image
up_img = cv.pyrUp(img)    # upsampling
img_1 = cv.pyrDown(img)   # downsampling

# 3 Image display
cv.imshow('enlarge', up_img)
cv.imshow('original', img)
cv.imshow('shrink', img_1)
cv.waitKey(0)
cv.destroyAllWindows()
1.7 Summary

Image scaling:
Enlarge or shrink the image with cv.resize()

Image translation:
Specify the translation matrix, then call cv.warpAffine() to shift the image

Image rotation:
Call cv.getRotationMatrix2D to obtain the rotation matrix, then call cv.warpAffine() to rotate

Affine transformation:
Call cv.getAffineTransform to create the transformation matrix, then pass it to cv.warpAffine() to apply the transformation

Perspective transformation:
Find the transformation matrix with cv.getPerspectiveTransform(), then apply it with cv.warpPerspective()

Image pyramids:
A multi-scale representation of images, using the APIs:
cv.pyrUp(): upsampling
cv.pyrDown(): downsampling
2 Morphological Operation
 Learning Objectives
 Understanding image neighborhoods, connectivity
 Understand the different morphological operations: erosion, dilation, opening and closing, top hat and black hat, and the relationships between them
2.1 Connectivity
In an image, the smallest unit is the pixel, and each pixel has eight adjacent pixels around it. There are three common adjacency relationships: 4-adjacency, 8-adjacency and D-adjacency. As shown in the following figure:
Connectivity is an important concept for describing regions and boundaries. The two necessary conditions for two pixels to be connected are:

1. The two pixels are adjacent;
2. The gray values of the two pixels satisfy a certain similarity criterion (or are equal).

According to the definition of connectivity, there are 4-connectivity, 8-connectivity and m-connectivity. m-connectivity is a hybrid of 4-connectivity and D-connectivity.
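The three adjacency relationships can be written down as small predicates on pixel coordinates; a minimal sketch (function names are ours, coordinates are (row, column) pairs):

```python
def is_4_adjacent(p, q):
    # 4-neighbours: directly above, below, left or right.
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1

def is_d_adjacent(p, q):
    # D-neighbours: the four diagonal positions.
    return abs(p[0] - q[0]) == 1 and abs(p[1] - q[1]) == 1

def is_8_adjacent(p, q):
    # 8-adjacency is the union of 4-adjacency and D-adjacency.
    return is_4_adjacent(p, q) or is_d_adjacent(p, q)

print(is_4_adjacent((2, 2), (2, 3)), is_d_adjacent((2, 2), (3, 3)))  # True True
```

Full connectivity would additionally check the similarity criterion on the gray values, which is the second condition above.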
2.2 Morphological Operation
Morphological transformations are simple operations based on the shape of an image, usually performed on binary images. Erosion and dilation are the two basic morphological operators; their variants are the opening, the closing, the top hat and the black hat.
2.2.1 Erosion and Dilation
Erosion and dilation are the most basic morphological operations; both act on the white (highlighted) parts of the image.
Dilation expands the highlighted regions in the image, so the result has larger highlighted regions than the original; erosion eats away at the highlighted regions, so the result has smaller highlighted regions than the original. Dilation is the operation of taking the local maximum, and erosion is the operation of taking the local minimum.
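The local-minimum/maximum view can be sketched directly in NumPy, without cv2. This is a naive sketch assuming a square k × k structuring element of ones; the helper names `erode` and `dilate` here are ours, not OpenCV's:

```python
import numpy as np

def erode(img, k=3):
    # Erosion = local minimum over a k x k window.
    pad = k // 2
    padded = np.pad(img, pad)  # zero padding
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out

def dilate(img, k=3):
    # Dilation = local maximum over a k x k window.
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1                            # a 3 x 3 white square
print(erode(img).sum(), dilate(img).sum())   # 1 25
```

The 3 × 3 white square shrinks to a single pixel under erosion and grows to a 5 × 5 square under dilation, which is exactly the behaviour described above.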

Erosion
The concrete operation is to scan every pixel of the image with a structuring element and to AND each pixel of the structuring element with the pixel it covers: the output pixel is 1 if all the values are 1, and 0 otherwise. As shown in the following figure, structure A is eroded by structure B:
Erosion eliminates the boundary points of an object, shrinks the target, and can remove noise points smaller than the structuring element.
API:
cv.erode(img,kernel,iterations)
Parameters:
img: image to process
kernel: kernel structure
iterations: number of times the erosion is applied, 1 by default
 Dilation
The concrete operation is to scan every pixel of the image with a structuring element and to OR each pixel of the structuring element with the pixel it covers: the output pixel is 0 if all the values are 0, and 1 otherwise. As shown in the following figure, structure A is dilated by structure B:
The purpose of dilation is to merge all background points touching an object into that object, enlarging the target and allowing holes in the target to be filled.
API:
cv.dilate(img,kernel,iterations)
Parameters:
img: image to process
kernel: kernel structure
iterations: number of times the dilation is applied, 1 by default
Example:
We use a 5 × 5 kernel to perform erosion and dilation:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image3.png")

# 2 Create the kernel structure
kernel = np.ones((5, 5), np.uint8)

# 3 Erode and dilate the image
erosion = cv.erode(img, kernel)   # erosion
dilate = cv.dilate(img, kernel)   # dilation

# 4 Image display
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 8), dpi=100)
axes[0].imshow(img)
axes[0].set_title("Original image")
axes[1].imshow(erosion)
axes[1].set_title("Erosion result")
axes[2].imshow(dilate)
axes[2].set_title("Dilation result")
plt.show()
2.2.2 Open and Close Operations
The opening and closing operations apply erosion and dilation in a certain order. The two are not reversible: opening an image and then closing it does not recover the original image.
 Open operation
The opening operation erodes first and then dilates. It is used to separate objects and eliminate small regions. Characteristics: it eliminates noise and removes small interference blocks without affecting the original image.
 Closed operation
The closing operation is the opposite of the opening: it dilates first and then erodes. It is used to eliminate / close holes inside objects. Characteristic: it can fill closed regions.
API:
cv.morphologyEx(img, op, kernel)
Parameters:
img: image to process
op: processing method: cv.MORPH_OPEN for the opening, cv.MORPH_CLOSE for the closing
kernel: kernel structure
Example:
Implementing the opening and closing operations with a 10 × 10 kernel structure:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")

# 2 Create the kernel structure
kernel = np.ones((10, 10), np.uint8)

# 3 Opening and closing
cvOpen = cv.morphologyEx(img1, cv.MORPH_OPEN, kernel)    # opening
cvClose = cv.morphologyEx(img2, cv.MORPH_CLOSE, kernel)  # closing

# 4 Image display
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
axes[0, 0].imshow(img1)
axes[0, 0].set_title("Original image")
axes[0, 1].imshow(cvOpen)
axes[0, 1].set_title("Opening result")
axes[1, 0].imshow(img2)
axes[1, 0].set_title("Original image")
axes[1, 1].imshow(cvClose)
axes[1, 1].set_title("Closing result")
plt.show()
2.2.3 Top and Black Hats
 Top hat operation
The top hat is the difference between the original image and the result of the opening operation. The mathematical expression is:

tophat(img) = img − open(img)

Because the opening removes small bright details (enlarging cracks and locally low-brightness regions), subtracting its result from the original image highlights the regions that are brighter than their surroundings along the original outline; the effect depends on the size of the chosen kernel.
The top hat operation is used to separate patches that are brighter than their neighbourhood. When an image has a large background and the small objects in it are regular, the top hat operation can be used for background extraction.
 Black Hat Operation
The black hat is the difference between the result of the closing operation and the original image. The mathematical expression is:

blackhat(img) = close(img) − img

The image resulting from the black hat operation highlights the regions that are darker than their surroundings along the original outline; this operation also depends on the size of the chosen kernel.
The black hat operation is used to separate patches that are darker than their neighbours.
API:
cv.morphologyEx(img, op, kernel)
Parameters:
img: image to process
op: processing method: cv.MORPH_TOPHAT for the top hat, cv.MORPH_BLACKHAT for the black hat
kernel: kernel structure
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")

# 2 Create the kernel structure
kernel = np.ones((10, 10), np.uint8)

# 3 Top hat and black hat operations
cvTophat = cv.morphologyEx(img1, cv.MORPH_TOPHAT, kernel)      # top hat
cvBlackhat = cv.morphologyEx(img2, cv.MORPH_BLACKHAT, kernel)  # black hat

# 4 Image display
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
axes[0, 0].imshow(img1)
axes[0, 0].set_title("Original image")
axes[0, 1].imshow(cvTophat)
axes[0, 1].set_title("Top hat result")
axes[1, 0].imshow(img2)
axes[1, 0].set_title("Original image")
axes[1, 1].imshow(cvBlackhat)
axes[1, 1].set_title("Black hat result")
plt.show()
2.3 Summary

Connectivity:
Adjacency: 4-adjacency, 8-adjacency and D-adjacency
Connectivity: 4-connectivity, 8-connectivity and m-connectivity

Morphological operations:

Erosion and dilation:
Erosion: local minimum
Dilation: local maximum
Opening and closing:
Opening: erosion followed by dilation
Closing: dilation followed by erosion
Top hat and black hat:
Top hat: difference between the original image and the opening result
Black hat: difference between the closing result and the original image
