 The main contents of this chapter include the following:
 Geometric transformation of images
 Morphological transformation of images
 Image smoothing methods
 Histogram methods
 Edge detection methods
 Applications of template matching and the Hough transform
1 Geometric transformation
 Learning Objectives
 Master image scaling, translation, rotation, etc.
 Understand affine and perspective transformations of digital images
1.1 Image Scaling
Scaling adjusts the size of an image, i.e. enlarges or shrinks it.
API
cv2.resize(src,dsize,fx=0,fy=0,interpolation=cv2.INTER_LINEAR)
Parameters:
src: input image
dsize: absolute size of the output image, as (width, height)
fx, fy: relative scale factors along the horizontal and vertical axes, used when dsize is None
interpolation: interpolation method, cv2.INTER_LINEAR by default
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img1 = cv.imread("./image/dog.jpeg")

# 2 Image scaling
# 2.1 Absolute size
rows, cols = img1.shape[:2]
res = cv.resize(img1, (2 * cols, 2 * rows), interpolation=cv.INTER_CUBIC)
# 2.2 Relative size
res1 = cv.resize(img1, None, fx=0.5, fy=0.5)

# 3 Image display
# 3.1 Display with OpenCV (not recommended)
cv.imshow("original", img1)
cv.imshow("enlarge", res)
cv.imshow("shrink", res1)
cv.waitKey(0)

# 3.2 Display with matplotlib (BGR -> RGB via [:, :, ::-1])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 8), dpi=100)
axes[0].imshow(res[:, :, ::-1])
axes[0].set_title("Absolute scale (enlarged)")
axes[1].imshow(img1[:, :, ::-1])
axes[1].set_title("Original image")
axes[2].imshow(res1[:, :, ::-1])
axes[2].set_title("Relative scale (reduced)")
plt.show()
1.2 Image Shift
Image translation moves the image by a specified distance in a specified direction to the appropriate position.
 API
cv.warpAffine(img,M,dsize)
Parameters:
img: image to process
M: 2 × 3 transformation matrix
dsize: size of the output image, as (width, height)
Example
The requirement is to shift the pixels of the image by 100 pixels in the x direction and 50 pixels in the y direction:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img1 = cv.imread("./image/image2.jpg")

# 2 Image translation
rows, cols = img1.shape[:2]
M = np.float32([[1, 0, 100], [0, 1, 50]])  # translation matrix
dst = cv.warpAffine(img1, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img1[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after translation")
plt.show()
1.3 Image Rotation
Image rotation refers to rotating an image by a certain angle about a certain position while keeping its size unchanged. After the image is rotated, its horizontal axis of symmetry, vertical axis of symmetry and coordinate origin may all be transformed, so the coordinates used in image rotation need to be transformed accordingly.
How does the image rotate? As shown in the following figure:
Suppose the image is rotated counterclockwise by an angle θ about the origin. For a point (x0, y0) at distance r from the origin, with α the angle between the radius and the x axis:

x0 = r·cos α,  y0 = r·sin α

After rotating by θ, the point moves to:

x = r·cos(α − θ) = x0·cos θ + y0·sin θ
y = r·sin(α − θ) = −x0·sin θ + y0·cos θ

It can also be written in matrix form as:

[x, y, 1] = [x0, y0, 1] · [[cos θ, −sin θ, 0],
                           [sin θ,  cos θ, 0],
                           [0,      0,     1]]

At the same time, we need to correct the position of the origin: the coordinate origin of the original image is in the upper left corner of the image, and the size of the image changes after rotation, so the origin needs to be corrected as well. Assuming the rotation center is taken as the coordinate origin during the rotation, the origin must afterwards be translated back to the upper left corner of the rotated image; that is, one more translation transformation is needed.
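To make the origin correction concrete, the 2 × 3 matrix that rotates about an arbitrary center can be assembled by hand in NumPy. This is only a sketch: the helper name `rotation_matrix_2d` is ours, and the formula is the one OpenCV documents for cv2.getRotationMatrix2D.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale=1.0):
    # Same formula OpenCV documents for cv2.getRotationMatrix2D:
    # alpha = scale*cos(angle), beta = scale*sin(angle)
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

M = rotation_matrix_2d((100, 50), 90)
# The rotation center maps to itself -- this is the origin correction at work.
print(M @ np.array([100, 50, 1]))
```

The translation column is exactly the correction term: it moves the rotation center back to where it started, so only the rest of the image rotates around it.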
In OpenCV, image rotation first obtains a rotation matrix from the rotation angle and rotation center, and then applies the transformation described by that matrix, achieving rotation by any angle about any center.
 API
cv2.getRotationMatrix2D(center, angle, scale)
Parameters:
center: center of rotation
angle: rotation angle
scale: scale factor
Return:
M: rotation matrix. Call cv.warpAffine with M to complete the image rotation.
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Image rotation
rows, cols = img.shape[:2]
# 2.1 Generate the rotation matrix
M = cv.getRotationMatrix2D((cols / 2, rows / 2), 90, 1)
# 2.2 Apply the rotation
dst = cv.warpAffine(img, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after rotation")
plt.show()
1.4 Affine Transformation
Affine transformation of an image involves changes to the shape, position and angle of the image. It is a common preprocessing step in deep learning. An affine transformation is mainly a combination of scaling, rotation, flipping and translation of the image.
So what is an affine transformation of an image? As shown in the following figure, points 1, 2 and 3 in Fig. 1 map one by one to the three points in Fig. 2, still forming a triangle, but with a very different shape. If we determine the affine transformation from these two sets of three points (points of interest), we can then apply it to all points of the image and complete the affine transformation of the image.
In OpenCV, the affine transformation matrix is a 2 × 3 matrix

M = [A  B]

where the left 2 × 2 submatrix A is the linear transformation matrix and the right 2 × 1 submatrix B is the translation term:

A = [[a00, a01],
     [a10, a11]],   B = [[b0],
                         [b1]]

For any position (x, y) on the image, the affine transformation performs the following operation:

x' = a00·x + a01·y + b0
y' = a10·x + a11·y + b1
It is important to note that, for an image, the width direction is x and the height direction is y, and the order of the coordinates matches the order of the corresponding pixel subscripts. So the origin is not in the lower left corner but in the upper left corner, and the y direction points not upward but downward.
In an affine transformation, all lines that are parallel in the original image remain parallel in the result image. To create the matrix, we need three points from the original image and their positions in the output image. Then cv2.getAffineTransform creates a 2 × 3 matrix, which is passed to the function cv2.warpAffine.
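The operation above can be sketched directly in NumPy. The 2 × 3 matrix here is made up for illustration (it scales y by 0.5 and shifts by (100, 50)); the helper `apply_affine` is ours, not an OpenCV function:

```python
import numpy as np

# Illustrative 2 x 3 affine matrix: A is the left 2x2 block, B the right column.
M = np.array([[1.0, 0.0, 100.0],
              [0.0, 0.5,  50.0]])

def apply_affine(M, x, y):
    # x' = a00*x + a01*y + b0,  y' = a10*x + a11*y + b1
    return M[:, :2] @ np.array([x, y]) + M[:, 2]

print(apply_affine(M, 10, 20))  # x' = 110, y' = 60
```

cv.warpAffine performs this same point mapping for every pixel of the image (plus interpolation).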
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Affine transformation
rows, cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[100, 100], [200, 50], [100, 250]])
M = cv.getAffineTransform(pts1, pts2)
# 2.2 Apply the affine transformation
dst = cv.warpAffine(img, M, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Affine result")
plt.show()
1.5 Perspective Transformation
The perspective transformation is the result of a change of viewing angle. It refers to the transformation that uses the condition that the perspective center, the image point and the target point are collinear to rotate the image plane (the perspective plane) around the trace line (the perspective axis) according to the law of perspective rotation, destroying the original projecting beam while keeping the projected geometry on the image plane unchanged.
It essentially projects an image onto a new view plane. Its general transformation formula is:

[x', y', z'] = [u, v, w] · T,   T = [[a00, a01, a02],
                                     [a10, a11, a12],
                                     [a20, a21, a22]]

where (u, v) are the pixel coordinates of the original image, w is 1, and (x = x'/z', y = y'/z') is the result of the perspective transformation. T is called the perspective transformation matrix and is generally divided into three parts: T1 = [[a00, a01], [a10, a11]] represents a linear transformation of the image, T2 = [a20, a21] represents a translation, T3 = [a02, a12]ᵀ produces the projective effect, and a22 is generally set to 1.
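The division by z' is what distinguishes a perspective transformation from an affine one; a minimal NumPy sketch of the formula above, using a made-up matrix T with a22 = 1 (the helper `apply_perspective` is ours):

```python
import numpy as np

# Illustrative 3 x 3 perspective matrix with a22 = 1 (values made up):
T = np.array([[1.0,   0.0, 0.001],
              [0.0,   1.0, 0.0],
              [10.0, 20.0, 1.0]])

def apply_perspective(T, u, v):
    # [x', y', z'] = [u, v, 1] . T, then divide by z' to get image coordinates
    xp, yp, zp = np.array([u, v, 1.0]) @ T
    return xp / zp, yp / zp

x, y = apply_perspective(T, 100, 50)
print(x, y)  # here z' = 100*0.001 + 1 = 1.1, so both coordinates are divided by 1.1
```

Because z' depends on (u, v), different points are scaled by different amounts, which is why parallel lines need not stay parallel under a perspective transformation.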
In OpenCV, we need to find four points, of which any three are not collinear, obtain the transformation matrix T from them, and then apply the perspective transformation. The transformation matrix is found with the function cv.getPerspectiveTransform, and cv.warpPerspective applies this 3 × 3 transformation matrix.
 Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Perspective transformation
rows, cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
pts2 = np.float32([[100, 145], [300, 100], [80, 290], [310, 300]])
T = cv.getPerspectiveTransform(pts1, pts2)
# 2.2 Apply the transformation
dst = cv.warpPerspective(img, T, (cols, rows))

# 3 Image display
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 8), dpi=100)
axes[0].imshow(img[:, :, ::-1])
axes[0].set_title("Original image")
axes[1].imshow(dst[:, :, ::-1])
axes[1].set_title("Result after perspective transformation")
plt.show()
1.6 Image Pyramids
Image pyramids are a kind of multi-scale representation of images, mostly used for image segmentation; they are an effective but conceptually simple structure for interpreting images at multiple resolutions.
Image pyramids are used in machine vision and image compression. The pyramid of an image is a collection of images, arranged in a pyramid shape, whose resolution decreases step by step, all derived from the same original image. It is obtained by downsampling step by step, stopping when some termination condition is reached.
The bottom of the pyramid is a high-resolution representation of the image to be processed, while the top is a low-resolution approximation. The higher the level, the smaller the image and the lower the resolution.
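The downsampling step can be sketched without OpenCV. This is a naive sketch only: it smooths with a 3 × 3 box filter where cv.pyrDown uses a 5 × 5 Gaussian, and the helper name `naive_pyr_down` is ours:

```python
import numpy as np

def naive_pyr_down(img):
    # Smooth with a 3 x 3 box filter, then keep every second row/column.
    # (cv.pyrDown smooths with a 5 x 5 Gaussian kernel instead.)
    h, w = img.shape
    smoothed = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            smoothed[i, j] = img[i - 1:i + 2, j - 1:j + 2].mean()
    return smoothed[::2, ::2]

level0 = np.arange(64, dtype=float).reshape(8, 8)
level1 = naive_pyr_down(level0)   # 8x8 -> 4x4
level2 = naive_pyr_down(level1)   # 4x4 -> 2x2
print(level0.shape, level1.shape, level2.shape)
```

Each level halves both dimensions, which is exactly the pyramid shape described above.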
 API
cv.pyrUp(img)    # upsample the image
cv.pyrDown(img)  # downsample the image
 Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Sample the image
up_img = cv.pyrUp(img)    # upsampling
img_1 = cv.pyrDown(img)   # downsampling

# 3 Image display
cv.imshow('enlarge', up_img)
cv.imshow('original', img)
cv.imshow('shrink', img_1)
cv.waitKey(0)
cv.destroyAllWindows()
1.7 Summary

Image scaling:
Enlarge or shrink the image with cv.resize()

Image translation:
Specify the translation matrix, then call cv.warpAffine() to shift the image

Image rotation:
Call cv.getRotationMatrix2D to obtain the rotation matrix, then call cv.warpAffine() to rotate

Affine transformation:
Call cv.getAffineTransform to create the transformation matrix, then pass it to cv.warpAffine() to apply the transformation

Perspective transformation:
Find the transformation matrix with cv.getPerspectiveTransform(), then apply it with cv.warpPerspective()

Image pyramids:
A multi-scale representation of images, using the APIs:
cv.pyrUp(): upsampling
cv.pyrDown(): downsampling
2 Morphological Operation
 Learning Objectives
 Understanding image neighborhoods, connectivity
 Understand the different morphological operations: erosion, dilation, opening and closing, top hat and black hat, and the relationships between them
2.1 Connectivity
In an image, the smallest unit is the pixel, and each pixel has eight adjacent pixels around it. There are three common adjacency relationships: 4-adjacency, 8-adjacency and D-adjacency. As shown in the following figure:
Connectivity is an important concept for describing regions and boundaries. The two necessary conditions for two pixels to be connected are:

1. The two pixels are adjacent;
2. The gray values of the two pixels satisfy a certain similarity criterion (or are equal).

According to the definition of connectivity, there are 4-connectivity, 8-connectivity and m-connectivity. m-connectivity is a hybrid of 4-connectivity and D-connectivity.
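The three adjacency relationships can be written down as small predicates on pixel coordinates; a minimal sketch (function names are ours, coordinates are (row, column) pairs):

```python
def is_4_adjacent(p, q):
    # 4-neighbours: directly above, below, left or right.
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1

def is_d_adjacent(p, q):
    # D-neighbours: the four diagonal positions.
    return abs(p[0] - q[0]) == 1 and abs(p[1] - q[1]) == 1

def is_8_adjacent(p, q):
    # 8-adjacency is the union of 4-adjacency and D-adjacency.
    return is_4_adjacent(p, q) or is_d_adjacent(p, q)

print(is_4_adjacent((2, 2), (2, 3)), is_d_adjacent((2, 2), (3, 3)))  # True True
```

Full connectivity would additionally check the similarity criterion on the gray values, which is the second condition above.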
2.2 Morphological Operation
Morphological transformations are simple operations based on the shape of an image, usually performed on binary images. Erosion and dilation are the two basic morphological operators; their variants are the opening, the closing, the top hat and the black hat.
2.2.1 Erosion and Dilation
Erosion and dilation are the most basic morphological operations; both act on the white (highlighted) parts of the image.
Dilation expands the highlighted regions in the image, so the result has larger highlighted regions than the original; erosion eats away at the highlighted regions, so the result has smaller highlighted regions than the original. Dilation is the operation of taking the local maximum, and erosion is the operation of taking the local minimum.
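The local-minimum/maximum view can be sketched directly in NumPy, without cv2. This is a naive sketch assuming a square k × k structuring element of ones; the helper names `erode` and `dilate` here are ours, not OpenCV's:

```python
import numpy as np

def erode(img, k=3):
    # Erosion = local minimum over a k x k window.
    pad = k // 2
    padded = np.pad(img, pad)  # zero padding
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out

def dilate(img, k=3):
    # Dilation = local maximum over a k x k window.
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1                            # a 3 x 3 white square
print(erode(img).sum(), dilate(img).sum())   # 1 25
```

The 3 × 3 white square shrinks to a single pixel under erosion and grows to a 5 × 5 square under dilation, which is exactly the behaviour described above.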

Erosion
The concrete operation is to scan every pixel of the image with a structuring element and to AND each pixel of the structuring element with the pixel it covers: the output pixel is 1 if all the values are 1, and 0 otherwise. As shown in the following figure, structure A is eroded by structure B:
Erosion eliminates the boundary points of an object, shrinks the target, and can remove noise points smaller than the structuring element.
API:
cv.erode(img,kernel,iterations)
Parameters:
img: image to process
kernel: kernel structure
iterations: number of times the erosion is applied, 1 by default
 Dilation
The concrete operation is to scan every pixel of the image with a structuring element and to OR each pixel of the structuring element with the pixel it covers: the output pixel is 0 if all the values are 0, and 1 otherwise. As shown in the following figure, structure A is dilated by structure B:
The purpose of dilation is to merge all background points touching an object into that object, enlarging the target and allowing holes in the target to be filled.
API:
cv.dilate(img,kernel,iterations)
Parameters:
img: image to process
kernel: kernel structure
iterations: number of times the dilation is applied, 1 by default
Example:
We use a 5 × 5 kernel to perform erosion and dilation:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the image
img = cv.imread("./image/image3.png")

# 2 Create the kernel structure
kernel = np.ones((5, 5), np.uint8)

# 3 Erode and dilate the image
erosion = cv.erode(img, kernel)   # erosion
dilate = cv.dilate(img, kernel)   # dilation

# 4 Image display
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 8), dpi=100)
axes[0].imshow(img)
axes[0].set_title("Original image")
axes[1].imshow(erosion)
axes[1].set_title("Erosion result")
axes[2].imshow(dilate)
axes[2].set_title("Dilation result")
plt.show()
2.2.2 Open and Close Operations
The opening and closing operations apply erosion and dilation in a certain order. The two are not reversible: opening an image and then closing it does not recover the original image.
 Open operation
The opening operation erodes first and then dilates. It is used to separate objects and eliminate small regions. Characteristics: it eliminates noise and removes small interference blocks without affecting the original image.
 Closed operation
The closing operation is the opposite of the opening: it dilates first and then erodes. It is used to eliminate / close holes inside objects. Characteristic: it can fill closed regions.
API:
cv.morphologyEx(img, op, kernel)
Parameters:
img: image to process
op: processing method: cv.MORPH_OPEN for the opening, cv.MORPH_CLOSE for the closing
kernel: kernel structure
Example:
Implementing the opening and closing operations with a 10 × 10 kernel structure:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")

# 2 Create the kernel structure
kernel = np.ones((10, 10), np.uint8)

# 3 Opening and closing
cvOpen = cv.morphologyEx(img1, cv.MORPH_OPEN, kernel)    # opening
cvClose = cv.morphologyEx(img2, cv.MORPH_CLOSE, kernel)  # closing

# 4 Image display
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
axes[0, 0].imshow(img1)
axes[0, 0].set_title("Original image")
axes[0, 1].imshow(cvOpen)
axes[0, 1].set_title("Opening result")
axes[1, 0].imshow(img2)
axes[1, 0].set_title("Original image")
axes[1, 1].imshow(cvClose)
axes[1, 1].set_title("Closing result")
plt.show()
2.2.3 Top and Black Hats
 Top hat operation
The top hat is the difference between the original image and the result of the opening operation. The mathematical expression is:

tophat(img) = img − open(img)

Because the opening removes small bright details (enlarging cracks and locally low-brightness regions), subtracting its result from the original image highlights the regions that are brighter than their surroundings along the original outline; the effect depends on the size of the chosen kernel.
The top hat operation is used to separate patches that are brighter than their neighbourhood. When an image has a large background and the small objects in it are regular, the top hat operation can be used for background extraction.
 Black Hat Operation
The black hat is the difference between the result of the closing operation and the original image. The mathematical expression is:

blackhat(img) = close(img) − img

The image resulting from the black hat operation highlights the regions that are darker than their surroundings along the original outline; this operation also depends on the size of the chosen kernel.
The black hat operation is used to separate patches that are darker than their neighbours.
API:
cv.morphologyEx(img, op, kernel)
Parameters:
img: image to process
op: processing method: cv.MORPH_TOPHAT for the top hat, cv.MORPH_BLACKHAT for the black hat
kernel: kernel structure
Example
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")

# 2 Create the kernel structure
kernel = np.ones((10, 10), np.uint8)

# 3 Top hat and black hat operations
cvTophat = cv.morphologyEx(img1, cv.MORPH_TOPHAT, kernel)      # top hat
cvBlackhat = cv.morphologyEx(img2, cv.MORPH_BLACKHAT, kernel)  # black hat

# 4 Image display
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
axes[0, 0].imshow(img1)
axes[0, 0].set_title("Original image")
axes[0, 1].imshow(cvTophat)
axes[0, 1].set_title("Top hat result")
axes[1, 0].imshow(img2)
axes[1, 0].set_title("Original image")
axes[1, 1].imshow(cvBlackhat)
axes[1, 1].set_title("Black hat result")
plt.show()
2.3 Summary

Connectivity:
Adjacency: 4-adjacency, 8-adjacency and D-adjacency
Connectivity: 4-connectivity, 8-connectivity and m-connectivity

Morphological operations:

Erosion and dilation:
Erosion: local minimum
Dilation: local maximum
Opening and closing:
Opening: erosion followed by dilation
Closing: dilation followed by erosion
Top hat and black hat:
Top hat: difference between the original image and the opening result
Black hat: difference between the closing result and the original image
