These notes serve no purpose other than recording my own learning process.
1. Data reading
- Picture reading
cv2.imread(filename, flags): reads an image from a file.
@param_1: filename, the path of the file to load.
@param_2: flags, one of cv::ImreadModes, which selects how the file is read. The three common options are:
cv2.IMREAD_COLOR: load a color image. This is the default. You can pass 1 directly.
cv2.IMREAD_GRAYSCALE: load the image in grayscale mode. You can pass 0 directly.
cv2.IMREAD_UNCHANGED: load the image as is, including the alpha channel. You can pass -1 directly.

import cv2

img = cv2.imread('path')  # 'path' is the location of the image; the pixels are returned in BGR order
cv2.imshow('image', img)  # param_1: window name, param_2: image to display
# These two calls are needed at the end: waitKey keeps the window on screen, destroyAllWindows closes it
cv2.waitKey(0)  # 0 waits indefinitely for any key press; a positive value waits that many milliseconds
cv2.destroyAllWindows()  # close all windows

# Read an image and display it as a grayscale image
img_gr = cv2.imread('jinnie.jpeg', cv2.IMREAD_GRAYSCALE)
- Video reading
vc = cv2.VideoCapture('file_path')  # path of the video file
# Check whether the file was opened correctly
opened = vc.isOpened()  # returns a Boolean
while opened:
    # Read the video frame by frame: ret is a bool, frame is the current frame (a 3-D array in BGR order)
    ret, frame = vc.read()
    # Stop when an empty frame is read
    if frame is None:
        break
    # The frame was read correctly
    if ret == True:
        # cv2.cvtColor(p1, p2) converts between color spaces: p1 is the image, p2 is the conversion code
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        cv2.imshow('result', gray)
        # Tune the delay in cv2.waitKey(10) yourself: a smaller value plays faster, a larger value plays slower
        # 27 is the key code of Esc
        if cv2.waitKey(10) & 0xFF == 27:
            break
vc.release()
cv2.destroyAllWindows()
- Saving images
cv2.imwrite(path, img)  # write an image to path; returns True on success
- Image fusion
IMG = cv2.addWeighted(img1, a1, img2, a2, b)
'''
a1: weight of img1
a2: weight of img2
b: offset added to the weighted sum
The result is new_ph = a1*img1 + a2*img2 + b
img1 and img2 must have the same size and the same number of channels
'''
- Picture display, window adjustment
Method 1: cv2.namedWindow
'''
# Window size cannot be changed:
cv2.namedWindow("image", cv2.WINDOW_AUTOSIZE)
# Window size adapts freely, without keeping the image ratio:
cv2.namedWindow("image", cv2.WINDOW_FREERATIO)
# Window keeps the image ratio:
cv2.namedWindow("image", cv2.WINDOW_KEEPRATIO)
# The displayed colors become darker (it seems useless and I do not understand it):
cv2.namedWindow('image', cv2.WINDOW_GUI_EXPANDED)
# Relationship with cv2.imshow:
# cv2.imshow('Window title', image) automatically runs cv2.namedWindow() first if no window with that name exists
'''
cv2.namedWindow('filename', cv2.WINDOW_AUTOSIZE)

Method 2: cv2.resize
'''
src: input image
dsize: size of the output image as (width, height). If it is 0 (None in Python), the size is computed as dsize = Size(round(fx*src.cols), round(fy*src.rows)), where fx and fy are the scale factors along the width and the height.
fx: scale factor along the width; if 0, it is computed as dsize.width / src.cols
fy: scale factor along the height; if 0, it is computed as dsize.height / src.rows
interpolation: interpolation method, bilinear by default. There are five common methods:
INTER_NEAREST: nearest-neighbor interpolation
INTER_LINEAR: bilinear interpolation (default)
INTER_AREA: resampling using the pixel area relation
INTER_CUBIC: bicubic interpolation
INTER_LANCZOS4: Lanczos interpolation
'''
cv2.resize(img, None, fx=a, fy=b)
- Image thresholding
# cv2.threshold(source image, threshold, fill value, threshold type)
# The first return value is the threshold that was used, the second is the processed image
ret, dst = cv2.threshold(src, thresh, maxval, type)
'''
src: input image, generally a single-channel grayscale image
dst: output image
thresh: threshold
maxval: value assigned to pixels by THRESH_BINARY and THRESH_BINARY_INV
type: threshold type
    cv2.THRESH_BINARY: pixels above the threshold are set to maxval, all others to 0
    cv2.THRESH_BINARY_INV: the inverse of THRESH_BINARY (pixels not above the threshold are set to maxval, all others to 0)
    cv2.THRESH_TRUNC: pixels above the threshold are set to the threshold, all others are unchanged
    cv2.THRESH_TOZERO: pixels not above the threshold are set to 0, all others are unchanged
    cv2.THRESH_TOZERO_INV: pixels above the threshold are set to 0, all others are unchanged
ret: the threshold that was used
'''
- Mean filtering (simple averaging convolution):
Each output pixel is the average of the input pixels inside the kernel window (all pixels have equal weight). It is the same as normalized box filtering.
Mean filtering has an inherent weakness: it does not preserve image details well. While denoising, it also blurs the details, and it still does not remove noise points cleanly, especially salt-and-pepper noise.
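# cv2.blur(image, kernel size)
cv2.blur(img, (3, 3))
Box filtering: the image is also processed with an averaging kernel. With normalization enabled (normalize=True) the result is the same as mean filtering. With normalize=False the values in the window are summed without dividing, so sums above 255 saturate to 255.
box = cv2.boxFilter(img2, -1, (3, 3), normalize=False)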
Gaussian blur: the values in the convolution kernel follow a Gaussian distribution, so pixels near the center get more weight.
img_gaussian = cv2.GaussianBlur(img, ksize, 1)  # ksize: kernel size as (width, height), both odd; 1 is sigmaX
Median filtering is a nonlinear filter that is often used to remove salt-and-pepper noise from images. Unlike linear low-pass filtering, it helps preserve sharp edges, but it washes out texture in uniform regions.
- Morphology: erosion and dilation:
Dilation and erosion serve several purposes, mainly:
1. Removing noise
2. Isolating individual image elements and joining adjacent elements
3. Finding obvious maximum or minimum regions in the image
4. Computing the gradient of the image
Erosion and dilation:
kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)  # iterations: number of times erosion is applied
dilate = cv2.dilate(erosion, kernel, iterations=1)  # dilate the eroded image
Opening and closing:
# Opening: erosion first, then dilation
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # the second parameter selects the morphological operation
# Closing: dilation first, then erosion
kernel = np.ones((3, 3), np.uint8)
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
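Gradient operation (dilation minus erosion):
kernel = np.ones((3, 3), np.uint8)
erosion_xihuan_1 = cv2.dilate(img, kernel, iterations=5)  # dilated image
erosion_xihuan_2 = cv2.erode(img, kernel, iterations=5)  # eroded image
error = erosion_xihuan_1 - erosion_xihuan_2  # the difference leaves only the outline
res = np.hstack((erosion_xihuan_1, erosion_xihuan_2, error))  # place the three images side by side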
Top hat and black hat:
- Top hat = original - opening
- Black hat = closing - original
# Top hat
kernel = np.ones((10, 10), np.uint8)  # user-defined kernel
tophat = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)
# Black hat
blackhat = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, kernel)
Gradient operation - Sobel operator - edge detection:
'''
The Sobel operator is simple to compute and fast. However, because it uses templates for only two directions,
it detects only horizontal and vertical edges, so it performs poorly on images with complex texture.
The algorithm treats every pixel whose new gray value is greater than or equal to the threshold as an edge point.
This criterion is too crude and misclassifies edge points, because many noise points also have large gray values.
'''
sobelx = cv2.Sobel(img2, cv2.CV_64F, 1, 0, ksize=3)  # gradient in the x direction
sobely = cv2.Sobel(img2, cv2.CV_64F, 0, 1, ksize=3)  # gradient in the y direction
# CV_64FC1: each pixel is one 64-bit floating-point number (1 channel)
# CV_64FC3: each pixel is three 64-bit floating-point numbers (3 channels)
'''
CV_ - just a prefix
64 - double precision
32 - single precision
F - floating point
Cx - number of channels, for example RGB has three channels
'''
- Other operators:
- Scharr operator - enhanced edge detection:
Sobel and Scharr differ only in their convolution kernels; their computing time and complexity are the same.
- Laplacian operator:
lap = cv2.Laplacian(img, cv2.CV_64F)
# Subtracting the Laplacian from the original image enhances contrast (sharpening)
- Canny edge detection:
v1 = cv2.Canny(img, 80, 150, L2gradient=True)
'''
80 and 150 are the two thresholds:
Pixels whose gradient is below threshold 1 are not edges;
pixels whose gradient is above threshold 2 are edges;
pixels between threshold 1 and threshold 2 are edges only if they are connected to a pixel above threshold 2.
L2gradient: whether to use the more accurate L2 norm for the gradient magnitude. The default is False.
'''
- Image pyramids:
There are two common types of image pyramids:
- Gaussian pyramid: used for downsampling; it is the main type of image pyramid
# Upsampling
up = cv2.pyrUp(img)
# Downsampling
down = cv2.pyrDown(img)
'''
Note that pyrUp and pyrDown are not inverses of each other: pyrUp does not undo downsampling.
pyrUp first doubles the image in each dimension, filling the new (even) rows and columns with 0,
then convolves with the specified filter (in effect a filter doubled in each dimension) to estimate the "missing" pixels.
pyrDown() loses information. To restore the original higher-resolution image we need the information
lost by downsampling, which is what the Laplacian pyramid stores.
'''
- Laplacian pyramid: used to reconstruct the higher-resolution layer from the layer below it in the pyramid. In digital image processing it is the prediction residual; it allows the image to be restored as closely as possible and is used together with the Gaussian pyramid
- Code implementation:
import cv2 as cv

# Gaussian pyramid
def pyramid_demo(image):
    level = 3  # number of pyramid layers
    temp = image.copy()
    pyramid_images = []  # empty list that collects the layers
    for i in range(level):
        # pyrDown applies Gaussian smoothing, then halves the image size in both directions
        dst = cv.pyrDown(temp)
        pyramid_images.append(dst)
        cv.imshow("pyramid" + str(i + 1), dst)
        temp = dst.copy()
    return pyramid_images

# Laplacian pyramid
def lapalian_demo(image):
    pyramid_images = pyramid_demo(image)  # the Laplacian pyramid is built from the Gaussian pyramid
    level = len(pyramid_images)
    for i in range(level - 1, -1, -1):
        if (i - 1) < 0:
            # dstsize is (width, height), so reverse shape[:2], which is (height, width)
            expand = cv.pyrUp(pyramid_images[i], dstsize=image.shape[1::-1])
            lpls = cv.subtract(image, expand)
            cv.imshow("lapalian_down_" + str(i + 1), lpls)
        else:
            expand = cv.pyrUp(pyramid_images[i], dstsize=pyramid_images[i - 1].shape[1::-1])
            lpls = cv.subtract(pyramid_images[i - 1], expand)
            cv.imshow("lapalian_down_" + str(i + 1), lpls)

src = cv.imread('F:/test.jpg')
cv.namedWindow('input_image')  # pass cv.WINDOW_NORMAL to make the window resizable
cv.imshow('input_image', src)
lapalian_demo(src)
cv.waitKey(0)
cv.destroyAllWindows()
- Contour detection:
'''
cv2.findContours() returns two values (in OpenCV 4): the contours themselves and the hierarchy, which describes how the contours relate to each other.
'''
ret, thresh = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
draw_img = img.copy()
res = cv2.drawContours(draw_img, contours, -1, (142, 95, 43), 2)  # -1 draws all contours
# Take out one contour
cnt = contours[0]
# Compute the area
cv2.contourArea(cnt)
# Compute the perimeter; True means the contour is closed
cv2.arcLength(cnt, True)
- Contour approximation:
'''
The second parameter of drawContours is a list of contours in Python, so a single contour is passed as [cnt]
'''
# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold the image to get a binary image
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
# Get the contours and the hierarchy
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnt = contours[0]
draw_img = img.copy()
# Draw the contour; 2 is the line thickness
res = cv2.drawContours(draw_img, [cnt], -1, (0, 0, 255), 2)
- Draw a bounding rectangle:
# Step 1: convert the image to grayscale and find the contours
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnt = contours[0]
# Step 2: draw the bounding rectangle
x, y, w, h = cv2.boundingRect(cnt)
img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
area = cv2.contourArea(cnt)
rect_area = w * h
extent = float(area) / rect_area  # ratio of the contour area to the bounding rectangle area
- Template matching:
# Load the image as grayscale
img = cv2.imread('./notebook/2.7/lena.jpg', 0)
# Load the template as grayscale
template = cv2.imread('./notebook/2.7/face.jpg', 0)
h, w = template.shape[:2]
# cv2.matchTemplate() performs template matching. The parameters are the source image, the template, and the matching method
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
# Get the minimum and maximum values and their locations, which are used later to draw the rectangle
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
import matplotlib.pyplot as plt

methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
           'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
for meth in methods:
    img2 = img.copy()
    # Get the actual value of the matching method
    # eval() evaluates a string expression and returns its value
    method = eval(meth)
    res = cv2.matchTemplate(img2, template, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    # For squared-difference matching (TM_SQDIFF or TM_SQDIFF_NORMED) the best match is the minimum
    if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
        top_left = min_loc
    else:
        top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    # Draw the rectangle
    cv2.rectangle(img2, top_left, bottom_right, 255, 2)
    plt.subplot(121)
    plt.imshow(res, cmap='gray')
    plt.xticks([]), plt.yticks([])  # hide the axes
    plt.subplot(122)
    plt.imshow(img2, cmap='gray')
    plt.xticks([]), plt.yticks([])  # hide the axes
    plt.suptitle(meth)
    plt.show()
- Match multiple objects:
import cv2
import numpy as np

img_rgb = cv2.imread('notebook/8/mario.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('notebook/8/mario_coin.jpg', 0)
h, w = template.shape[:2]
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.8
# Keep the coordinates whose matching score is at least 80%
loc = np.where(res >= threshold)
# np.where returns (rows, cols); reverse them so that each pt is (x, y)
for pt in zip(*loc[::-1]):
    bottom_right = (pt[0] + w, pt[1] + h)
    cv2.rectangle(img_rgb, pt, bottom_right, (0, 0, 255), 2)
cv2.imshow('img', img_rgb)
cv2.waitKey(0)
cv2.destroyAllWindows()