1, Bitmap principle
Note: lena.bmp (806 × 538) is used here as an example:
(1) Bitmap introduction
A computer can display images in either bitmap or vector format.
1. Bitmap:
A bitmap image is also called a dot-matrix or raster image. It describes the picture with small dots called pixels. A computer screen is itself a grid containing a large number of pixels. When a bitmap is enlarged, each pixel shows up as a mosaic-like block of color.
2. Vector:
A vector image describes graphics with lines and curves. Its elements are points, lines, rectangles, polygons, circles and arcs, all computed from mathematical formulas.
The simplest difference between the two is that a vector graphic can be enlarged indefinitely without distortion, while a bitmap cannot.
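The mosaic effect can be reproduced in a few lines. This is a minimal sketch in plain Python; the function name `enlarge_nearest` and the tiny 2 × 2 image are made up for illustration. Each pixel is simply repeated, so no new detail appears when the image grows.

```python
def enlarge_nearest(pixels, factor):
    """Enlarge a 2D grid of pixel values by repeating each pixel factor x factor times."""
    enlarged = []
    for row in pixels:
        # Repeat every value horizontally ...
        wide_row = [value for value in row for _ in range(factor)]
        # ... then repeat the widened row vertically.
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


tiny = [[0, 255],
        [255, 0]]
big = enlarge_nearest(tiny, 3)
for row in big:
    print(row)
```

Every original pixel becomes a 3 × 3 block of identical values, which is exactly the block pattern seen when zooming into a bitmap.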
Software mainly used to process bitmaps, such as Photoshop (PS), is called image processing software. Software specialized in vector graphics, such as Adobe Illustrator, CorelDRAW and Flash MX, is called graphic design software.
(2) BMP bitmap file
Common image file formats include BMP, JPG (JPE, JPEG), GIF, etc.
BMP is the image file storage format adopted by Windows, and it is supported by all image processing software running in a Windows environment. BMP files from Windows 3.0 onward are device-independent bitmaps (DIB). The default file extension is .bmp; .dib or .rle is sometimes used instead.
(3) BMP file structure
A BMP file consists of 4 parts:
- Bitmap file header
- Bitmap information header
- Color table
- Color dot matrix data (bits data)
A 24-bit true color bitmap has no color table, so it has only three parts: the file header, the information header and the pixel data.
Check the bit depth in the image properties. A value of 24 means the image is a 24-bit true color bitmap.
The properties show a file size of 1.24 MB, which is about 1.24 × 1024 × 1024 ≈ 1,300,234 bytes. Because this is computed from the rounded value shown in the dialog, it does not account for the file header separately.
Open lena.bmp with UltraEdit to see all the data in the file, as shown in the following figure:
1. Bitmap file header
The bitmap file header is divided into 4 parts, 14 bytes in total:
Name | Size | Content | Actual data |
bfType | 2 bytes | Signature, the characters "BM" | BM |
bfSize | 4 bytes | Size of the entire BMP file | 0x000C0036 (786486) [same as the size shown in the file properties on right-click] |
bfReserved1/2 | 4 bytes | Reserved, unused | 0 |
bfOffBits | 4 bytes | Offset of the pixel data, i.e. bitmap file header + bitmap information header + palette size | 0x36 (54) |
Note that multi-byte values are stored in reverse byte order (little-endian), which is characteristic of the PC platform. If the bytes in the file are 50 1A 25 3C, read them backwards as 3C 25 1A 50, that is 0x3C251A50. So if the bytes of bfSize are 36 00 0C 00, the value is 0x000C0036, that is 0xC0036.
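The byte-order rule can be checked with Python's standard `struct` module. This is a small sketch: the 14 bytes below are typed in by hand from the table values above and are not read from the actual file. The `<` prefix tells `struct` to interpret the fields as little-endian.

```python
import struct

# The 14 file-header bytes as a hex editor would show them (values from the table above).
raw = bytes.fromhex("42 4D 36 00 0C 00 00 00 00 00 36 00 00 00")

# "<" = little-endian; 2s = 2-byte signature, I = 4-byte unsigned, H = 2-byte unsigned.
bf_type, bf_size, bf_reserved1, bf_reserved2, bf_off_bits = struct.unpack("<2sIHHI", raw)
print(bf_type, hex(bf_size), bf_size, bf_off_bits)

# The generic example from the text: bytes 50 1A 25 3C read as one little-endian number.
value = int.from_bytes(bytes.fromhex("50 1A 25 3C"), "little")
print(hex(value))
```

Reading a real file would only differ in where `raw` comes from, for example the first 14 bytes returned by `open(path, "rb").read(14)`.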
2. Bitmap information header
The bitmap information header is 40 bytes in total:
Name | Size | Content | Actual data |
biSize | 4 bytes | Size of the bitmap information header, 40 | 0x28 (40) |
biWidth | 4 bytes | Width of the bitmap, in pixels | 0x200 (512) |
biHeight | 4 bytes | Height of the bitmap, in pixels | 0x200 (512) |
biPlanes | 2 bytes | Fixed value 1 | 1 |
biBitCount | 2 bytes | Bits per pixel: 1 (black and white), 4 (16 colors), 8 (256 colors), 24 (true color) | 0x18 (24) |
biCompression | 4 bytes | Compression mode; BI_RGB (0) means uncompressed | 0 |
biSizeImage | 4 bytes | Bytes occupied by all pixels of the bitmap; may be set to 0 for BI_RGB | 0x0C |
biXPelsPerMeter | 4 bytes | Horizontal resolution (pixels per meter) | 0 |
biYPelsPerMeter | 4 bytes | Vertical resolution (pixels per meter) | 0 |
biClrUsed | 4 bytes | Number of colors used by the bitmap; 0 means 2 to the power of biBitCount | 0 |
biClrImportant | 4 bytes | Number of important colors; 0 means all colors are important | 0 |
For a true color bitmap, the two values that matter most are biWidth and biHeight, which give the size of the image. biSize, biPlanes and biBitCount are fixed, and the remaining fields can simply be set to 0.
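To see how the 40 bytes are laid out, the header can be packed and unpacked with `struct`. This is an illustrative sketch using the field names and values from the table above; the format string is written here as an assumption about the field types (signed width and height, unsigned elsewhere) rather than taken from the original post.

```python
import struct

INFO_HEADER_FORMAT = "<IiiHHIIiiII"   # little-endian, 11 fields, 40 bytes in total

FIELDS = ("biSize", "biWidth", "biHeight", "biPlanes", "biBitCount", "biCompression",
          "biSizeImage", "biXPelsPerMeter", "biYPelsPerMeter", "biClrUsed", "biClrImportant")

# For a 24-bit uncompressed image only width and height vary;
# biSize, biPlanes and biBitCount are fixed and the rest are left at 0.
raw = struct.pack(INFO_HEADER_FORMAT, 40, 512, 512, 1, 24, 0, 0, 0, 0, 0, 0)

header = dict(zip(FIELDS, struct.unpack(INFO_HEADER_FORMAT, raw)))
print(len(raw), header["biWidth"], header["biHeight"], header["biBitCount"])
```

The same `struct.unpack` call applied to bytes 14 to 53 of a real BMP file would return the values seen in the hex editor.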
3. Color table
- If the bitmap uses 16-bit, 24-bit or 32-bit color, the image file has no palette, and the color of each pixel is given directly in the bitmap data.
- 16-bit images use 2 bytes per color value. There are two common formats: 5-bit red, 5-bit green, 5-bit blue (555 format), and 5-bit red, 6-bit green, 5-bit blue (565 format). The 555 format uses only 15 bits; the highest bit is reserved and set to 0.
- 24-bit images use 3 bytes per color value, one byte each for the red, green and blue components.
- 32-bit images use 4 bytes per color value. In addition to red, green and blue, there is an alpha channel, which holds transparency.
- If the image has a palette, the bitmap data may be stored compressed or uncompressed. Compression applies to 16-color and 256-color images, which use the RLE4 and RLE8 algorithms respectively.
The possible bit depths are:
- 1: Monochrome image. The palette contains two colors; this is what is usually called a black-and-white picture
- 4: 16-color image
- 8: 256-color image, often used for grayscale pictures
- 16: 64K-color image, generally without a palette. Every 2 bytes of image data represent one pixel, with 5 or 6 bits per RGB component
- 24: 16M-color true color image, generally without a palette. Every 3 bytes of image data represent one pixel, one byte per RGB component
- 32: 4G-color true color image, generally without a palette. Every 4 bytes represent one pixel. Compared with 24-bit true color it adds transparency, i.e. RGBA mode
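The 555 and 565 layouts are easier to follow as bit operations. The sketch below assumes the common arrangement with red in the highest bits and blue in the lowest; the function names are made up for illustration.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into 16 bits: 5 bits red, 6 bits green, 5 bits blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def pack_rgb555(r, g, b):
    """Pack 8-bit R, G, B into 15 bits (5 each); the highest bit stays 0."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def unpack_rgb565(value):
    """Split a 565 value and scale each component back to the 0..255 range."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    return (r5 * 255 // 31, g6 * 255 // 63, b5 * 255 // 31)


print(hex(pack_rgb565(255, 255, 255)), hex(pack_rgb555(255, 255, 255)))
print(hex(pack_rgb565(255, 0, 0)), unpack_rgb565(0xF800))
```

Dropping the low bits of each component is where the precision is lost when a 24-bit image is saved as 16-bit.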
4. Color dot matrix data
The rows of the bitmap are stored from bottom to top, and the pixels within each row from left to right.
The RGB components are also stored in reverse: in the file they appear in the order B, G, R.
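The storage order can be turned into a small address calculation. The sketch below assumes an uncompressed bottom-up bitmap and also applies one rule that the text above does not mention: in the BMP format each stored row is padded to a multiple of 4 bytes. The helper names are hypothetical.

```python
def row_stride(width, bits_per_pixel):
    """Bytes per stored row, rounded up to a multiple of 4."""
    return (width * bits_per_pixel + 31) // 32 * 4


def pixel_offset(x, y, width, height, bits_per_pixel=24):
    """Byte offset of pixel (x, y) inside the pixel data; y = 0 is the top row on screen."""
    stored_row = height - 1 - y            # the bottom row is stored first
    return stored_row * row_stride(width, bits_per_pixel) + x * (bits_per_pixel // 8)


stride = row_stride(806, 24)
print(stride, stride * 538, stride * 538 + 54)   # row bytes, pixel data bytes, plus 54-byte headers
print(pixel_offset(0, 537, 806, 538), pixel_offset(0, 0, 806, 538))
```

For a 24-bit image the three bytes at that offset are blue, green and red, in that order. For the 806 × 538 example this gives 1,302,014 bytes including the headers, which is consistent with the 1.24 MB shown in the file properties.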
2, Image processing
(1) Original picture
1. Original lena.jpg (806 × 538)
2. Picture information
(2) 16-bit / 32-bit bitmap comparison
The tool used for this experiment is Adobe Photoshop 2021, referred to as PS. PS is an efficient and versatile tool with a variety of selection tools. It covers image compositing, color correction, the layers palette, channels, the actions palette, path tools, filters and other image processing functions, which is enough for this experiment. Photoshop is also easy to learn and easy to use, so we chose it as one of the experimental tools.
1. 32-bit color bitmap
(1) In PS, select File -> Open and choose the picture to open.
It opens as follows
(2) Save as a 32-bit bitmap
- Select File -> Save As
- Select BMP format
- Select 32 bits
(3) View its information
File size: 1.65 MB × 1024 × 1024 ≈ 1,730,150 B
This is computed from the rounded size shown in the file properties; the header is not counted separately.
(4) View the file header information in UltraEdit, as follows
2. 16-bit bitmap
(1) Set the storage format to 16 bits
(2) Picture information
File size: 846 KB × 1024 = 866,304 B
This is computed from the rounded size shown in the file properties; the header is not counted separately.
(3) View the file header information in UltraEdit, as follows
3. Differences
- To the naked eye, there is no difference between the 32-bit and 16-bit bitmaps
- The 32-bit bitmap is as follows
- The 16-bit bitmap is as follows
- However, the 16-bit bitmap takes up less storage, about half that of the 32-bit bitmap, because each pixel is stored in 2 bytes instead of 4.
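The "about half" observation can be checked against the raw pixel counts. This is a rough sketch that ignores headers and row padding; the two measured sizes are the ones reported above.

```python
WIDTH, HEIGHT = 806, 538


def raw_pixel_bytes(bits_per_pixel):
    """Bytes needed for the pixels alone, ignoring headers and row padding."""
    return WIDTH * HEIGHT * bits_per_pixel // 8


measured_32, measured_16 = 1_730_150, 866_304   # sizes reported above, in bytes
print(raw_pixel_bytes(32), raw_pixel_bytes(16))
print(round(measured_16 / measured_32, 3))
```

The pixel data alone halves exactly, and the measured ratio of about 0.501 matches that.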
(3) 256-color / 16-color / monochrome bitmap comparison
The tool used for this experiment is the Paint program included with Windows 10. It can save a file as a bitmap with the required number of colors and needs no installation. It is simple and convenient, so it was chosen to produce the 256-color, 16-color and monochrome bitmaps.
1. 256-color bitmap
(1) Open the picture in Paint
(2) Click File -> Save as -> BMP picture (B)
(3) Save as a 256-color bitmap
(4) View its information
Its file size is
424 KB × 1024 = 434,176 B
This is computed from the rounded size shown in the file properties; the header is not counted separately.
(5) View the file header information in UltraEdit, as follows
2. 16-color bitmap
(1) Save as a 16-color bitmap
(2) The file information is as follows
Its file size is
212 KB × 1024 = 217,088 B
This is computed from the rounded size shown in the file properties; the header is not counted separately.
(3) View the file header information in UltraEdit, as follows
3. Monochrome bitmap
(1) Save as a monochrome bitmap
(2) The file information is as follows
Its file size is
54.7 KB × 1024 ≈ 56,013 B
This is computed from the rounded size shown in the file properties; the header is not counted separately.
(3) View the file header information in UltraEdit, as follows
4. Differences
- The pictures are shown below. The difference in color is clearly visible to the naked eye
- 256 colors
- 16 colors
- Monochrome
- The fewer the colors, the less storage the picture occupies.
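The relationship between color count and file size can be estimated from the file structure described earlier. The sketch below rests on several assumptions that are not stated in the text: a 54-byte pair of headers, a full palette of 2^n entries at 4 bytes each, and rows padded to a multiple of 4 bytes. The function name is made up.

```python
def estimate_bmp_size(width, height, bits_per_pixel):
    """Estimated size in bytes of an uncompressed palettized BMP (assumptions in the lead-in)."""
    headers = 14 + 40                                  # file header + information header
    palette = 4 * 2 ** bits_per_pixel                  # 4 bytes per palette entry
    stride = (width * bits_per_pixel + 31) // 32 * 4   # padded bytes per row
    return headers + palette + stride * height


for bits in (8, 4, 1):
    print(bits, estimate_bmp_size(806, 538, bits))
```

The estimates come out close to the sizes measured above, though not identical, since the measured values are rounded figures from the file properties.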
(4) Compression ratio of different picture formats
1. Save the picture in BMP, JPG, GIF and PNG formats respectively
Because BMP is not compressed, it serves as the baseline. The compression ratio here is the share of space saved relative to the BMP file, that is 1 − (format size / BMP size).
Picture format | Picture size | Compression ratio |
BMP | 1.24 MB | - |
GIF | 193 KB | 84.8% |
JPG | 147 KB | 88.4% |
PNG | 299 KB | 76.5% |
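The percentages in the table can be recomputed from the sizes. A minimal sketch, assuming the ratio is the space saved relative to the BMP file and that 1.24 MB means 1.24 × 1024 KB:

```python
bmp_kb = 1.24 * 1024                       # BMP baseline, converted from MB to KB
sizes_kb = {"GIF": 193, "JPG": 147, "PNG": 299}


def space_saving(compressed_kb, original_kb):
    """Percentage of space saved compared with the original size."""
    return (1 - compressed_kb / original_kb) * 100


for name, kb in sizes_kb.items():
    print(f"{name}: {space_saving(kb, bmp_kb):.1f}%")
```

This reproduces 84.8%, 88.4% and 76.5%, so the table values follow this definition.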
3, Picture processing programming
(1) Singular value decomposition (SVD)
1. Code
    import os

    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    from PIL import Image


    def restore1(sigma, u, v, K):
        """Rebuild one channel from the first K singular values.

        sigma: singular values, u: left singular vectors, v: right singular vectors.
        """
        m = len(u)
        n = len(v[0])
        a = np.zeros((m, n))
        for k in range(K):
            uk = u[:, k].reshape(m, 1)
            vk = v[k].reshape(1, n)
            a += sigma[k] * np.dot(uk, vk)
        a = a.clip(0, 255)  # keep values inside the valid pixel range
        return np.rint(a).astype('uint8')


    def restore2(sigma, u, v, K):
        """Same reconstruction as restore1, accumulated row by row."""
        m = len(u)
        n = len(v[0])
        a = np.zeros((m, n))
        for k in range(K):  # first K terms, consistent with restore1
            for i in range(m):
                a[i] += sigma[k] * u[i][k] * v[k]
        a = a.clip(0, 255)
        return np.rint(a).astype('uint8')


    if __name__ == "__main__":
        A = Image.open("C:/Users/86199/Pictures/lena/lena.jpg", 'r')
        print(A)
        output_path = r'./SVD_Output'
        if not os.path.exists(output_path):
            os.mkdir(output_path)
        a = np.array(A)
        print(a.shape)
        K = 50
        # Decompose the R, G and B channels separately
        u_r, sigma_r, v_r = np.linalg.svd(a[:, :, 0])
        u_g, sigma_g, v_g = np.linalg.svd(a[:, :, 1])
        u_b, sigma_b, v_b = np.linalg.svd(a[:, :, 2])
        plt.figure(figsize=(11, 9), facecolor='w')
        mpl.rcParams['font.sans-serif'] = ['simHei']
        mpl.rcParams['axes.unicode_minus'] = False
        for k in range(1, K + 1):
            print(k)
            R = restore1(sigma_r, u_r, v_r, k)
            G = restore1(sigma_g, u_g, v_g, k)
            B = restore1(sigma_b, u_b, v_b, k)
            I = np.stack((R, G, B), axis=2)
            # os.path.join keeps the output path portable across operating systems
            Image.fromarray(I).save(os.path.join(output_path, 'svd_%d.png' % k))
            if k <= 12:  # show only the first 12 reconstructions in the figure
                plt.subplot(3, 4, k)
                plt.imshow(I)
                plt.axis('off')
                plt.title('Number of singular values: %d' % k)
        plt.suptitle('SVD Image decomposition', fontsize=20)
        plt.tight_layout()
        # plt.subplots_adjust(top=0.9)
        plt.show()
2. Results
It can be observed that the fewer singular values are kept, the more blurred the picture becomes.
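The same effect can be measured numerically instead of by eye. The sketch below uses a random matrix as a stand-in for one color channel (an assumption made so the example needs no image file) and a hypothetical helper `rank_k_approx`; it shows the reconstruction error shrinking as more singular values are kept.

```python
import numpy as np


def rank_k_approx(matrix, k):
    """Rebuild a matrix from its k largest singular values."""
    u, sigma, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * sigma[:k]) @ vt[:k]


rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(40, 30)).astype(float)   # stand-in for one color channel

ranks = (1, 5, 15, 30)
errors = [np.linalg.norm(channel - rank_k_approx(channel, k)) for k in ranks]
for k, err in zip(ranks, errors):
    print(k, round(float(err), 3))
```

With all 30 singular values the matrix is recovered up to floating-point error. A real photo usually needs far fewer values than random noise does to look acceptable, because most of its energy sits in the first few singular values.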
(2) Count the coins and cells in the two sample images using morphological opening and closing (erosion and dilation)
1. Coins
- code
    import cv2
    import numpy as np


    def stackImages(scale, imgArray):
        """
        Tile multiple images into one window for display.
        :param scale: float, display scale of the output; 0.5 halves the image resolution
        :param imgArray: list of images, or a tuple/list of lists (rows) of images
        :return: the combined image
        """
        rows = imgArray if isinstance(imgArray[0], list) else [imgArray]
        ref_h, ref_w = rows[0][0].shape[:2]

        def prepare(im):
            # Bring every tile to the size of the first image, then apply the display scale
            if im.shape[:2] != (ref_h, ref_w):
                im = cv2.resize(im, (ref_w, ref_h))
            im = cv2.resize(im, (0, 0), None, scale, scale)
            if len(im.shape) == 2:  # grayscale -> 3 channels so the tiles can be stacked
                im = cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)
            return im

        return np.vstack([np.hstack([prepare(im) for im in row]) for row in rows])


    # Read the picture
    src = cv2.imread("C:/Users/86199/Pictures/computer/coin.png")
    img = src.copy()

    # Grayscale
    img_1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarization (with THRESH_OTSU the threshold is chosen automatically)
    ret, img_2 = cv2.threshold(img_1, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Erosion: separates the coins from each other. Too large a kernel loses coins,
    # too small a kernel fails to separate them; tune the size to fit the image.
    kernel = np.ones((17, 17), np.uint8)
    img_3 = cv2.erode(img_2, kernel, iterations=1)

    # Dilation: grow the regions back so that each white area is one coin
    kernel = np.ones((3, 3), np.uint8)
    img_4 = cv2.dilate(img_3, kernel, iterations=1)

    # Find the coin contours ([-2:] works with both OpenCV 3 and 4 return values)
    contours, hierarchy = cv2.findContours(img_4, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

    # Mark the coins
    cv2.drawContours(img, contours, -1, (0, 0, 255), 5)

    # Label each stage and display the pictures
    labels = ((img, "count:{}".format(len(contours))), (src, "src"), (img_1, "gray"),
              (img_2, "thresh"), (img_3, "erode"), (img_4, "dilate"))
    for image, label in labels:
        cv2.putText(image, label, (0, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 0), 3)
    imgStack = stackImages(1, ([src, img_1, img_2], [img_3, img_4, img]))
    cv2.imshow("imgStack", imgStack)
    cv2.waitKey(0)
- Operation results
2. Cells
- code
    import cv2
    import numpy as np


    def stackImages(scale, imgArray):
        """
        Tile multiple images into one window for display.
        :param scale: float, display scale of the output; 0.5 halves the image resolution
        :param imgArray: list of images, or a tuple/list of lists (rows) of images
        :return: the combined image
        """
        rows = imgArray if isinstance(imgArray[0], list) else [imgArray]
        ref_h, ref_w = rows[0][0].shape[:2]

        def prepare(im):
            # Bring every tile to the size of the first image, then apply the display scale
            if im.shape[:2] != (ref_h, ref_w):
                im = cv2.resize(im, (ref_w, ref_h))
            im = cv2.resize(im, (0, 0), None, scale, scale)
            if len(im.shape) == 2:  # grayscale -> 3 channels so the tiles can be stacked
                im = cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)
            return im

        return np.vstack([np.hstack([prepare(im) for im in row]) for row in rows])


    # Read the picture
    src = cv2.imread("C:/Users/86199/Pictures/computer/cell.png")
    img = src.copy()

    # Grayscale
    img_1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarization (with THRESH_OTSU the threshold is chosen automatically)
    ret, img_2 = cv2.threshold(img_1, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Erosion: separates the cells from each other. Too large a kernel loses cells,
    # too small a kernel fails to separate them; tune the size to fit the image.
    kernel = np.ones((17, 17), np.uint8)
    img_3 = cv2.erode(img_2, kernel, iterations=1)

    # Dilation: grow the regions back so that each white area is one cell
    kernel = np.ones((3, 3), np.uint8)
    img_4 = cv2.dilate(img_3, kernel, iterations=1)

    # Find the cell contours ([-2:] works with both OpenCV 3 and 4 return values)
    contours, hierarchy = cv2.findContours(img_4, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

    # Mark the cells
    cv2.drawContours(img, contours, -1, (0, 255, 0), 3)

    # Label each stage and display the pictures
    labels = ((img, "count:{}".format(len(contours))), (src, "src"), (img_1, "gray"),
              (img_2, "thresh"), (img_3, "erode"), (img_4, "dilate"))
    for image, label in labels:
        cv2.putText(image, label, (0, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 0), 3)
    imgStack = stackImages(1, ([src, img_1, img_2], [img_3, img_4, img]))
    cv2.imshow("imgStack", imgStack)
    cv2.waitKey(0)
- Operation results
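Why erosion helps with counting can be shown on a toy mask without OpenCV. This is a simplified sketch: `erode` and `count_blobs` are hand-written stand-ins for `cv2.erode` and contour counting, and the two squares joined by a thin bridge are a made-up substitute for two touching coins.

```python
import numpy as np


def erode(mask, size):
    """Binary erosion with a size x size square: a pixel survives only if its whole neighborhood is set."""
    pad = size // 2
    padded = np.pad(mask, pad, constant_values=False)
    out = np.ones_like(mask, dtype=bool)
    for dy in range(size):
        for dx in range(size):
            out &= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out


def count_blobs(mask):
    """Count 4-connected regions of True pixels with a flood fill."""
    seen = np.zeros_like(mask, dtype=bool)
    count = 0
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
    return count


mask = np.zeros((12, 24), dtype=bool)
mask[2:10, 2:10] = True      # first "coin"
mask[2:10, 14:22] = True     # second "coin"
mask[5:7, 10:14] = True      # thin bridge where the two touch

print(count_blobs(mask), count_blobs(erode(mask, 3)))   # 1 before erosion, 2 after
```

The bridge is thinner than the kernel, so erosion removes it while the squares only shrink. A kernel that is too large would also erase small objects, which is the trade-off noted in the code comments above.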
(3) Use image gradients, opening and closing operations, contour detection, etc. to locate and extract the barcode in the picture, then call a barcode library to read the barcode characters.
1. Code
    import cv2
    import imutils
    import numpy as np
    from pyzbar import pyzbar


    def stackImages(scale, imgArray):
        """
        Tile multiple images into one window for display.
        :param scale: float, display scale of the output; 0.5 halves the image resolution
        :param imgArray: list of images, or a tuple/list of lists (rows) of images
        :return: the combined image
        """
        rows = imgArray if isinstance(imgArray[0], list) else [imgArray]
        ref_h, ref_w = rows[0][0].shape[:2]

        def prepare(im):
            # Bring every tile to the size of the first image, then apply the display scale
            if im.shape[:2] != (ref_h, ref_w):
                im = cv2.resize(im, (ref_w, ref_h))
            im = cv2.resize(im, (0, 0), None, scale, scale)
            if len(im.shape) == 2:  # grayscale -> 3 channels so the tiles can be stacked
                im = cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)
            return im

        return np.vstack([np.hstack([prepare(im) for im in row]) for row in rows])


    # Read the picture
    src = cv2.imread("C:/Users/86199/Pictures/computer/tm.png")
    img = src.copy()

    # Grayscale
    img_1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Gaussian filtering
    img_2 = cv2.GaussianBlur(img_1, (5, 5), 1)

    # Sobel operator: gradients in x and y, combined with equal weights
    sobel_x = cv2.Sobel(img_2, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(img_2, cv2.CV_64F, 0, 1, ksize=3)
    sobel_x = cv2.convertScaleAbs(sobel_x)
    sobel_y = cv2.convertScaleAbs(sobel_y)
    img_3 = cv2.addWeighted(sobel_x, 0.5, sobel_y, 0.5, 0)

    # Mean filtering
    img_4 = cv2.blur(img_3, (5, 5))

    # Binarization (with THRESH_OTSU the threshold is chosen automatically)
    ret, img_5 = cv2.threshold(img_4, 127, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Closing operation: fill the gaps between the bars
    kernel = np.ones((18, 18), np.uint8)
    img_6 = cv2.morphologyEx(img_5, cv2.MORPH_CLOSE, kernel)

    # Opening operation: remove small regions outside the barcode
    kernel = np.ones((100, 100), np.uint8)
    img_7 = cv2.morphologyEx(img_6, cv2.MORPH_OPEN, kernel)

    # Draw the barcode area: largest contour and its minimum-area rectangle
    contours = cv2.findContours(img_7, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = imutils.grab_contours(contours)
    c = sorted(contours, key=cv2.contourArea, reverse=True)[0]
    rect = cv2.minAreaRect(c)
    box = cv2.cv.BoxPoints(rect) if imutils.is_cv2() else cv2.boxPoints(rect)
    box = np.asarray(box).astype(np.intp)  # replaces np.int0, which was removed in NumPy 2.0
    cv2.drawContours(img, [box], -1, (0, 255, 0), 6)

    # Label each stage
    cv2.putText(img, "results", (30, 30), cv2.FONT_HERSHEY_SIMPLEX, 2.0, (255, 0, 0), 3)
    labels = ((img_1, "gray"), (img_2, "GaussianBlur"), (img_3, "Sobel"), (img_4, "blur"),
              (img_5, "threshold"), (img_6, "close"), (img_7, "open"))
    for image, label in labels:
        cv2.putText(image, label, (40, 40), cv2.FONT_HERSHEY_SIMPLEX, 2.0, (255, 0, 0), 3)

    # Decode and output the barcode
    barcodes = pyzbar.decode(src)
    for barcode in barcodes:
        barcodeData = barcode.data.decode("utf-8")
        cv2.putText(img, barcodeData, (50, 70), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 3)

    # Show all pictures
    imgStack = stackImages(0.5, ([img_1, img_2, img_3, img_4], [img_5, img_6, img_7, img]))
    cv2.imshow("imgStack", imgStack)
    cv2.waitKey(0)
2. Operation results
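The reason a gradient step helps locate a barcode can be seen on a synthetic stripe pattern. This sketch uses plain finite differences (`np.diff`) as a simplified stand-in for the Sobel operator, and the striped image is made up for illustration.

```python
import numpy as np

# Synthetic "barcode": vertical bars 4 pixels wide, alternating dark and light.
bars = np.tile(np.repeat([0, 255], 4), 5)          # one row of 40 pixels
image = np.tile(bars, (20, 1)).astype(float)       # 20 identical rows

grad_x = np.abs(np.diff(image, axis=1))   # change between horizontal neighbors
grad_y = np.abs(np.diff(image, axis=0))   # change between vertical neighbors
print(grad_x.sum(), grad_y.sum())
```

The horizontal gradient is large at every bar edge while the vertical gradient is zero, so a region full of strong gradients in one direction is a good barcode candidate. The blur, threshold and closing steps then merge those edge responses into a single solid region.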
4, Summary
In this experiment we studied the principle of bitmaps, including the file header, information header and color table, the differences between 32-bit and 16-bit bitmaps and between 256-color, 16-color and monochrome bitmaps, and the compression ratios of different picture formats. We also processed images programmatically.