OpenCV-Python tutorial: histograms and how to draw them (calcHist)

Original link: http://www.juzicode.com/opencv-python-histogram-calchist-draw-hist

The histogram of an image summarizes the statistical distribution of its pixel values. For a CV_8U image, for example, it shows how the pixels are distributed over the 256 values from 0 to 255. The finest statistical "granularity" is one interval per pixel value, but the intervals do not have to be that narrow: 0-255 can also be divided into wider intervals such as 0-7, 8-15, ..., 248-255, so that every 8 pixel values are counted together. Histograms are usually described in terms of "bins". If the number of bins for a CV_8U image is set to 256, the bin width is 1, which corresponds to counting each pixel value separately as in the first case. If the number of bins is set to 32, the bin width is 256 / 32 = 8, so every 8 pixel values are counted together.
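
To make the bin arithmetic concrete, here is a minimal sketch that uses numpy only. The synthetic image and the variable names are assumptions made for illustration; they are not part of the original examples.

```python
import numpy as np

# Synthetic 8-bit "image" (an assumption for illustration):
# every value 0..255 appears exactly 4 times.
img = np.tile(np.arange(256, dtype=np.uint8), 4).reshape(32, 32)

# 256 bins over [0, 256): bin width 1, one bin per pixel value.
counts_256, edges_256 = np.histogram(img.ravel(), bins=256, range=(0, 256))

# 32 bins over [0, 256): bin width 8, i.e. 0-7, 8-15, ..., 248-255.
counts_32, edges_32 = np.histogram(img.ravel(), bins=32, range=(0, 256))

print(counts_256[:4])  # 4 pixels per value
print(counts_32[:4])   # 8 values x 4 pixels = 32 per bin
print(edges_32[:4])    # bin edges 0, 8, 16, 24
```

With 32 bins each count is the sum of 8 neighbouring counts from the 256-bin version, which is what "every 8 pixel values are counted together" means.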

The histogram calculated by calcHist() in OpenCV is a matrix (array). Although it is a two-dimensional array like an image, it cannot be displayed in a meaningful way by passing it straight to imshow(); it has to be converted and drawn as lines to turn the histogram into an intuitive picture. Alternatively, the histogram can be drawn with the help of numpy and matplotlib. That interface is more concise, so let's look at it first.

1. Drawing a histogram with matplotlib hist()

In matplotlib, histograms can be drawn using hist() method. The interface form is as follows:

hist(x, bins=None, range=None, ...)

Parameter meaning:
  • x: the input sequence. A two-dimensional image has to be flattened into a one-dimensional array first;
  • bins: the number of bins (columns). For a CV_8U image, 256 gives one bin per pixel value, provided the range also spans all 256 values;
  • range: the lower and upper limits of the values to count. If it is not set, the minimum and maximum of x are used;

In fact, the hist() method in matplotlib has more than a dozen input parameters. Here, we only need to use the above parameters to complete the drawing.

The following example reads the lena image and draws a histogram for each of its B, G and R channels. Because hist() requires the input parameter x to be a one-dimensional array, the ravel() method is used to flatten each channel:

import numpy as np
import matplotlib.pyplot as plt
import cv2
plt.rc('font',family='Youyuan',size='9')
plt.rc('axes',unicode_minus='False')
print('VX official account: Orange code / juzicode.com')

img_src = cv2.imread('..\\lena.jpg')
b,g,r = cv2.split(img_src)  

#Display image
fig,ax = plt.subplots(2,2)
ax[0,0].set_title('b hist')
ax[0,0].hist(b.ravel(),bins=256)  
ax[0,1].set_title('g hist')
ax[0,1].hist(g.ravel(),bins=256)
ax[1,0].set_title('r hist')
ax[1,0].hist(r.ravel(),bins=256)
ax[1,1].set_title('src') 
ax[1,1].imshow(cv2.cvtColor(img_src,cv2.COLOR_BGR2RGB))  
#ax[0,0].axis('off');ax[0,1].axis('off');ax[1,0].axis('off');
ax[1,1].axis('off')#Turn off axis display
plt.show() 

Operation results:

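Because bins=256 is set in hist(), the histogram has 256 columns along the x direction. Since range is not set, those 256 bins are spread between the smallest and largest value present in the channel, so each column counts the pixels of exactly one pixel value only when the channel spans the whole 0-255 range.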
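
The effect of leaving range unset can be checked without plotting anything. The sketch below uses numpy.histogram(), which takes the same bins and range arguments, on a hypothetical channel whose values do not reach either end of 0-255; the data and names are assumptions for illustration.

```python
import numpy as np

# Hypothetical channel whose values lie somewhere in 50..199 only.
rng = np.random.default_rng(0)
channel = rng.integers(50, 200, size=(64, 64), dtype=np.uint8)

# No range given: 256 bins are spread over [min, max], so each is narrower than 1.
_, edges_auto = np.histogram(channel.ravel(), bins=256)

# Explicit range: bin i covers exactly pixel value i.
counts, edges_fixed = np.histogram(channel.ravel(), bins=256, range=(0, 256))

print(edges_auto[1] - edges_auto[0])    # bin width below 1
print(edges_fixed[1] - edges_fixed[0])  # bin width exactly 1
```

When the bin width drops below 1, some bins cannot contain any integer pixel value and show up as gaps in the plot. Passing range=(0, 256) to hist() as well keeps one bin per pixel value regardless of the image content.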

The histogram drawn by hist() can be regarded as a special case of the bar chart drawn by bar(): the gap between columns is 0 and the x coordinates are numeric values. For bar(), see the article "Data visualization ~ matplotlib pie chart and bar chart".

2. Calculating a histogram with calcHist()

calcHist() can be used to count the histogram of the image. The interface form is:

cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]]) ->hist
Parameter meaning:
  • images: the input images, passed as a list or tuple. It can contain multi-channel color images, several grayscale images, or a mix of both;
  • channels: the numbers of the channels to use, counted across everything in images; list or tuple input;
  • mask: an optional mask. Pass None to count every pixel; otherwise only pixels where the mask is non-zero are counted;
  • histSize: the size of the histogram, that is, the number of bins the value range is divided into; list or tuple input;
  • ranges: the value range of the image elements to count; list or tuple input. The lower bound is inclusive and the upper bound is exclusive;
  • accumulate: if True, the histogram is not cleared at the start, so counts accumulate when several images are processed one after another;
  • hist: the returned histogram. For a one-dimensional histogram it is a two-dimensional array of shape (histSize, 1);

The following example calculates the histograms of the B, G and R channels of the lena image:

import numpy as np
import cv2
print('VX official account: Orange code / juzicode.com')
print('cv2.__version__:',cv2.__version__)

img_src = cv2.imread('..\\lena.jpg')
b,g,r = cv2.split(img_src)  
histSize = 256
histRange = (0, histSize)  #With the range (0, 256) matching histSize, every pixel value gets its own bin
b_hist = cv2.calcHist([b], [0], None, [histSize], histRange)
g_hist = cv2.calcHist([g], [0], None, [histSize], histRange)
r_hist = cv2.calcHist([r], [0], None, [histSize], histRange)

print('b_hist.shape:',b_hist.shape)
min_max = cv2.minMaxLoc(b_hist)
print('b_hist.minMaxLoc:',min_max)
print('b_hist.Non-zero number:',cv2.countNonZero(b_hist))
for i,v in enumerate(b_hist):
    print(v,end=' ')
    if (i+1)%16==0:print()

Operation results:

VX official account: Orange code / juzicode.com
cv2.__version__: 4.5.3
b_hist.shape: (256, 1)
b_hist.minMaxLoc: (0.0, 3260.0, (0, 0), (0, 95))
b_hist.Non-zero number: 191
[0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.]
[0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [1.] [0.] [0.] [0.] [1.]
[1.] [0.] [3.] [2.] [6.] [8.] [2.] [12.] [16.] [34.] [42.] [33.] [64.] [74.] [123.] [148.]
[229.] [279.] [333.] [452.] [573.] [740.] [938.] [1137.] [1294.] [1616.] [1779.] [2091.] [2260.] [2464.] [2684.] [2690.]
[2732.] [2793.] [2807.] [2763.] [2782.] [2741.] [2610.] [2649.] [2710.] [2839.] [2981.] [2908.] [3101.] [3091.] [3102.] [3148.]
[3026.] [2967.] [3032.] [2851.] [2872.] [2776.] [2783.] [2818.] [2831.] [2970.] [2929.] [2959.] [3217.] [3209.] [3132.] [3260.]
[3253.] [3117.] [2999.] [2868.] [2785.] [2655.] [2628.] [2558.] [2620.] [2613.] [2614.] [2746.] [2775.] [2751.] [2661.] [2641.]
[2617.] [2591.] [2563.] [2571.] [2601.] [2792.] [2829.] [2862.] [3042.] [3190.] [3250.] [3225.] [3190.] [2933.] [2740.] [2422.]
[2197.] [1949.] [1754.] [1489.] [1302.] [1116.] [1045.] [968.] [848.] [863.] [863.] [883.] [878.] [837.] [848.] [862.]
[846.] [786.] [798.] [801.] [888.] [892.] [868.] [906.] [835.] [858.] [964.] [1018.] [976.] [1019.] [972.] [956.]
[885.] [965.] [948.] [929.] [919.] [821.] [856.] [838.] [777.] [755.] [779.] [741.] [719.] [698.] [618.] [581.]
[619.] [585.] [580.] [583.] [569.] [617.] [584.] [621.] [620.] [625.] [569.] [548.] [460.] [401.] [380.] [359.]
[324.] [267.] [200.] [201.] [134.] [138.] [130.] [125.] [118.] [99.] [115.] [82.] [57.] [58.] [51.] [45.]
[34.] [27.] [27.] [21.] [11.] [6.] [5.] [2.] [6.] [2.] [1.] [0.] [2.] [1.] [1.] [0.]
[0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.]
[0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.] [0.]

As the b_hist.shape attribute shows, b_hist is a two-dimensional numpy array with 256 rows and 1 column. The number of rows equals histSize=256. Changing histSize changes b_hist.shape accordingly:

histSize = 156
b_hist = cv2.calcHist([b], [0], None, [histSize], histRange)
print('b_hist.shape:',b_hist.shape)

-----Operation results:
b_hist.shape: (156, 1)

In the previous example, images is a list containing one single-channel image and channels is always [0]. images can also contain a single multi-channel image, in which case channels selects the channel number within that image:

img_src = cv2.imread('..\\lena.jpg')
#b,g,r = cv2.split(img_src)  
histSize = 256
histRange = (0, histSize)  #With the range (0, 256) matching histSize, every pixel value gets its own bin
b_hist = cv2.calcHist([img_src], [0], None, [histSize], histRange) 
g_hist = cv2.calcHist([img_src], [1], None, [histSize], histRange)
r_hist = cv2.calcHist([img_src], [2], None, [histSize], histRange)

b,g,r = cv2.split(img_src)  
b_hist2 = cv2.calcHist([b], [0], None, [histSize], histRange) 
g_hist2 = cv2.calcHist([g], [0], None, [histSize], histRange)
r_hist2 = cv2.calcHist([r], [0], None, [histSize], histRange) 

print('b_hist Difference:',cv2.countNonZero(cv2.absdiff(b_hist,b_hist2)))
print('g_hist Difference:',cv2.countNonZero(cv2.absdiff(g_hist,g_hist2)))
print('r_hist Difference:',cv2.countNonZero(cv2.absdiff(r_hist,r_hist2)))

Operation results:

b_hist Difference: 0
g_hist Difference: 0
r_hist Difference: 0

The results show no difference between the histograms calculated in the two ways. In this example, the arguments "[img_src], [0]" select channel 0 of img_src, which is the B channel of img_src.

Besides a single multi-channel color image, images can also contain several multi-channel color images. The channels argument then becomes a little more complex: the channel numbers of a later image continue on from those of the images before it. For example, when a 3-channel img_src1 and a 3-channel img_src2 are passed in as images=[img_src1, img_src2], the channels of img_src1 are still [0], [1] and [2], while the channels of img_src2 are offset by the channel count of the previous image and become [3], [4] and [5]. The following example verifies this by passing in the same 3-channel image twice; the histograms of channels 3, 4 and 5 should then equal the histograms of channels 0, 1 and 2:

import numpy as np 
import cv2
print('VX official account: Orange code / juzicode.com')
print('cv2.__version__:',cv2.__version__) 

img_src = cv2.imread('..\\lena.jpg') 
histSize = 256
histRange = (0, histSize)  #With the range (0, 256) matching histSize, every pixel value gets its own bin
b_hist = cv2.calcHist([img_src,img_src], [0], None, [histSize], histRange) 
g_hist = cv2.calcHist([img_src,img_src], [1], None, (histSize,), histRange)
r_hist = cv2.calcHist((img_src,img_src), [2], None, [histSize], histRange) 
#Channel numbers 3, 4 and 5 refer to the B, G and R channels of the second input image
b_hist2 = cv2.calcHist((img_src,img_src), [3], None, [histSize], histRange) 
g_hist2 = cv2.calcHist((img_src,img_src), [4], None, [histSize], histRange)
r_hist2 = cv2.calcHist((img_src,img_src), [5], None, [histSize], histRange) 

print('b_hist Difference:',cv2.countNonZero(cv2.absdiff(b_hist,b_hist2)))
print('g_hist Difference:',cv2.countNonZero(cv2.absdiff(g_hist,g_hist2)))
print('r_hist Difference:',cv2.countNonZero(cv2.absdiff(r_hist,r_hist2)))

Operation results:

b_hist Difference: 0
g_hist Difference: 0
r_hist Difference: 0

The same rule extends to other combinations of inputs, for example a color image followed by its three split channels:

b,g,r = cv2.split(img_src)
histSize = 256
histRange = (0, histSize)  #With the range (0, 256) matching histSize, every pixel value gets its own bin
b_hist = cv2.calcHist([img_src,b,g,r], [0], None, [histSize], histRange) 
g_hist = cv2.calcHist([img_src,b,g,r], [1], None, (histSize,), histRange)
r_hist = cv2.calcHist((img_src,b,g,r), [2], None, [histSize], histRange) 
#Channel numbers 3, 4 and 5 now refer to the single-channel images b, g and r
b_hist2 = cv2.calcHist((img_src,b,g,r), [3], None, [histSize], histRange) 
g_hist2 = cv2.calcHist((img_src,b,g,r), [4], None, [histSize], histRange)
r_hist2 = cv2.calcHist((img_src,b,g,r), [5], None, [histSize], histRange) 

3. Calculating with calcHist() and displaying with matplotlib plot()

In the first section, matplotlib's hist() method computed and displayed the histogram in one step. Here calcHist() computes the histogram as an array: the array index is the pixel value and gives the x-axis, and the array element is the number of pixels with that value and gives the y-axis. matplotlib's plot() method can therefore also be used to draw the histogram:

import numpy as np
import matplotlib.pyplot as plt
import cv2
print('VX official account: Orange code / juzicode.com')
print('cv2.__version__:',cv2.__version__)
plt.rc('font',family='Youyuan',size='9')
plt.rc('axes',unicode_minus='False')

img_src = cv2.imread('..\\lena.jpg')
b,g,r = cv2.split(img_src)  
histSize = 256
histRange = (0, histSize)  #With the range (0, 256) matching histSize, every pixel value gets its own bin
b_hist = cv2.calcHist([b], [0], None, [histSize], histRange) 
g_hist = cv2.calcHist([g], [0], None, [histSize], histRange) 
r_hist = cv2.calcHist([r], [0], None, [histSize], histRange) 

#Display image
fig,ax = plt.subplots(2,2)
ax[0,0].set_title('b hist')
ax[0,0].plot(b_hist) 
ax[0,1].set_title('g hist')
ax[0,1].plot(g_hist)
ax[1,0].set_title('r hist')
ax[1,0].plot(r_hist)
ax[1,1].set_title('src') 
ax[1,1].imshow(cv2.cvtColor(img_src,cv2.COLOR_BGR2RGB))  
#ax[0,0].axis('off');ax[0,1].axis('off');ax[1,0].axis('off');
ax[1,1].axis('off')#Turn off axis display
plt.show() 

Operation results:

The histogram drawn this way has the same outline as the one drawn by matplotlib's hist() method, but it appears as a curve rather than as columns.

4. Drawing the histogram with OpenCV

The histogram calculated by calcHist() is nominally a "graph", but OpenCV's imshow() cannot display it directly in a useful form; it has to be converted first. The histogram is a two-dimensional array with histSize rows and 1 column, so each row is itself a numpy array containing a single element. For example, the element of b_hist at index 55:

print('b_hist[55]:',b_hist[55])
print('int(b_hist[55]):',int(b_hist[55]))
-----Operation results:
b_hist[55]: [122.07056]
int(b_hist[55]): 122

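Here int() converts the one-element numpy array directly to a Python int, truncating the fractional part rather than rounding it. The array index 55 is therefore the x-axis value, and the truncated 122 is the y-axis value. (The fractional value suggests that b_hist had already been normalized when it was printed, as is done in the example below.)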
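
As a side note, newer NumPy releases emit a deprecation warning when a one-element array such as b_hist[55] is passed to int(). A sketch of a more explicit conversion is shown below; the small hist array is a made-up stand-in for b_hist.

```python
import numpy as np

# Made-up stand-in for a normalized histogram of shape (N, 1).
hist = np.array([[122.07056], [3.9]], dtype=np.float32)

y_trunc = int(hist[0, 0])                # index both dimensions, then truncate
y_small = int(hist[1, 0])                # truncation drops the .9
y_round = int(round(float(hist[1, 0])))  # explicit rounding if that is wanted
ys = hist.ravel().astype(int)            # convert the whole column at once

print(y_trunc, y_small, y_round)
print(ys)
```

Indexing with [i, 0] returns a scalar instead of a one-element array, so int() receives exactly one number.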

The following example draws the histograms of the B, G and R channels of the lena image. After calculating the three histograms with calcHist(), it creates a numpy array img_hist of size hist_img_w, hist_img_h = 512, 350 to hold the visualized histogram image. The histogram values of each channel are normalized to the height of img_hist so that nothing is drawn beyond the image boundary. Then, looping over the histSize bins, the line() method draws one line segment per step, each spanning hist_img_w / histSize pixels horizontally:

import numpy as np
import cv2 
print('VX official account: Orange code / juzicode.com')
print('cv2.__version__:',cv2.__version__)

img_src = cv2.imread('..\\lena.jpg') 
histSize = 256
histRange = (0, histSize)  
b_hist = cv2.calcHist([img_src], [0], None, [histSize], histRange) 
g_hist = cv2.calcHist([img_src], [1], None, [histSize], histRange) 
r_hist = cv2.calcHist([img_src], [2], None, [histSize], histRange) 
#Create an empty image to draw the histogram on
hist_img_w,hist_img_h = 512,350  
img_hist = np.zeros((hist_img_h, hist_img_w, 3), dtype=np.uint8)
#Normalize to the range from 0 to the histogram display height
cv2.normalize(b_hist, b_hist, alpha=0, beta=hist_img_h, norm_type=cv2.NORM_MINMAX)
cv2.normalize(g_hist, g_hist, alpha=0, beta=hist_img_h, norm_type=cv2.NORM_MINMAX)
cv2.normalize(r_hist, r_hist, alpha=0, beta=hist_img_h, norm_type=cv2.NORM_MINMAX)
#Draw: loop over the histSize bins, advancing bin_w pixels in x at each step
bin_w = int(round( hist_img_w/histSize ))
print('bin_w',bin_w)
for i in range(1, histSize):
    cv2.line(img_hist, 
            ( bin_w*(i-1), hist_img_h - int(b_hist[i-1]) ),#Starting point position
            ( bin_w*(i)  , hist_img_h - int(b_hist[i]) ),  #End point position
            ( 255, 0, 0), thickness=2)
    cv2.line(img_hist, 
            ( bin_w*(i-1), hist_img_h - int(g_hist[i-1]) ),
            ( bin_w*(i)  , hist_img_h - int(g_hist[i]) ),
            ( 0, 255, 0), thickness=2)
    cv2.line(img_hist, 
            ( bin_w*(i-1), hist_img_h - int(r_hist[i-1]) ),
            ( bin_w*(i)  , hist_img_h - int(r_hist[i]) ),
            ( 0, 0, 255), thickness=2)
cv2.imshow('img_src', img_src)
cv2.imshow('img_hist', img_hist)
cv2.waitKey()

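Running the example opens two windows: the source image, and the histogram image with the B, G and R curves drawn in blue, green and red on a black background.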

5. 2D histogram

The 2D histogram is also calculated with calcHist(). The arguments are similar to those for a one-dimensional histogram, with a few differences.

In the one-dimensional examples above, the channels argument of calcHist() has a single element that selects one channel of the input. A 2D histogram needs two channels, so the images argument must provide at least two channels in total. histSize likewise grows to two elements: histSize[0] is the number of bins for channels[0] and histSize[1] is the number of bins for channels[1]. The ranges argument grows to four elements: elements [0] and [1] give the value range for channels[0], and elements [2] and [3] give the value range for channels[1]. The following example calculates the 2D histogram of the H and S components of the lena image in HSV color space:

import numpy as np
import cv2 
print('VX official account: Orange code / juzicode.com')
print('cv2.__version__:',cv2.__version__)

img_src = cv2.imread('..\\lena.jpg') 
img_hsv = cv2.cvtColor(img_src,cv2.COLOR_BGR2HSV)
img_hist = cv2.calcHist( [img_hsv], [0, 1], None, [180, 256], [0, 180, 0, 256] )
print('img_hist.shape:',img_hist.shape)
#Scale so that the largest count maps to 255
minmax=cv2.minMaxLoc(img_hist)
img_hist2 = (255*img_hist/minmax[1]).astype(np.uint8)
#display
cv2.imshow('img_hist', img_hist)
cv2.imshow('img_hist2', img_hist2)
cv2.waitKey()

In this example, channels=[0, 1] selects the H and S components for the histogram; histSize=[180, 256] sets 180 bins for the H component and 256 bins for the S component; ranges=[0, 180, 0, 256] sets the value range of the H component to 0 ~ 180 and that of the S component to 0 ~ 256, with the upper bounds excluded.

Operation results:

Summary: in a one-dimensional histogram, the x direction represents the pixel value and the y direction represents the number of pixels with that value; the left side of the x axis counts the darker pixels and the right side counts the brighter pixels. Besides the usual brightness (gray level), an image converted to HSV color space can also be described by histograms of hue and saturation. A two-dimensional histogram is calculated with the same call as a one-dimensional one, with more elements in the channels, histSize and ranges arguments, and it can be displayed directly with the imshow() method.

Tags: OpenCV Computer Vision image processing opencv-python

Posted on Thu, 11 Nov 2021 02:12:44 -0500 by rivka