[machine learning] KNN algorithm for handwriting recognition

1. Preface

In the previous blog, iris dataset classification was implemented with the KNN algorithm, and at the end of that post we discussed whether KNN is suitable for image classification. This blog implements handwriting recognition with the KNN algorithm, using a simple drawing tablet so that the claim can be checked interactively.

2. Experimental background

When building a handwritten dataset, the training data and the test data often come from the same person's handwriting. The KNN model then looks friendly on such test data, but this does not match real-world conditions. In this experiment, handwriting collected from different people is therefore used as the training set. To make the process interactive, a simple drawing tablet is implemented to produce the test set.

Dataset type | Source                                                                                   | Number of samples
-------------|------------------------------------------------------------------------------------------|------------------
Training set | Handwriting collected from different people, converted to 32 x 32 0/1 matrices in txt format | 1974
Test set     | A handwriting tablet implemented in Python; the drawing is converted to the same format and predicted | 1

3. Test process

Test steps:
1. Design the tablet
2. Make the test data
3. Load the training data
4. Train with the KNN algorithm
5. Get the prediction result

3.1 Preparation of the tablet and test data

Basic idea: draw on the tablet and save the drawing as a 512 x 512 picture, then convert the picture into a 32 x 32 0/1 matrix and save it in txt format as the test data.

import cv2
import numpy as np

drawing = False    # True while the left mouse button is held down

# Tablet, implemented with OpenCV mouse events
def draw(event, x, y, flags, param):
    global ix, iy, drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            cv2.circle(img, (x, y), 30, (0, 0, 0), -1)   # draw a filled black stroke
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

if __name__ == "__main__":

    # Make a picture for the test set through the tablet
    img = np.zeros((512, 512, 3), np.uint8)
    img[:] = 255                      # white canvas
    cv2.namedWindow('image')
    cv2.setMouseCallback('image', draw)
    while True:
        cv2.imshow('image', img)
        if cv2.waitKey(1) & 0xFF == ord(' '):   # press space to save and quit
            cv2.imwrite('1.jpg', img)
            break
    cv2.destroyAllWindows()

    # Convert the picture into a 0/1 matrix and record it in a txt file
    img1 = cv2.imread('1.jpg', cv2.IMREAD_GRAYSCALE)
    res = cv2.resize(img1, (32, 32), interpolation=cv2.INTER_CUBIC)
    pic = []
    for i in range(32):
        for j in range(32):
            if res[i][j] <= 200:      # dark pixel -> stroke
                res[i][j] = 1
            else:
                res[i][j] = 0
            pic.append(int(res[i][j]))

    filename = 'out.txt'
    with open(filename, 'w') as name:
        for i in range(32 * 32):
            name.write(str(pic[i]))
            if (i + 1) % 32 == 0:
                name.write("\n")


3.2 Loading the training data and building the KNN model

The KNN algorithm here is the same as in the previous iris classification post. First, the test data is reshaped to match the dimensions of the training data, and the Euclidean distance to every training sample is computed and sorted (for iris classification the distance is between feature vectors; for handwriting recognition it is between images). Finally, the k nearest samples are selected and the most frequent label among them is taken as the prediction.

There are two differences in how the data is loaded. First, the dimensionality is richer: each (32, 32) matrix must be flattened to (1, 1024) so that the Euclidean distance can be computed. Second, the labels must be set by hand; the common convention of encoding the label in the file name is used here.

import numpy as np
from os import listdir
from collections import Counter

# Flatten the image from (32, 32) to (1, 1024) so the KNN distance can be computed
def img2vector(filename):
    vector = np.zeros((1, 1024))
    with open(filename) as file:
        for i in range(32):
            line = file.readline()
            for j in range(32):
                vector[0, 32*i + j] = int(line[j])
    return vector

def KNN(Test, Train, labels, k):
    dataSetSize = Train.shape[0]

    # Euclidean distance between the test image and every training image
    distance = np.tile(Test, (dataSetSize, 1)) - Train
    sqdistance = distance ** 2
    sqdistances = sqdistance.sum(axis=1)
    distances = sqdistances ** 0.5

    # Indices of the training samples sorted by distance
    sortedDistIndicies = distances.argsort()

    # Collect the labels of the k nearest neighbours
    result = []
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        result.append(voteIlabel)
    print(result)

    # Majority vote: the most common label wins
    collection = Counter(result)
    result = collection.most_common(1)

    return result[0][0]

def main():

    # Load the training set and build the label list from the file names
    labels = []
    Train_list = listdir('knn/digits/trainingDigits')
    batch = len(Train_list)
    Train = np.zeros((batch, 1024))

    for i in range(batch):
        name = Train_list[i]                        # file name, e.g. "0_0.txt"
        label = name.split('.')[0].split('_')[0]    # the digit before the underscore is the label
        labels.append(label)
        Train[i, :] = img2vector('knn/digits/trainingDigits/%s' % name)

    # The tablet drawing was converted to a 0/1 matrix and saved as out.txt; use it as the test set
    Test = img2vector("out.txt")
    result = KNN(Test, Train, labels, 3)
    print(result)

3.3 Result prediction

Training data:
Link: https://pan.baidu.com/s/1Zh0rYwvovmm4drEOpjLS8A 
Extraction code: mjll 

Full code:

import cv2
import numpy as np
from os import listdir
from collections import Counter

drawing = False    # True while the left mouse button is held down

# Flatten the image from (32, 32) to (1, 1024) so the KNN distance can be computed
def img2vector(filename):
    vector = np.zeros((1, 1024))
    with open(filename) as file:
        for i in range(32):
            line = file.readline()
            for j in range(32):
                vector[0, 32*i + j] = int(line[j])
    return vector

def KNN(Test, Train, labels, k):
    dataSetSize = Train.shape[0]

    # Euclidean distance between the test image and every training image
    distance = np.tile(Test, (dataSetSize, 1)) - Train
    sqdistance = distance ** 2
    sqdistances = sqdistance.sum(axis=1)
    distances = sqdistances ** 0.5

    # Indices of the training samples sorted by distance
    sortedDistIndicies = distances.argsort()

    # Collect the labels of the k nearest neighbours
    result = []
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        result.append(voteIlabel)
    print(result)

    # Majority vote: the most common label wins
    collection = Counter(result)
    result = collection.most_common(1)

    return result[0][0]

def main():

    # Load the training set and build the label list from the file names
    labels = []
    Train_list = listdir('knn/digits/trainingDigits')
    batch = len(Train_list)
    Train = np.zeros((batch, 1024))

    for i in range(batch):
        name = Train_list[i]                        # file name, e.g. "0_0.txt"
        label = name.split('.')[0].split('_')[0]    # the digit before the underscore is the label
        labels.append(label)
        Train[i, :] = img2vector('knn/digits/trainingDigits/%s' % name)

    # The tablet drawing was converted to a 0/1 matrix and saved as out.txt; use it as the test set
    Test = img2vector("out.txt")
    result = KNN(Test, Train, labels, 3)
    print("Prediction result:", result)

# Tablet, implemented with OpenCV mouse events
def draw(event, x, y, flags, param):
    global ix, iy, drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            cv2.circle(img, (x, y), 30, (0, 0, 0), -1)   # draw a filled black stroke
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

if __name__ == "__main__":

    # Make a picture for the test set through the tablet
    img = np.zeros((512, 512, 3), np.uint8)
    img[:] = 255                      # white canvas
    cv2.namedWindow('image')
    cv2.setMouseCallback('image', draw)
    while True:
        cv2.imshow('image', img)
        if cv2.waitKey(1) & 0xFF == ord(' '):   # press space to save and quit
            cv2.imwrite('1.jpg', img)
            break
    cv2.destroyAllWindows()

    # Convert the picture into a 0/1 matrix and record it in a txt file
    img1 = cv2.imread('1.jpg', cv2.IMREAD_GRAYSCALE)
    res = cv2.resize(img1, (32, 32), interpolation=cv2.INTER_CUBIC)
    pic = []
    for i in range(32):
        for j in range(32):
            if res[i][j] <= 200:      # dark pixel -> stroke
                res[i][j] = 1
            else:
                res[i][j] = 0
            pic.append(int(res[i][j]))

    filename = 'out.txt'
    with open(filename, 'w') as name:
        for i in range(32 * 32):
            name.write(str(pic[i]))
            if (i + 1) % 32 == 0:
                name.write("\n")

    # Start the test
    main()

4. Summary

If you are interested, you can copy and run the code directly; the prediction results are quite impressive. However, compared with deep learning methods the accuracy falls somewhat short, because the semantic content of an image is high-level information that plain clustering and distance comparisons cannot capture. When running the KNN experiment, try printing the distance between the test data and each training sample: once there are enough samples, you will find quite a few that lie at almost the same distance yet carry different labels, which exposes a weakness of the KNN algorithm.
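
As a minimal sketch of that suggestion (it reuses the Train, Test and labels variables built in main() above; the helper name inspect_neighbors is purely illustrative), the nearest distances and their labels can be printed like this:

# Minimal sketch: print the labels and Euclidean distances of the k nearest
# training images. Assumes Train is the (N, 1024) training matrix, labels is
# the matching list of digit strings and Test is the (1, 1024) vector
# produced by img2vector(); numpy is imported as np as in the code above.
def inspect_neighbors(Test, Train, labels, k=10):
    distances = np.sqrt(((Train - Test) ** 2).sum(axis=1))   # distance to every training image
    order = distances.argsort()                              # nearest first
    for idx in order[:k]:
        print("label = %s, distance = %.2f" % (labels[idx], distances[idx]))

Calling inspect_neighbors(Test, Train, labels) just before KNN() in main() makes it easy to check whether several of the closest training images sit at almost the same distance while carrying different digit labels.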

In conclusion, the KNN algorithm can be used on simple image data (e.g. binary or grayscale digits), but it is not well suited to RGB images or even higher-dimensional data.
