[machine learning] KNN algorithm for handwriting recognition
1. Preface
In the previous blog post, the iris dataset was classified with the KNN algorithm, and at the end we discussed whether KNN is suitable for image classification. This post implements handwriting recognition with a handwriting tablet using the KNN algorithm, and verifies through human-computer interaction whether KNN is suitable for image classification.
2. Experimental background
When building a handwritten dataset, the training data and the test data often come from the same person's handwriting. In that case the KNN model performs flatteringly well on the test data, but this does not match real-world conditions. Therefore, in this experiment the handwriting of different people is used as the training set. To add some human-computer interaction, we implemented a simple tablet to create the test set.
Dataset type | Source | Number of samples |
---|---|---|
Training set | Handwriting collected from different people, converted into txt files of 32 x 32 0/1 matrices | 1974 |
Test set | A handwriting tablet implemented in Python; the writing it captures is predicted | 1 |
3. Test process
Test steps: 1. Design the tablet 2. Create the test data 3. Load the training data 4. Run the KNN algorithm 5. Obtain the prediction results
3.1 Preparation of the tablet and test data
Basic idea: write on the tablet and save the result as a 512 x 512 picture, then convert the picture into a 32 x 32 0/1 matrix and save it in txt format as the test data.
```python
import cv2
import numpy as np

drawing = False

# Tablet, implemented with OpenCV: draw while the left mouse button is held down
def draw(event, x, y, flags, param):
    global ix, iy, drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            cv2.circle(img, (x, y), 30, (0, 0, 0), -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

if __name__ == "__main__":
    # Make a picture test set through the tablet
    img = np.zeros((512, 512, 3), np.uint8)
    img[:] = 255                                  # white canvas
    cv2.namedWindow('image')
    cv2.setMouseCallback('image', draw)
    while True:
        cv2.imshow('image', img)
        if cv2.waitKey(1) & 0xFF == ord(' '):     # press space to save and exit
            cv2.imwrite('1.jpg', img)
            break
    cv2.destroyAllWindows()

    # Convert the picture into a 0/1 matrix and record it in a txt file
    img1 = cv2.imread('1.jpg', cv2.IMREAD_GRAYSCALE)
    res = cv2.resize(img1, (32, 32), interpolation=cv2.INTER_CUBIC)
    pic = []
    for i in range(32):
        for j in range(32):
            # dark pixels (ink) become 1, light background becomes 0
            pic.append(1 if res[i][j] <= 200 else 0)
    with open('out.txt', 'w') as name:
        for i in range(32 * 32):
            name.write(str(pic[i]))
            if (i + 1) % 32 == 0:
                name.write("\n")
```
3.2 Load the training data and build the KNN model
The KNN algorithm here is the same as in the previous iris classification post. First the test data is reshaped to match the dimensions of the training data, then the training samples are sorted by Euclidean distance (iris classification computes the distance between feature vectors, while handwriting recognition computes the distance between pictures). Finally, the K nearest samples are selected and the most frequent label among them is taken as the prediction.
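As a quick illustration of the distance-plus-voting step, here is a toy sketch using made-up 4-pixel vectors rather than the real 1024-dimensional data:

```python
import numpy as np
from collections import Counter

# Toy example: three 4-pixel training "images" with labels, one test vector
Train = np.array([[1, 0, 1, 0],    # label "0"
                  [1, 1, 1, 0],    # label "0"
                  [0, 1, 0, 1]])   # label "1"
labels = ["0", "0", "1"]
Test = np.array([[1, 0, 1, 1]])

# Euclidean distance from the test vector to every training vector
distances = np.sqrt(((np.tile(Test, (Train.shape[0], 1)) - Train) ** 2).sum(axis=1))
print(distances)   # nearest is Train[0], then Train[1], then Train[2]

# Take the k = 2 nearest neighbours and vote on the label
k = 2
nearest = [labels[i] for i in distances.argsort()[:k]]
print(Counter(nearest).most_common(1)[0][0])   # -> "0"
```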
There are two differences in how the data is loaded. First, the dimensional information is richer: each (32, 32) image must be flattened into a (1, 1024) vector to make the Euclidean distance calculation convenient. Second, the labels have to be set by ourselves; the common approach of encoding the label in the file name is used here.
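A minimal sketch of both points, using a dummy 32 x 32 matrix and the sample file name "0_0.txt" mentioned in the code below:

```python
import numpy as np

# Point 1: a 32 x 32 matrix flattened into a (1, 1024) row vector,
# the same shape that img2vector() produces for the distance calculation
mat = np.zeros((32, 32))
print(mat.reshape(1, 1024).shape)        # -> (1, 1024)

# Point 2: the label is encoded in the file name; "0_0.txt" means digit 0
name = "0_0.txt"
print(name.split('.')[0].split('_')[0])  # -> "0"
```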
```python
import numpy as np
from os import listdir
from collections import Counter

# Flatten each 32 x 32 image into a (1, 1024) vector so that
# the KNN distance calculation can be done on row vectors
def img2vector(filename):
    vector = np.zeros((1, 1024))
    with open(filename) as file:
        for i in range(32):
            line = file.readline()
            for j in range(32):
                vector[0, 32 * i + j] = int(line[j])
    return vector

def KNN(Test, Train, labels, k):
    dataSetSize = Train.shape[0]
    # Euclidean distance between the test vector and every training vector
    distance = np.tile(Test, (dataSetSize, 1)) - Train
    sqdistance = distance ** 2
    sqdistances = sqdistance.sum(axis=1)
    distances = sqdistances ** 0.5
    # Indices of the training samples sorted by distance
    sortedDistIndicies = distances.argsort()
    # Collect the labels of the K nearest neighbours and vote
    result = []
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        result.append(voteIlabel)
    print(result)
    collection = Counter(result)
    result = collection.most_common(1)
    return result[0][0]

def main():
    # Load and process the training set and labels
    labels = []
    Train_list = listdir('knn/digits/trainingDigits')
    batch = len(Train_list)
    Train = np.zeros((batch, 1024))
    for i in range(batch):
        name = Train_list[i]                      # each file is named like "0_0.txt"
        label = name.split('.')[0].split('_')[0]  # the digit before "_" is the label
        labels.append(label)
        Train[i, :] = img2vector('knn/digits/trainingDigits/%s' % name)
    # The tablet image has already been converted into a 0/1 matrix
    # and saved in out.txt, which serves as the test set
    Test = img2vector("out.txt")
    result = KNN(Test, Train, labels, 3)
    print(result)
```
3.3 Result prediction
Training data: Link: https://pan.baidu.com/s/1Zh0rYwvovmm4drEOpjLS8A Extraction code: mjll
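Besides predicting the single tablet sample, you can also estimate accuracy on a held-out folder of txt digits. The sketch below is hypothetical: it assumes a 'knn/digits/testDigits' directory laid out with the same "label_index.txt" naming scheme as the training folder, and reuses the img2vector and KNN functions defined above.

```python
from os import listdir

# Hypothetical batch evaluation over a folder of labelled test txt files
def evaluate(test_dir, Train, labels, k=3):
    errors = 0
    files = listdir(test_dir)
    for name in files:
        true_label = name.split('.')[0].split('_')[0]   # label from the file name
        Test = img2vector('%s/%s' % (test_dir, name))
        if KNN(Test, Train, labels, k) != true_label:
            errors += 1
    print("error rate: %.4f" % (errors / len(files)))
```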
Full code:
```python
import cv2
import numpy as np
from os import listdir
from collections import Counter

drawing = False

# Flatten each 32 x 32 image into a (1, 1024) vector so that
# the KNN distance calculation can be done on row vectors
def img2vector(filename):
    vector = np.zeros((1, 1024))
    with open(filename) as file:
        for i in range(32):
            line = file.readline()
            for j in range(32):
                vector[0, 32 * i + j] = int(line[j])
    return vector

def KNN(Test, Train, labels, k):
    dataSetSize = Train.shape[0]
    # Euclidean distance between the test vector and every training vector
    distance = np.tile(Test, (dataSetSize, 1)) - Train
    sqdistance = distance ** 2
    sqdistances = sqdistance.sum(axis=1)
    distances = sqdistances ** 0.5
    # Indices of the training samples sorted by distance
    sortedDistIndicies = distances.argsort()
    # Collect the labels of the K nearest neighbours and vote
    result = []
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        result.append(voteIlabel)
    print(result)
    collection = Counter(result)
    result = collection.most_common(1)
    return result[0][0]

def main():
    # Load and process the training set and labels
    labels = []
    Train_list = listdir('knn/digits/trainingDigits')
    batch = len(Train_list)
    Train = np.zeros((batch, 1024))
    for i in range(batch):
        name = Train_list[i]                      # each file is named like "0_0.txt"
        label = name.split('.')[0].split('_')[0]  # the digit before "_" is the label
        labels.append(label)
        Train[i, :] = img2vector('knn/digits/trainingDigits/%s' % name)
    # The tablet image has already been converted into a 0/1 matrix
    # and saved in out.txt, which serves as the test set
    Test = img2vector("out.txt")
    result = KNN(Test, Train, labels, 3)
    print("Forecast results:", result)

# Tablet, implemented with OpenCV: draw while the left mouse button is held down
def draw(event, x, y, flags, param):
    global ix, iy, drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            cv2.circle(img, (x, y), 30, (0, 0, 0), -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

if __name__ == "__main__":
    # Make a picture test set through the tablet
    img = np.zeros((512, 512, 3), np.uint8)
    img[:] = 255                                  # white canvas
    cv2.namedWindow('image')
    cv2.setMouseCallback('image', draw)
    while True:
        cv2.imshow('image', img)
        if cv2.waitKey(1) & 0xFF == ord(' '):     # press space to save and exit
            cv2.imwrite('1.jpg', img)
            break
    cv2.destroyAllWindows()

    # Convert the picture into a 0/1 matrix and record it in a txt file
    img1 = cv2.imread('1.jpg', cv2.IMREAD_GRAYSCALE)
    res = cv2.resize(img1, (32, 32), interpolation=cv2.INTER_CUBIC)
    pic = []
    for i in range(32):
        for j in range(32):
            # dark pixels (ink) become 1, light background becomes 0
            pic.append(1 if res[i][j] <= 200 else 0)
    with open('out.txt', 'w') as name:
        for i in range(32 * 32):
            name.write(str(pic[i]))
            if (i + 1) % 32 == 0:
                name.write("\n")

    # Start the test
    main()
```
4. Summary
If you are interested, you can copy and run the code directly; the prediction results are quite impressive. However, compared with deep learning methods the accuracy falls somewhat short. This is because the semantic content of a picture is high-level information that cannot be captured by conventional clustering and distance measures. When running the KNN experiment, try printing the distance between the test sample and each training sample. With enough samples you will find quite a few samples at nearly the same distance but with different labels, which reflects a weakness of the KNN algorithm.
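For the diagnostic suggested above, here is a small sketch (a hypothetical helper, not part of the original code) that prints the distance from the test vector to every training sample together with its label:

```python
import numpy as np

# Hypothetical helper: list the nearest training samples with their labels,
# sorted from nearest to farthest, to see how similar distances overlap labels
def inspect_distances(Test, Train, labels, top=10):
    distances = np.sqrt(((np.tile(Test, (Train.shape[0], 1)) - Train) ** 2).sum(axis=1))
    for idx in distances.argsort()[:top]:
        print("label=%s  distance=%.2f" % (labels[idx], distances[idx]))
```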
In conclusion, the KNN algorithm can be tried on simple image data (e.g. grayscale or binary data), but it is not well suited to RGB images or even higher-dimensional data.