Using machine learning to identify verification codes (from 0 to 1)

Lately I like to put the test-result screenshot first, so you can see the effect up front.

The recognition speed is not great, and the code has not been optimized any further.

This article walks through the whole process, from generating verification codes to recognizing them with machine learning.

The idea behind using machine learning to recognize verification codes: after training on a large amount of data with corresponding labels, the computer learns the differences and relationships between the labels and forms a classifier. Feed a picture into the classifier and it outputs that picture's "label" — and that is the whole recognition process.
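
In scikit-learn terms, that whole pipeline is just fit and predict. A toy sketch (the three-element vectors and labels below are made up purely for illustration):

from sklearn.neighbors import KNeighborsClassifier

X_train = [[0, 0, 1], [1, 1, 0], [0, 1, 1]]   # "pictures" as feature vectors
y_train = ['0', '7', '9']                     # their labels
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X_train, y_train)                     # training: the classifier "learns" the data
print(clf.predict([[0, 0, 1]]))               # recognition: outputs ['0']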

Now, let's get started.

1: Generating the verification code

The verification codes here are generated with Python's PIL library (in practice, its Pillow fork), the de facto standard for image processing in Python. PIL is powerful, and its API is simple and easy to use.

Here is the code:

import random, os
from PIL import ImageFont, Image, ImageDraw, ImageFilter

def auth_code():
    size = (140, 40)                                 # image size
    font_list = list("0123456789")                   # character set for the code
    c_chars = "  ".join(random.sample(font_list, 4)) # 4 distinct digits, two spaces between them
    print(c_chars)
    img = Image.new("RGB", size, (33, 33, 34))       # background color
    draw = ImageDraw.Draw(img)                       # drawing context
    font = ImageFont.truetype("arial.ttf", 23)       # font
    draw.text((5, 4), c_chars, font=font, fill="white")  # text color
    # coefficients for a slight random perspective distortion
    params = [1 - float(random.randint(1, 2)) / 100,
              0,
              0,
              0,
              1 - float(random.randint(1, 10)) / 100,
              float(random.randint(1, 2)) / 500,
              0.001,
              float(random.randint(1, 2)) / 500
              ]
    img = img.transform(size, Image.PERSPECTIVE, params)
    img = img.filter(ImageFilter.EDGE_ENHANCE_MORE)
    img.save(f'./test_img/{c_chars}.png')            # the label is baked into the file name

if __name__ == '__main__':
    if not os.path.exists('./test_img'):
        os.mkdir('./test_img')
    while True:
        auth_code()
        if len(os.listdir('./test_img')) >= 3000:    # raise to 10000 if your machine can take it
            break

After running, test_img fills with verification-code images like the ones shown in the figure. Note that I label each image with its digits right in the file name at generation time, because I don't want to crawl real verification codes and label them by hand. Very tiring!!

The images generated here are still quite clean. If you want noisier, more complex images, take a look at the post linked below.

I previously wrote an article on preprocessing verification codes with OpenCV. The codes in this article need very little preprocessing, but if you are interested: https://blog.csdn.net/weixin_43582101/article/details/90609399
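
For a taste of that kind of cleanup, here is a generic OpenCV sketch (my own recipe, not taken from the linked post; assumes opencv-python is installed):

import cv2   # pip install opencv-python

def clean_captcha(path):
    '''Grayscale + Otsu threshold + median blur, a common cleanup recipe.'''
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    img = cv2.medianBlur(img, 3)   # knock out salt-and-pepper noise
    return img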

2: Verification code segmentation

The next step is to cut each generated verification code into four pieces and sort the pieces into the 0-9 subfolders of train_data_img according to their labels.

That gives us a training set. The segmentation here is also fairly crude: just divide the width by 4 ==

import os
import time
from PIL import Image

def read_img():
    '''Collect generated file names, dropping any unreadable images.'''
    img_array = []
    file_list = os.listdir('./test_img')
    for file in file_list:
        try:
            Image.open('./test_img/' + file).verify()   # raises if the file is corrupted
            img_array.append(file)
        except Exception:
            print(f'{file}: the image is corrupted')
            os.remove('./test_img/' + file)
    return img_array


def sliceImg(img_path, count=4):
    if not os.path.exists('train_data_img'):
        os.mkdir('train_data_img')
    for i in range(10):                                  # one folder per digit 0-9
        if not os.path.exists(f'train_data_img/{i}'):
            os.mkdir(f'train_data_img/{i}')
    img = Image.open('./test_img/' + img_path)
    w, h = img.size
    eachWidth = int((w - 17) / count)                    # 17 px: empirical margin offset
    img_path = img_path.replace(' ', '').split('.')[0]   # '1  2  3  4.png' -> '1234'

    for i in range(count):
        box = (i * eachWidth, 0, (i + 1) * eachWidth, h)
        img.crop(box).save(f'./train_data_img/{img_path[i]}/' + img_path[i] + str(time.time()) + ".png")

if __name__ == '__main__':
    img_array = read_img()
    for i in img_array:
        print(i)
        sliceImg(i)

After running, each folder contains its corresponding digit images, and each image's label is the first character of its file name.

3: Verification code feature extraction

The idea here: use numpy to convert each train_data_img image into a vector. I didn't binarize to 0/1, it's a hassle == (a rough 0/1 variant is sketched after the function below).

from PIL import Image
import numpy as np
import os
from sklearn.neighbors import KNeighborsClassifier as knn

def img2vec(fname):
    '''Convert a picture to a flat grayscale vector'''
    im = Image.open(fname).convert('L')   # grayscale
    im = im.resize((30, 30))              # normalize size
    tmp = np.array(im)
    vec = tmp.ravel()                     # flatten to a 900-element vector
    return vec
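
If you do want the 0/1 version, a minimal sketch might look like this; the threshold of 128 below is an arbitrary pick of mine, not anything tuned:

def img2vec_binary(fname, threshold=128):
    '''Hypothetical 0/1 variant: bright pixels become 1, the rest 0.'''
    im = Image.open(fname).convert('L').resize((30, 30))
    vec = (np.array(im) > threshold).astype(int).ravel()
    return vec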

Then we walk the labeled folders to build the feature matrix and label list:

tarin_img_path = 'train_data_img'
def split_data(paths):
    X = []
    y = []
    for i in os.listdir(tarin_img_path):
        path = os.path.join(tarin_img_path, i)
        fn_list = os.listdir(path)
        for name in fn_list:
            y.append(name[0])
            X.append(img2vec(os.path.join(path,name)))
    return X, y                 # X: feature vectors, y: labels

Then build the classifier:

def knn_clf(X_train,label):
    '''Build classifier'''
    clf = knn()
    clf.fit(X_train,label)
    return clf

I use sklearn's KNN directly here. If you want to see a hand-rolled version, check my earlier post on KNN handwritten-digit recognition: https://blog.csdn.net/weixin_43582101/article/details/88772273

With the classifier built, we can combine the pieces above into a recognition function.

def knn_shib(test_img):
    '''Train on the labeled folders, then predict one sliced digit.'''
    X_train, y_label = split_data(tarin_img_path)
    clf = knn_clf(X_train, y_label)
    result = clf.predict([img2vec(test_img)])
    return result
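
Note that knn_shib retrains the classifier from scratch on every call, which is a big part of why recognition is slow. A sketch of caching the fitted model with joblib (the file name knn_model.pkl is my own choice, not part of the original code):

import os
import joblib   # pip install joblib; the old sklearn.externals.joblib import is deprecated

def get_clf(model_path='knn_model.pkl'):
    '''Load a cached classifier if one exists, otherwise train and save it.'''
    if os.path.exists(model_path):
        return joblib.load(model_path)
    X_train, y_label = split_data(tarin_img_path)
    clf = knn_clf(X_train, y_label)
    joblib.dump(clf, model_path)
    return clf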

4: Verification code recognition

I forgot to set aside a test set earlier, so here I reuse the generation code above to create one.

import random, time
import os
from PIL import ImageFont, Image, ImageDraw, ImageFilter

def auth_code():
    size = (140, 40)                                 # image size
    font_list = list("0123456789")                   # character set for the code
    c_chars = "  ".join(random.sample(font_list, 4)) # 4 distinct digits, two spaces between them
    print(c_chars)
    img = Image.new("RGB", size, (33, 33, 34))       # background color
    draw = ImageDraw.Draw(img)                       # drawing context
    font = ImageFont.truetype("arial.ttf", 23)       # font
    draw.text((5, 4), c_chars, font=font, fill="white")  # text color
    # coefficients for a slight random perspective distortion
    params = [1 - float(random.randint(1, 2)) / 100,
              0,
              0,
              0,
              1 - float(random.randint(1, 10)) / 100,
              float(random.randint(1, 2)) / 500,
              0.001,
              float(random.randint(1, 2)) / 500
              ]
    img = img.transform(size, Image.PERSPECTIVE, params)
    img = img.filter(ImageFilter.EDGE_ENHANCE_MORE)
    random_name = str(time.time())[-7:]              # random-ish name: this time the label is NOT in the file name
    img.save(f'./test_data_img/{random_name}.png')

if __name__ == '__main__':
    if not os.path.exists('./test_data_img'):
        os.mkdir('./test_data_img')
    while True:
        auth_code()
        if len(os.listdir('./test_data_img')) >= 30:
            break

The test images are saved in test_data_img, but each one is still a whole code. To recognize one, first slice it into four pieces using the earlier method, then run each piece through our model.

# import the scripts from sections 3 and 2 above
# (lx3_feature_extraction / lx2_segmentation are my stand-in module names;
#  rename them to match your own files)
from lx3_feature_extraction import *
from lx2_segmentation import *

def sliceImg(img_path, count=4):
    '''Slice one whole test image into `count` digit images.'''
    if not os.path.exists('test_split_img'):
        os.mkdir('test_split_img')
    img = Image.open(img_path)
    w, h = img.size
    eachWidth = int((w - 17) / count)                # same empirical margin offset as before
    for i in range(count):
        box = (i * eachWidth, 0, (i + 1) * eachWidth, h)
        img.crop(box).save('./test_split_img/' + f"{i+1}.png")

if __name__ == '__main__':
    test_data_img = r'test_data_img\.059682.png'
    sliceImg(test_data_img)
    result = []
    for img in sorted(os.listdir('test_split_img')):   # sorted, so slices stay in left-to-right order
        result.append(knn_shib('test_split_img/' + img)[0])
    print(result)

That's basically the end. The code here is demo-grade and unoptimized; plenty of places could be improved, and some details are left unhandled. Interested readers can take it further themselves.
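
One obvious improvement is to measure accuracy properly instead of eyeballing a handful of test images. A sketch using scikit-learn's train_test_split on the sliced training digits (my addition, not part of the original code):

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = split_data(tarin_img_path)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = knn_clf(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))   # per-digit accuracy on held-out slices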

The data and code are on GitHub and can be downloaded directly: https://github.com/lixi5338619/OCR_Yanzhengma/tree/master
