Hello, I'm Dafei. Today I bring you an interesting script implemented in Python: screenshot text recognition.
1, Tool preparation
System: Windows 10
Python version: Python 3.8.6
PyCharm version: PyCharm 2021.1.2 (Professional Edition)
2, Get a Baidu Intelligent Cloud token
After logging in to Baidu AI Cloud, open Artificial Intelligence -> Character Recognition, go to the management page and create a character recognition application.
After creating the application, record the AppID, API Key and Secret Key shown in the console.
Next, obtain the token as described in the official documents.
# encoding:utf-8
import requests

# client_id is the AK and client_secret is the SK obtained on the official website
host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=wgEHks0l6MCpalbs3lPuFX1U&client_secret=Z4Rn4ghBx9k06fUYPmSEIRbCFvWFxLyQ'
response = requests.get(host)
if response:
    print(response.json()['access_token'])
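Hard-coding the keys in the URL makes the script awkward to share. As a minimal sketch, the same request can be built from separate parameters. The helper names build_token_params and fetch_token are illustrative and do not come from the official documents.

```python
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"


def build_token_params(api_key, secret_key):
    """Return the query parameters used by the token request above."""
    return {
        "grant_type": "client_credentials",
        "client_id": api_key,         # AK from the console
        "client_secret": secret_key,  # SK from the console
    }


def fetch_token(api_key, secret_key, timeout=10):
    """Request an access token; raises if the HTTP request fails."""
    # Imported here so build_token_params can be used without requests installed.
    import requests

    response = requests.get(
        TOKEN_URL,
        params=build_token_params(api_key, secret_key),
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()["access_token"]
```

Calling fetch_token('your AK', 'your SK') should return the same access_token string that the script above prints. The keys can then be read from a config file or environment variables instead of living in the source.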
3, Baidu interface call
Use the obtained token to call the Baidu interface, which recognizes the picture and extracts the text.
# encoding:utf-8
import requests
import base64

'''
Universal character recognition (high precision version)
'''
request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"
# Open the picture file in binary mode and base64-encode it
with open('picture.png', 'rb') as f:
    img = base64.b64encode(f.read())
params = {"image": img}
# Token obtained in the previous step
access_token = '24.0d99efe8a0454ffd8d620b632c58cccc.2592000.1639986425.282335-24065278'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/x-www-form-urlencoded'}
response = requests.post(request_url, data=params, headers=headers)
if response:
    print(response.json())
Like the token, the recognition result is returned in JSON format, so we need to parse the JSON return value to extract the text.
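Parsing the return value can be sketched as below. The function name extract_words is illustrative. The 'words_result' and 'words' keys are the ones used later in this article. The 'error_code' and 'error_msg' keys are an assumption about how a failed call is reported, so check them against the official documents.

```python
def extract_words(result):
    """Return the recognized lines from a parsed OCR response."""
    # Assumption: a failed call carries "error_code"/"error_msg"
    # instead of "words_result".
    if "error_code" in result:
        message = result.get("error_msg", result["error_code"])
        raise RuntimeError(f"OCR request failed: {message}")
    return [item["words"] for item in result.get("words_result", [])]


# A hand-written sample in the shape the article relies on.
sample = {"words_result": [{"words": "Hello"}, {"words": "你好"}]}
print("\n".join(extract_words(sample)))
```

With a real call, the argument would be response.json().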
4, Build a windowed program for easy use
The library used for the window is Tkinter. It is part of the Python standard library and is normally included with the Windows installer, so there is nothing to install with pip.
Import the tkinter module to build the window. The program has three functions: screenshot recognition, separating Chinese from English, and automatically copying the recognized text to the clipboard.
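Before building the window, it can help to confirm that tkinter is importable in the current interpreter. This is a small sketch using only the standard library; has_module is an illustrative helper name.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


print("tkinter available:", has_module("tkinter"))
```

If this prints False, the Python installation was probably set up without the Tcl/Tk component, and reinstalling Python with that option enabled is the usual fix.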
from tkinter import *

# Create a window
window = Tk()
# Window name
window.title('qcc-tnw')
# Set window size
window.geometry('400x600')
# Title label at the top of the window
l = Label(window, text='Baidu API call', bg='green', fg='white', font=('Arial', 12), width=30, height=2)
l.pack()
# Text box that receives the recognized text
E1 = Text(window, width='100', height='100')
# Button that runs text recognition.
# window is the parent window, text is the button label, font is the label font,
# width and height set the button size, and command is the function to run.
img_txt = Button(window, text='Character recognition', font=('Arial', 10), width=15, height=1)
# Button that extracts the English text
cut_en = Button(window, text='English segmentation', font=('Arial', 10), width=15, height=1)
# Button that extracts the Chinese text
cut_cn = Button(window, text='Chinese segmentation', font=('Arial', 10), width=15, height=1)
# anchor='nw' places the widget in the northwest, that is, the upper left corner of the window
img_txt.pack(anchor='nw')
cut_en.pack(anchor='nw')
cut_cn.pack(anchor='nw')
# Keep the window on top of all other desktop windows
window.wm_attributes('-topmost', 1)
window.mainloop()
5, Realize the automatic saving of screenshots
From the Baidu interface shown above, we can see that it takes image data in the request and cannot read from the clipboard itself.
Therefore, we use the PIL library to take the screenshot out of the clipboard and save it locally, then call Baidu's interface on the saved file to recognize the text in the picture.
To install PIL, enter pip install Pillow at the terminal (the package is published as Pillow but imported as PIL).
import base64
import requests
from PIL import ImageGrab

# Take the image out of the clipboard and save it locally
image = ImageGrab.grabclipboard()
s = 'xxx.png'
image.save(s)

# Baidu interface call
request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"
with open(s, 'rb') as f:
    img = base64.b64encode(f.read())
params = {"image": img}
access_token = '24.ee0e97cbc00530d449464a563e628b8d.2592000.1640228774.282335-24065278'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/x-www-form-urlencoded'}
response = requests.post(request_url, data=params, headers=headers)
# Print each recognized line
for i in response.json()['words_result']:
    print(i['words'])
Once this is done, take a screenshot with the screenshot function of QQ or WeChat, then run the program.
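If the program runs before a screenshot has been taken, ImageGrab.grabclipboard() returns None, and if files were copied instead of an image it returns a list of file names. In both cases calling save() fails. A minimal guard might look like this sketch; save_clipboard_image is an illustrative helper name.

```python
def save_clipboard_image(grabbed, path):
    """Save the object returned by ImageGrab.grabclipboard() if it is an image.

    Returns True when a file was written, False otherwise.
    """
    # None means the clipboard holds no image; a list means file names
    # were copied. Neither has a save() method.
    if grabbed is None or isinstance(grabbed, list):
        return False
    grabbed.save(path)
    return True
```

In the script above this would be called as save_clipboard_image(ImageGrab.grabclipboard(), 'xxx.png'), and the Baidu request would only be sent when it returns True.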
6, Display the recognized text in the window text box and copy it to the clipboard
import pyperclip

if response:
    for i in response.json()['words_result']:
        # Insert each recognized line into the text box
        E1.insert("insert", i['words'] + '\n')
    E1.pack(side=LEFT)
    # Write the recognized text to the clipboard
    pyperclip.copy(E1.get("1.0", "end"))
7, Extract Chinese (English) from the recognized text
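The check here is deliberately simple. The condition if len(''.join(re.findall(r'[A-Za-z]', i['words']))) < 1: keeps the lines that contain no English letters, that is, the Chinese lines. Change '<' to '>' to keep the English lines instead (use '>=' if a line with a single letter should also count as English).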
import re

# Clear the text box before showing the filtered result
E1.delete('1.0', 'end')
for i in response.json()['words_result']:
    # Keep the line only if it contains no English letters
    if len(''.join(re.findall(r'[A-Za-z]', i['words']))) < 1:
        # Show the filtered text in the text box
        E1.insert("insert", i['words'] + '\n')
E1.pack(side=LEFT)
# Copy to clipboard
pyperclip.copy(E1.get("1.0", "end"))
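The same letter test can be pulled out into a small function that returns both groups at once. This is only a sketch, and split_by_language is an illustrative name. A line that mixes Chinese and English counts as English here, because it contains at least one Latin letter.

```python
import re

LATIN = re.compile(r"[A-Za-z]")


def split_by_language(lines):
    """Split lines into (english, chinese) using the letter test above."""
    english = [line for line in lines if LATIN.search(line)]
    chinese = [line for line in lines if not LATIN.search(line)]
    return english, chinese


english, chinese = split_by_language(["Hello world", "你好", "Python 入门"])
print(english)  # ['Hello world', 'Python 入门']
print(chinese)  # ['你好']
```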
Finally, each method is wrapped in a function (img_all, img_en and img_cn) and passed to the corresponding window button through the command parameter.
# Button that runs text recognition.
# window is the parent window, text is the button label, font is the label font,
# width and height set the button size, and command is the function to run.
img_txt = Button(window, text='Character recognition', font=('Arial', 10), width=15, height=1, command=img_all)
# Button that extracts the English text
cut_en = Button(window, text='English segmentation', font=('Arial', 10), width=15, height=1, command=img_en)
# Button that extracts the Chinese text
cut_cn = Button(window, text='Chinese segmentation', font=('Arial', 10), width=15, height=1, command=img_cn)
# anchor='nw' places the widget in the northwest, that is, the upper left corner of the window
img_txt.pack(anchor='nw')
cut_en.pack(anchor='nw')
cut_cn.pack(anchor='nw')
window.wm_attributes('-topmost', 1)
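The bodies of img_all, img_en and img_cn are not shown in this article. One possible way to build them, as a sketch only, is a small factory that returns a zero-argument callback for command. The names show_lines, make_handler, has_english and the recognize argument are all hypothetical. recognize stands for a function that saves the screenshot, calls the Baidu interface and returns the recognized lines as a list of strings.

```python
import re


def has_english(line):
    """True if the line contains at least one Latin letter."""
    return re.search(r"[A-Za-z]", line) is not None


def show_lines(text_box, lines, copy):
    """Replace the text box content with lines and copy them to the clipboard."""
    text_box.delete("1.0", "end")
    for line in lines:
        text_box.insert("insert", line + "\n")
    copy("\n".join(lines))


def make_handler(recognize, text_box, copy, keep=lambda line: True):
    """Build a zero-argument callback suitable for Button(command=...)."""
    def handler():
        lines = [line for line in recognize() if keep(line)]
        show_lines(text_box, lines, copy)
    return handler
```

With this, img_all = make_handler(recognize, E1, pyperclip.copy), img_en adds keep=has_english, and img_cn adds keep=lambda line: not has_english(line). Unlike the snippets above, this version copies the joined lines rather than E1.get("1.0", "end"), so the clipboard text has no trailing newline.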
That covers the details of implementing screenshot text recognition in Python. If it was helpful to you, remember to leave a comment for Dafei.