Screenshot text recognition with Python: work more efficiently and clock out on time

Hello, I'm Dafei. Today I'm sharing a fun little script implemented in Python.

1, Tool preparation

System: Windows 10

Python version: Python 3.8.6

PyCharm version: PyCharm 2021.1.2 (Professional Edition)

2, Get a Baidu AI Cloud token

Log in to Baidu AI Cloud, then go to Text Recognition -> Management under the Artificial Intelligence section of the console and create a text recognition application.

After creating the application, note down the AppID, API Key and Secret Key shown in the console.

Next, obtain and use the token as described in the official documentation.

# encoding:utf-8
import requests
# client_id is the API Key (AK) and client_secret is the Secret Key (SK), both obtained from the official website
host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=wgEHks0l6MCpalbs3lPuFX1U&client_secret=Z4Rn4ghBx9k06fUYPmSEIRbCFvWFxLyQ'
response = requests.get(host)
if response:
    print(response.json()['access_token'])
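
Hard-coding the keys in the URL makes them easy to leak when the script is shared. As a minimal sketch, the same URL can be assembled from environment variables instead. The helper names and the variable names BAIDU_API_KEY and BAIDU_SECRET_KEY are assumptions made up for this illustration, not part of Baidu's interface.

```python
import os
from urllib.parse import urlencode

TOKEN_ENDPOINT = 'https://aip.baidubce.com/oauth/2.0/token'


def build_token_url(api_key, secret_key):
    """Build the token URL from the AK and SK instead of hard-coding it."""
    query = urlencode({
        'grant_type': 'client_credentials',
        'client_id': api_key,         # AK from the console
        'client_secret': secret_key,  # SK from the console
    })
    return TOKEN_ENDPOINT + '?' + query


def keys_from_env(environ=os.environ):
    """Read the keys from two hypothetical environment variables."""
    return environ['BAIDU_API_KEY'], environ['BAIDU_SECRET_KEY']
```

With this in place, host = build_token_url(*keys_from_env()) could stand in for the hard-coded host line, and the requests.get(host) call stays the same.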

3, Call the Baidu interface

Use the token obtained above to call the Baidu interface, which recognizes the picture and extracts the text.

# encoding:utf-8

import requests
import base64
'''
General text recognition (high-precision version)
'''
request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"
# Open the picture file in binary mode and base64-encode its contents
with open('picture.png', 'rb') as f:
    img = base64.b64encode(f.read())
params = {"image":img}
# Use the token obtained in the previous step
access_token = '24.0d99efe8a0454ffd8d620b632c58cccc.2592000.1639986425.282335-24065278'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/x-www-form-urlencoded'}
response = requests.post(request_url, data=params, headers=headers)
if response:
    print(response.json())

Like the token response, the recognition result is returned as JSON.

So to get at the text, we need to parse the returned JSON and pull the recognized words out of it.
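
As a rough sketch of that parsing step: the function below pulls the text lines out of an already parsed response. The 'words_result' and 'words' keys are the ones used later in this post; the function name, the hand-written sample data, and the assumption that a failed call carries 'error_code' and 'error_msg' fields are mine and should be checked against the official documentation.

```python
def extract_lines(result):
    """Return the recognized lines from a parsed OCR response (a dict)."""
    # Assumption: a failed call carries 'error_code' / 'error_msg'
    # instead of 'words_result'.
    if 'error_code' in result:
        raise RuntimeError(
            'OCR failed: {} {}'.format(result['error_code'], result.get('error_msg', ''))
        )
    return [item['words'] for item in result.get('words_result', [])]


# Hand-written sample in the same shape as the response, for illustration only
sample = {
    'words_result': [{'words': 'Hello'}, {'words': '你好'}],
    'words_result_num': 2,
}
print('\n'.join(extract_lines(sample)))
```

In the real script, extract_lines(response.json()) would take the place of the sample dictionary.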

4, Build a windowed program for easy use

The library used to build the window is Tkinter. It is part of Python's standard library and is included with the standard Windows installer, so it does not need to be installed with pip.

Import the tkinter module to build the window. Its features are screenshot recognition, separating Chinese from English, and automatically copying the text to the clipboard after recognition.

from tkinter import *
# create a window
window = Tk()
# Window name
window.title('qcc-tnw')
# Set window size
window.geometry('400x600')
# Label shown at the top of the window
l=Label(window,text='Baidu API call', bg='green', fg='white', font=('Arial', 12), width=30, height=2)
l.pack()
# Text box that receives the recognized text
E1 = Text(window,width='100',height='100')
# Button that runs text recognition. Arguments: window is the parent window, text is the button label, font is the label font, width and height set the button size; command (added later) is the function to run on click
img_txt = Button(window, text='Character recognition', font=('Arial', 10), width=15, height=1)
# Button that keeps only the English text
cut_en = Button(window, text='English segmentation', font=('Arial', 10), width=15, height=1)
# Button that keeps only the Chinese text
cut_cn = Button(window, text='Chinese segmentation', font=('Arial', 10), width=15, height=1)
# anchor='nw' places the widget at the north-west, i.e. the upper left corner of the window
img_txt.pack(anchor='nw')
cut_en.pack(anchor='nw')
cut_cn.pack(anchor='nw')
# Make the built window always appear on the top of the desktop
window.wm_attributes('-topmost',1)
window.mainloop()

5, Save screenshots automatically

From the Baidu interface call above, we can see that the interface cannot take an image directly from the clipboard.

So we use the PIL library to save the screenshot from the clipboard to a local file, then call Baidu's interface to recognize the text in that picture.

Install PIL from the terminal with pip install Pillow (Pillow is the package that provides the PIL module).

import base64

import requests
from PIL import ImageGrab

# Take the image from the clipboard and save it locally
# (grabclipboard() returns None when the clipboard holds no image)
image = ImageGrab.grabclipboard()
s = 'xxx.png'
image.save(s)
# Call the Baidu interface
request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"
with open(s, 'rb') as f:
    img = base64.b64encode(f.read())
params = {"image": img}
access_token = '24.ee0e97cbc00530d449464a563e628b8d.2592000.1640228774.282335-24065278'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/x-www-form-urlencoded'}
response = requests.post(request_url, data=params, headers=headers)
for i in response.json()['words_result']:
    print(i['words'])

Once this is in place, take a screenshot with the screenshot tool in QQ or WeChat, then run the program.
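
One thing to watch for: ImageGrab.grabclipboard() does not always return an image. It returns None when the clipboard holds no image, and it can return a list of file paths when files were copied. Below is a minimal sketch of guarding against those cases; the helper names and return labels are assumptions made up for this illustration.

```python
def describe_clipboard(content):
    """Classify what ImageGrab.grabclipboard() returned."""
    if content is None:
        return 'empty'    # no image data on the clipboard
    if isinstance(content, list):
        return 'files'    # copied files: a list of paths
    return 'image'        # a PIL image that can be saved


def save_clipboard_image(path='xxx.png'):
    """Save the clipboard image to path; return True on success."""
    # Imported here so that describe_clipboard() works without Pillow
    from PIL import ImageGrab

    content = ImageGrab.grabclipboard()
    if describe_clipboard(content) != 'image':
        return False
    content.save(path)
    return True
```

Calling save_clipboard_image() before the Baidu request lets the program skip the request when there is no screenshot on the clipboard, instead of failing on image.save.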

6, Display the recognized text in the window's text box and copy it to the clipboard

import pyperclip

if response:
    for i in response.json()['words_result']:
        # Insert each recognized line into the text box
        E1.insert("insert", i['words'] + '\n')
    # Show the text box once all lines have been inserted
    E1.pack(side=LEFT)
    # Write the recognized text to the clipboard
    pyperclip.copy(E1.get("1.0","end"))

  

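7, Extract Chinese (or English) from the recognized text

The check here is fairly simple. With '<', the condition if len(''.join(re.findall(r'[A-Za-z]', i['words']))) < 1: keeps only the lines that contain no English letters, i.e. the Chinese text; change '<' to '>' to keep the lines that contain English instead (use '>= 1' if a line with a single letter should also count as English).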

import re

# Clear the text box before showing the filtered result
E1.delete('1.0','end')
for i in response.json()['words_result']:
    # Keep the line only if it contains no English letters
    if len(''.join(re.findall(r'[A-Za-z]', i['words'])))<1:
        # Show the filtered text in the text box
        E1.insert("insert", i['words'] + '\n')
        E1.pack(side=LEFT)
# Copy the result to the clipboard once the loop has finished
pyperclip.copy(E1.get("1.0", "end"))
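
The same check can be sketched as two small functions that work on a plain list of strings, which makes it easy to try without the window or the Baidu interface. The function names and the sample lines are assumptions made up for this illustration.

```python
import re

ENGLISH = re.compile(r'[A-Za-z]')


def chinese_lines(lines):
    """Keep lines with no English letters (the '< 1' check above)."""
    return [line for line in lines if not ENGLISH.search(line)]


def english_lines(lines):
    """Keep lines that contain at least one English letter."""
    return [line for line in lines if ENGLISH.search(line)]


lines = ['Hello world', '你好', 'Python 脚本', '2021']
print(chinese_lines(lines))
print(english_lines(lines))
```

Note that a line containing only digits or punctuation, such as '2021', also passes the "no English letters" test, and a mixed line such as 'Python 脚本' is treated as English. The check works line by line and does not split the text inside a line.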

Finally, wrap each method in a function and pass it to the corresponding window button through the command argument.

# Button that runs text recognition. Arguments: window is the parent window, text is the button label, font is the label font, width and height set the button size, and command is the function to run on click
img_txt = Button(window, text='Character recognition', font=('Arial', 10), width=15, height=1,command=img_all)
# Button that keeps only the English text
cut_en = Button(window, text='English segmentation', font=('Arial', 10), width=15, height=1,command=img_en)
# Button that keeps only the Chinese text
cut_cn = Button(window, text='Chinese segmentation', font=('Arial', 10), width=15, height=1,command=img_cn)
# anchor='nw' places the widget at the north-west, i.e. the upper left corner of the window
img_txt.pack(anchor='nw')
cut_en.pack(anchor='nw')
cut_cn.pack(anchor='nw')
window.wm_attributes('-topmost',1)
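
The bodies of img_all, img_en and img_cn are not listed in this post. One possible way to structure them, as a sketch only, is to build all three from two injected functions. Here make_handlers, fetch_lines and show are hypothetical names, and the stand-ins at the bottom replace the window and the network call so the example runs on its own.

```python
import re


def make_handlers(fetch_lines, show):
    """Build the three button callbacks from two injected functions.

    fetch_lines() returns the recognized lines as a list of strings;
    show(text) displays the text and copies it to the clipboard.
    Both are hypothetical stand-ins for the steps shown earlier.
    """
    has_english = re.compile(r'[A-Za-z]').search

    def img_all():
        show('\n'.join(fetch_lines()))

    def img_en():
        show('\n'.join(line for line in fetch_lines() if has_english(line)))

    def img_cn():
        show('\n'.join(line for line in fetch_lines() if not has_english(line)))

    return img_all, img_en, img_cn


# Stand-ins so the sketch runs without a window or a network call
shown = []
img_all, img_en, img_cn = make_handlers(lambda: ['Hello', '你好'], shown.append)
img_all()
img_en()
img_cn()
print(shown)
```

In the real program, fetch_lines would save the clipboard image and call the Baidu interface, and show would clear E1, insert the text and call pyperclip.copy. The three returned functions are what gets passed to command= on the buttons.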

That covers the details of implementing screenshot recognition in Python. If you found it helpful, remember to leave Dafei a comment.

Tags: Python Back-end

Posted on Thu, 02 Dec 2021 21:19:09 -0500 by phpfanphp