With this, Python programmers no longer need to worry about missing out on the "idiom solitaire red envelope"

Idiom solitaire is a traditional Chinese word game with a long history and broad popularity, a folk pastime suitable for all ages. It is often played as an icebreaker at parties. QQ also has an idiom solitaire red envelope, and sometimes we can't claim it because we don't know enough idioms.

So have you ever thought about writing an idiom solitaire program? Below I'll implement a small one in Python. Without further ado, let's get started~

Idiom preparation

To play idiom solitaire, we first need enough idioms. That condition isn't met: I don't have any idioms. Oh well, meeting adjourned~

Just kidding. For a Python coder, crawling data is no problem, so having no idioms doesn't matter. I found a website with a large collection of idioms and their explanations, so let's crawl it.

Crawling approach:

Inspecting the site's network requests shows a pattern: every listing page is requested as {A-Z}_{page}.html. For example, with the initial letter A and page 1, the request is for A_1.html.


While the parsed page contains a "next page" link, keep turning pages in a loop. When no "next page" link can be found, request the first page of the next Pinyin initial, and continue in this way until everything has been crawled.

So there are two nested loops: the outer loop runs over the initials A-Z and the inner loop runs over page numbers, and together they fill in {A-Z}_{page}.html for each request. When there is no next page, the inner loop exits and the outer loop moves on to the next Pinyin initial.
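
The two nested loops can be sketched on their own, without any network access. In this minimal sketch, listing_urls and fake_has_next are hypothetical names, and the "next page" check is replaced by a stand-in that pretends A has two pages and B has one:

```python
def listing_urls(has_next_page, initials, template="{}_{}.html"):
    """Yield listing-page URLs: outer loop over initials, inner loop over pages."""
    for initial in initials:
        page = 1
        while True:
            url = template.format(initial, page)
            yield url
            if not has_next_page(url):
                break  # no "next page": move on to the next initial
            page += 1


# Stand-in for the real check (assumption): A has 2 pages, B has 1
fake_page_counts = {"A": 2, "B": 1}


def fake_has_next(url):
    initial, page = url[:-len(".html")].split("_")
    return int(page) < fake_page_counts[initial]


print(list(listing_urls(fake_has_next, ["A", "B"])))
# ['A_1.html', 'A_2.html', 'B_1.html']
```

In a real crawl, has_next_page would download the page and look for the "next page" link instead of consulting a dictionary.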



Each idiom on a listing page is a link to its detail page. Requesting that link returns the idiom's explanation and other details, which is exactly what we need to crawl.



The code is as follows:

import requests
from bs4 import BeautifulSoup


class Idiom:
    def __init__(self):
        self.num = 0
        self.url = '{}_{}.html'  # listing page: {initial}_{page}.html
        self.url_info = '{}'  # detail page, filled in with each link's href
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/81.0.4044.43 Safari/537.36',
            'Referer': ''
        }
        self.all_idiom = {}
        # Pinyin initials to crawl (the list has no I, U or V)
        self.pinyin_initials = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R',
                                'S', 'T', 'W', 'X', 'Y', 'Z']

    def idiom_spider(self):
        """Crawl all idioms and return them as a list of dicts."""
        idiom_list = []
        for initial in self.pinyin_initials:
            page = 1
            while True:
                url = self.url.format(initial, page)
                start_html = requests.get(url, headers=self.headers)
                start_html.encoding = 'gb18030'
                soup = BeautifulSoup(start_html.text, "html.parser")
                # Find the div with class=listw (idiom links) and the div with class=a2 (pagination)
                listw = soup.find('div', class_='listw')
                a2 = soup.find("div", class_="a2")
                # Find all a tags inside each div
                lista = listw.find_all('a')
                lastpage = a2.find_all('a')
                for p in lista:
                    print("terms", p.text)
                    info_url = self.url_info.format(p["href"])
                    print("infourl", info_url)
                    info_html = requests.get(info_url, headers=self.headers)
                    info_html.encoding = 'gb18030'
                    info_soup = BeautifulSoup(info_html.text, "html.parser")
                    # Find all td tags on the detail page
                    td_list = info_soup.find_all('td')
                    # Interpretation of the idiom
                    print("meaning:", td_list[5].text)
                    new_idiom = {"idiom": p.text, "paraphrase": td_list[5].text, "first_pinxin": initial}
                    idiom_list.append(new_idiom)
                # If the last link is not a "next page" link, stop paging for this initial
                if not lastpage or str(lastpage[-1]).find("next page") == -1:
                    break
                page += 1
        return idiom_list


idiom = Idiom()
idiom_list = idiom.idiom_spider()

Screenshot of the crawl in progress:

You can save the idioms in whatever format suits your needs. I saved them all in a sqlite3 database: Python ships with the sqlite3 module, so it can be used directly without installing anything. In total, 30880 idioms were crawled, which may not be complete, but it is enough.
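
Saving to sqlite3 could look something like the sketch below. The table name idiom and the columns id, idiom, paraphrase and first_pinxin match the queries used later in this article. The function name save_idioms and the exact column types are assumptions made for illustration:

```python
import sqlite3


def save_idioms(conn, idiom_list):
    """Create the idiom table if needed and insert the crawled rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS idiom ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "idiom TEXT, paraphrase TEXT, first_pinxin TEXT)"
    )
    # Named placeholders read the values straight from each dict
    conn.executemany(
        "INSERT INTO idiom (idiom, paraphrase, first_pinxin) "
        "VALUES (:idiom, :paraphrase, :first_pinxin)",
        idiom_list,
    )
    conn.commit()


# Demo on an in-memory database; a file such as 'meta.db' works the same way
conn = sqlite3.connect(":memory:")
save_idioms(conn, [{"idiom": "一心一意", "paraphrase": "wholeheartedly", "first_pinxin": "Y"}])
print(conn.execute("SELECT id, idiom, first_pinxin FROM idiom").fetchall())
```

Each dict has the same keys as the new_idiom dict built by the crawler, so the crawled list can be passed in directly.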



Idiom Solitaire program

You may have already guessed how it works: take the idioms just crawled and compare the end of the given idiom with the beginning of each candidate. If the Pinyin matches, the chain succeeds.
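
The matching rule itself is small enough to sketch on plain lists of toneless Pinyin syllables, one per character. The helper name can_follow is hypothetical, and the syllable lists are written out by hand for illustration:

```python
def can_follow(prev_syllables, next_syllables):
    """True if the next idiom starts with the syllable the previous one ends with."""
    return (bool(prev_syllables) and bool(next_syllables)
            and prev_syllables[-1] == next_syllables[0])


# Syllable lists written out by hand (assumption: tones are ignored)
yi_xin_yi_yi = ["yi", "xin", "yi", "yi"]    # 一心一意
yi_qi_feng_fa = ["yi", "qi", "feng", "fa"]  # 意气风发

print(can_follow(yi_xin_yi_yi, yi_qi_feng_fa))  # True: "yi" matches "yi"
print(can_follow(yi_qi_feng_fa, yi_xin_yi_yi))  # False: "fa" does not match "yi"
```

Because tones are dropped, two different characters with the same syllable still count as a match, which keeps the game forgiving.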

To check whether the Pinyin is the same, we can use the third-party library pypinyin, which is installed with pip install pypinyin. The following code gets the Pinyin of the given Chinese characters:

from pypinyin import lazy_pinyin

print(lazy_pinyin("全菜工程师"))

The result is a list: ['quan', 'cai', 'gong', 'cheng', 'shi']. Next, I'll break the idiom solitaire program down into parts:

Judge whether it is an idiom

The logic is simple: look up the given string in the crawled idiom library. If it is there, it is an idiom; otherwise it is not.

The idioms I crawled are stored in the sqlite3 database, so we need to query it. The code for connecting to sqlite3 from Python is factored out into its own function for ease of use:

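import sqlite3


def sqlite_conn():
    """Open a connection to the idiom database."""
    try:
        conn = sqlite3.connect('meta.db')
        return conn
    except Exception as e:
        # Report the failure, then re-raise so callers never receive None
        print("Failed to connect to sqlite3:", e)
        raise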

Then, the code for judging whether it is an idiom is as follows:

def idiom_exist(user_idiom):
    """
    Query whether the specified idiom is in the idiom library
    :param user_idiom: string
    :return: bool
    """
    cursor = sqlite_conn().cursor()
    # Parameterized query: user input is never spliced into the SQL text
    db_res = cursor.execute(
        "SELECT id, idiom, paraphrase, first_pinxin FROM idiom WHERE idiom=?", (user_idiom,))
    for idiom in db_res:
        if idiom[1]:
            return True
    return False



The solitaire step is also simple. Take the idiom entered by the user and get the Pinyin initial of its last character. Query the idiom library for idioms with that initial, keep those whose first character has the same Pinyin as the user's last character, and return one of them at random:

import random

from pypinyin import lazy_pinyin


def solitaire(user_idiom):
    """
    Return idioms and meanings
    :param user_idiom: Idiom entered by user
    :return: Return idioms and meanings
    """
    cursor = sqlite_conn().cursor()
    # If no idiom is given to follow, pick one at random and return it
    if not user_idiom:
        random_num = random.randint(1, 30880)
        random_idiom = cursor.execute(
            "SELECT id, idiom, paraphrase, first_pinxin FROM idiom WHERE id=?", (random_num,))
        for idiom in random_idiom:
            return idiom[1], idiom[2]
        return None, None  # no row with that id

    last_pinyin = lazy_pinyin(user_idiom)[-1]  # Pinyin of the player's last character
    player = last_pinyin[0].upper()  # Its initial, used to narrow down the query
    db_idiom = cursor.execute(
        "SELECT id, idiom, paraphrase, first_pinxin FROM idiom WHERE first_pinxin=?", (player,))
    choice_idiom = []  # Candidate idioms
    for idiom in db_idiom:
        if last_pinyin == lazy_pinyin(idiom[1])[0]:
            choice_idiom.append([idiom[1], idiom[2]])
    if not choice_idiom:
        return None, None
    # Choose once so the idiom and its meaning come from the same entry
    chosen = random.choice(choice_idiom)
    return chosen[0], chosen[1]
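
The random pick with randint(1, 30880) assumes the ids run from 1 to 30880 without gaps. If that is not guaranteed, one alternative (not what the code above does) is to let SQLite choose the row with ORDER BY RANDOM() LIMIT 1. A minimal sketch, where random_idiom is a hypothetical helper and the demo table is a throwaway in-memory one:

```python
import sqlite3


def random_idiom(conn):
    """Return (idiom, paraphrase) for one randomly chosen row, or (None, None)."""
    row = conn.execute(
        "SELECT idiom, paraphrase FROM idiom ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    return row if row else (None, None)


# Demo table with a gap in the ids (1 and 5), which randint would trip over
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idiom (id INTEGER PRIMARY KEY, idiom TEXT, paraphrase TEXT, first_pinxin TEXT)")
conn.executemany(
    "INSERT INTO idiom VALUES (?, ?, ?, ?)",
    [(1, "一心一意", "wholeheartedly", "Y"), (5, "意气风发", "high-spirited", "Y")],
)
print(random_idiom(conn))
```

ORDER BY RANDOM() has to look at every row, which is acceptable for a table of roughly thirty thousand idioms.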

Judge whether the user Solitaire is correct

The logic is to first check whether the user's input is an idiom and, if so, whether it follows the solitaire rule. This is relatively simple:

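def judge(bot_idiom, user_idiom):
    """Check the solitaire rule only; call idiom_exist(user_idiom) first to confirm it is an idiom."""
    # The user's first syllable must equal the bot's last syllable
    return lazy_pinyin(user_idiom)[0] == lazy_pinyin(bot_idiom)[-1]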

That completes the core code of idiom solitaire. To make it friendlier to use, I also wrote an interactive function. The code is long, so only screenshots are posted here:


The interactive function adds a choice of turn order (you can go first or second), a memory set that checks whether an idiom has already been used, and a rule that the game ends in success after 10 successful turns.
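
Since the interactive code is only available as screenshots, here is a rough sketch of what such a loop might look like. It is not the original function. The name play and its parameters are hypothetical, the bot always moves first, and the checks are passed in as functions so the sketch runs without a database. The demo uses toy ASCII "idioms" that chain on letters instead of Pinyin:

```python
def play(first_idiom, user_moves, is_idiom, follows, reply, rounds_to_win=10):
    """Run one game with the bot moving first; return 'win', 'lose' or 'unfinished'."""
    used = {first_idiom}  # memory set: idioms that have already been played
    bot_idiom = first_idiom
    successes = 0
    for user_idiom in user_moves:
        if (user_idiom in used or not is_idiom(user_idiom)
                or not follows(bot_idiom, user_idiom)):
            return "lose"
        used.add(user_idiom)
        successes += 1
        if successes >= rounds_to_win:
            return "win"
        bot_idiom = reply(user_idiom, used)
        if bot_idiom is None:  # the bot has nothing left to answer with
            return "win"
        used.add(bot_idiom)
    return "unfinished"


# Toy stand-ins (assumptions): two-letter words instead of real idioms
words = {"ab", "bc", "ca", "ad"}
follows = lambda prev, nxt: prev[-1] == nxt[0]
reply = lambda word, used: next(
    (w for w in sorted(words) if w not in used and follows(word, w)), None)

print(play("ab", ["bc", "ab"], words.__contains__, follows, reply, rounds_to_win=2))
```

In the real program, is_idiom would be idiom_exist, follows would be judge, and reply would wrap solitaire while skipping idioms that are already in the memory set.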

The following is a screenshot of the operation:

As you can see, the program may not be perfect, but it does what I expected. I played several games, and because I don't know many idioms, they all ended in defeat. If you're interested, give it a try and see whether you can win. Ha ha ha.


My previous articles were always organized around knowledge points, and I found that writing too many of them turns into a mere summary: not very interesting, and hard to get readers engaged. From now on I'll try to add some fun things to my articles to improve my writing. Come on!!






Tags: Python Back-end Programmer crawler

Posted on Thu, 18 Nov 2021 04:38:43 -0500 by Brenden Frank