Crawling all Honor of Kings skin voice lines with Python

🍅 Foreword

Last time I crawled the voice lines of every hero in Honor of Kings, so this time I went looking for the voice lines of every skin. After some searching I found that, although there is no official release, a netizen has compiled them. I looked through the collection and it is very comprehensive (it even includes the voice lines of Iori Yagami, who never made it out of the test server).

  • Crawling all Honor of Kings hero voice lines and their text with Python:

https://blog.csdn.net/qq_44921056/article/details/119673018

If this article infringes on anyone's rights, please contact me and I will delete it!!!

🍉 Web page analysis

After opening the page, start by analyzing it.

The analysis shows that the audio links are actually stored in a JSON response.

Pick one of the links from that response and open it to verify:

https://aod.cos.tx.xmcdn.com/storages/211c-audiofreehighqps/B8/3D/CKwRIaIE0Xt6AAmdVwDM5vcs.m4a


The audio page loads successfully and the clip plays.

Target found!!!
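As a rough sketch of how the audio links sit inside that JSON, the snippet below walks a hand-made sample shaped like the fields the complete code reads (`data`, `trackDetailInfos`, `trackInfo`, `playPath`, `title`). The sample values and the helper name `extract_tracks` are made up for illustration; the real response contains more fields.

```python
# Hypothetical sample that mimics only the fields used in this article;
# the title and URL are placeholders, not real data.
sample = {
    "data": {
        "trackDetailInfos": [
            {"trackInfo": {"title": "demo voice",
                           "playPath": "https://example.com/demo.m4a"}},
        ]
    }
}


def extract_tracks(text_json):
    """Return (title, playPath) pairs from a parsed response."""
    tracks = []
    for item in text_json["data"]["trackDetailInfos"]:
        info = item["trackInfo"]
        tracks.append((info["title"], info["playPath"]))
    return tracks


print(extract_tracks(sample))
```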

Next, let's see how to crawl the JSON file.

First, look at how the request URL changes from page to page.

Comparing the requests shows that the only parameter we need to change is page.
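To make the pattern concrete, here is a small sketch that builds the request URL for a given page, using the endpoint and parameter values from the complete code in this article. The helper name `build_page_url` is hypothetical, and the `countKeys` and `v` parameters are left out for brevity; whether the server requires them is not verified here.

```python
from urllib.parse import urlencode

BASE_URL = "https://m.ximalaya.com/m-revision/common/album/queryAlbumTrackRecordsByPage"


def build_page_url(page):
    """Build the request URL for one page; only `page` varies."""
    params = {
        "albumId": "41725731",
        "page": str(page),
        "pageSize": "10",
        "asc": "true",
    }
    return BASE_URL + "?" + urlencode(params)


# Print the URLs for the first two pages to compare them
for page in (1, 2):
    print(build_page_url(page))
```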

To crawl everything, we need to know how many pages there are in total.

There are 337 voice clips in total, 10 per page, so 34 pages are needed.
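The page count is just a division rounded up. A minimal sketch of that arithmetic (the function name `page_count` is only illustrative):

```python
import math


def page_count(total_items, page_size):
    """Number of pages needed to cover total_items, rounding up."""
    return math.ceil(total_items / page_size)


pages = page_count(337, 10)
print(pages)  # 34
```

Pages are numbered from 1, so a loop over all of them would use `range(1, pages + 1)`.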

🍋 Complete code

# -*- coding: UTF-8 -*-
"""
# @Time: 2021/9/1 23:52
# @Author: distant star
# @CSDN: https://blog.csdn.net/qq_44921056
"""
import os
import json
import requests
import chardet
from tqdm import tqdm
from fake_useragent import UserAgent

# Random User-Agent generator; path is the author's local fake_useragent data file
ua = UserAgent(verify_ssl=False, path='D:/Pycharm/fake_useragent.json')

# Create the output folder in advance
path_f = "./King skin voice/"
os.makedirs(path_f, exist_ok=True)


# Build request headers with a randomly chosen User-Agent
def random_ua():
    headers = {
        "accept-encoding": "gzip",  # ask for gzip-compressed responses to reduce transfer size
        "user-agent": ua.random
    }
    return headers


# Save voice content to disk
def download(file_name, text, path):  # text holds the raw audio bytes
    file_path = path + file_name
    with open(file_path, 'wb') as f:  # the with block closes the file automatically
        f.write(text)


# Fetch one page of the track list and parse it as JSON
def get_json(page):
    url = 'https://m.ximalaya.com/m-revision/common/album/queryAlbumTrackRecordsByPage?'
    param = {
        'albumId': '41725731',
        'page': '{}'.format(page),
        'pageSize': '10',
        'asc': 'true',
        'countKeys': 'play,comment',  # a single comma-separated value
        'v': '1630511230862'
    }
    res = requests.get(url=url, headers=random_ua(), params=param)
    res.encoding = chardet.detect(res.content)["encoding"]  # Detect the response encoding
    res = res.text

    text_json = json.loads(res)  # Parse the JSON text into a dict
    return text_json


def main():
    print("Start downloading voice content^-^")
    for page in tqdm(range(1, 35)):  # 337 voice clips, 10 per page, so 34 pages in total
        text_json = get_json(page)
        data_s = text_json["data"]["trackDetailInfos"]  # List of track records on this page

        for item in data_s:
            voice_url = item["trackInfo"]["playPath"]  # Voice download address
            voice_name = item["trackInfo"]["title"] + '.mp3'  # Voice file name
            voice = requests.get(url=voice_url, headers=random_ua()).content  # Fetch the audio bytes
            download(voice_name, voice, path_f)  # Save the audio to disk
    print('All voice downloads are complete^-^')


if __name__ == '__main__':
    main()
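One thing to watch: the sample link in the web page analysis ends in `.m4a`, while the script saves every file with a `.mp3` extension, and a title containing characters such as `/` or `?` cannot be used directly as a file name on Windows. Below is a minimal sketch of one way to handle both. `build_file_name` is a hypothetical helper, not part of the original script, and the sample title and URL are placeholders.

```python
import os
import re
from urllib.parse import urlparse


def build_file_name(title, voice_url, default_ext=".mp3"):
    """Build a file name from a track title and its download URL."""
    # Replace characters that are not allowed in Windows file names
    safe_title = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    # Take the extension from the URL path, ignoring any query string
    ext = os.path.splitext(urlparse(voice_url).path)[1]
    # Fall back to the default extension when the URL has none
    return safe_title + (ext or default_ext)


print(build_file_name("demo: voice?", "https://example.com/a/b/clip.m4a"))
```

In `main()`, the result could be passed to `download` in place of `title + '.mp3'`.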

🍇 Results


The full run takes about 2 minutes, so only part of it was recorded.

🍈 Voice package download

After the last article on crawling all hero voice lines, many readers messaged me privately.

This time I have packaged the crawled voice files in advance, so you can download them yourself.

  • Baidu Netdisk:

Link: https://pan.baidu.com/s/191ugG6P1_T-ENif1q4q-_A
Extraction code: 1dn7

If this helped you, remember to give it a like 👍. It is the greatest encouragement to the author 🙇‍♂️.
If you find any shortcomings, point them out in the comments 👇 and I will correct them as soon as I see them.

Author: distant star
CSDN: https://blog.csdn.net/qq_44921056
This article is for exchange and learning only. Reprinting it without the author's permission, or using it for other purposes, is prohibited. Violators will be held liable.

Tags: Python JSON crawler

Posted on Sun, 05 Sep 2021 15:32:35 -0400 by clintonium