How to download your favorite music in bulk using Python


The text and pictures in this article come from the Internet and are for learning and communication purposes only, not for any commercial use. Copyright belongs to the original author. If you have any questions, please contact us promptly so we can handle them.



Music is the spice of life. These days a lot of music can only be streamed, not downloaded. As technicians, how could we settle for that?

Knowledge Points:

  1. requests

  2. regular expressions


Development environment:

  1. Version: Anaconda 5.2.0 (Python 3.6.5)

  2. Editor: PyCharm


Third-party libraries:

  1. requests

  2. parsel

Web Page Analysis

Target site: http://music.taihe.com/search?key=%E9%99%88%E7%B2%92

Analyzing the True Address of Music

Take Chen Li's song "Walking Horse" as an example.

Open the browser's developer tools, select Network -> Media, and refresh the web page to capture the true address of the music.

However, this address cannot be found in the page source, so Baidu Music must be hiding it. There are usually two ways this is done: the first is to assemble or encrypt the request URL with JavaScript, and the second is to hide the data itself. Since we do not know which one applies here, we can only work through the requested data step by step. The analysis shows that the real music address is served by this request: http://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&callback=jQuery17205311797578_15449421991&songid=243093242&from=web&_=15449421336

This API returns JSON data, which maps directly onto Python's dictionary type, so we can extract everything we need with ordinary dictionary lookups.
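As a rough illustration of reading such a response with dictionary rules, the sketch below parses a made-up payload. The key names songinfo, title, bitrate and file_link are the ones the complete code in this article reads; the values and the helper name strip_jsonp are hypothetical. When a callback parameter is sent, the JSON may arrive wrapped in a function call, so the helper removes that wrapper first.

```python
import json
import re


def strip_jsonp(text):
    """Return the JSON part of a JSONP response such as callback({...});"""
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', text, re.S)
    # Plain JSON has no callback wrapper, so it is returned unchanged.
    return match.group(1) if match else text


# Made-up sample payload; only the key names follow the article's code.
sample = ('jQuery123({"songinfo": {"title": "Example Song"}, '
          '"bitrate": {"file_link": "http://example.com/a.mp3"}});')

data = json.loads(strip_jsonp(sample))
title = data['songinfo']['title']
link = data['bitrate']['file_link']
print(title, link)
```

The real response most likely carries many more fields; only the two used later for downloading are read here.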

Constructing the URL for all songs

Before fetching the real music address, let's analyze this URL to find the trick for downloading all the music. A closer look shows that the trailing from and _ parameters do not affect the returned data even when they are omitted.

The songid parameter is the unique id of the song, and the from parameter identifies the platform from which the song comes.

So to download all the songs later, we only need to collect their songids in bulk.
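One way to take the URL apart is the standard library's urllib.parse, as in the minimal sketch below. It assumes the last parameter of the captured URL is named _ (as the paragraph above suggests); the variable names are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs

# Captured request URL; the name of the final "_" parameter is an assumption.
captured = ('http://musicapi.taihe.com/v1/restserver/ting'
            '?method=baidu.ting.song.playAAC&format=jsonp'
            '&callback=jQuery17205311797578_15449421991'
            '&songid=243093242&from=web&_=15449421336')

# parse_qs maps each parameter name to a list of its values.
params = parse_qs(urlsplit(captured).query)
for name, values in params.items():
    print(name, '=', values[0])
```

Listing the parameters this way makes it easier to remove them one at a time and check which ones the server really needs.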
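The idea of one request URL per songid can be sketched as follows. The helper name build_api_url is hypothetical, the second id in the loop is made up, and the parameter set simply mirrors the one used in the complete code of this article.

```python
from urllib.parse import urlencode

API_BASE = 'http://musicapi.taihe.com/v1/restserver/ting'


def build_api_url(songid):
    """Build the playAAC request URL for one songid."""
    query = {
        'method': 'baidu.ting.song.playAAC',
        'format': 'jsonp',
        'songid': songid,
        'from': 'web',
    }
    return f'{API_BASE}?{urlencode(query)}'


# The first id comes from the article; the second is a made-up placeholder.
for sid in ['243093242', '100000001']:
    print(build_api_url(sid))
```

Using urlencode instead of manual string concatenation keeps the query valid if a value ever contains characters that need escaping.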


Get songids in bulk

With the developer tools, you can locate the songids by inspecting the source of the web page. If you analyze the URL of a singer's page, you will find that it can be constructed in the same way.

At this point, the page analysis is complete.
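A minimal sketch of pulling songids out of page source with a regular expression is shown below. The HTML fragment is made up to imitate song links; the pattern is the same one the complete code uses. Removing duplicates is an extra step added here on the assumption that a page may link the same song more than once.

```python
import re

# Made-up HTML fragment imitating the song links on an artist page.
sample_html = '''
<a href="/song/243093242" title="Song A">Song A</a>
<a href="/song/1234567" title="Song B">Song B</a>
<a href="/song/243093242" class="play">play</a>
<a href="/artist/2517">artist</a>
'''

# Capture only the digits that follow /song/ in each link.
found = re.findall(r'href="/song/(\d+)"', sample_html)
# dict.fromkeys keeps first-seen order while dropping duplicates.
unique_ids = list(dict.fromkeys(found))
print(unique_ids)
```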

Achieving results

Complete Code

import re
import requests


def get_songid():
    """Collect the songid of every song linked from the artist page."""
    url = 'http://music.taihe.com/artist/2517'
    response = requests.get(url=url)
    html = response.text
    sids = re.findall(r'href="/song/(\d+)"', html)
    return sids


def get_music_url(songid):
    """Return (title, download link) for a songid, or None if the lookup fails."""
    # The f-string already inserts songid, so no extra .format() call is needed.
    api_url = f'http://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&songid={songid}&from=web'
    response = requests.get(api_url)
    try:
        data = response.json()
        print(data)
        music_name = data['songinfo']['title']
        music_url = data['bitrate']['file_link']
        return music_name, music_url
    except (ValueError, KeyError, TypeError) as e:
        # Invalid JSON or missing fields: report the error and skip this song.
        print(e)
        return None


def download_music(music_name, music_url):
    """Download one song and save it as an .mp3 file."""
    response = requests.get(music_url)
    content = response.content
    save_file(music_name + '.mp3', content)


def save_file(filename, content):
    """Write the binary content to disk."""
    with open(file=filename, mode="wb") as f:
        f.write(content)


if __name__ == "__main__":
    for song_id in get_songid():
        result = get_music_url(song_id)
        if result is None:
            # Skip songs whose download link could not be resolved.
            continue
        music_name, music_url = result
        download_music(music_name, music_url)


Tags: Python network JSON Anaconda

Posted on Sun, 10 Nov 2019 02:42:46 -0500 by neil.johnson