[zihi notes #5] The latest way to download Bilibili videos in 2020
Recently I needed to make videos for Douyin and TikTok, so I went looking for material to practice on. Where does that material come from? The "little broken site", Bilibili, of course! Without further ado, let's begin.
First of all, Bilibili serves a video's picture and its audio track as separate streams. That is why you can keep hearing the sound after turning the screen off, which is the advantage of the separation. The end result of a download is therefore two files: an audio file and a silent video file. That is fine if you only want to cut the footage, but it is a little unfriendly if you just want to watch the result. I have also read articles by more experienced people, and there are ways to merge the two files. I'll talk about that later.
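One common way to put the two files back together is to remux them with ffmpeg. This is a minimal sketch and not part of the script in this post: it assumes the ffmpeg executable is installed and on PATH, and the function names and file names are made up for illustration.

```python
import shutil
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that copies both streams into one file."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", video_path,     # input 0: the silent video file
        "-i", audio_path,     # input 1: the audio file
        "-map", "0:v:0",      # take the first video stream from input 0
        "-map", "1:a:0",      # take the first audio stream from input 1
        "-c", "copy",         # copy the streams as they are, no re-encoding
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)
```

A call might look like merge("clip.mp4", "clip_audio.mp4", "clip_merged.mp4"). Because the streams are only copied, the merge is fast and does not lose quality.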
1, Page analysis
1. First try locating the video element directly. Judging by the address found there, it is probably encrypted
2. Let's try another way to find the files. Here we look at the links that carry the most data
3. These look encrypted too, but I think this is the right direction! Next, copy the link and search for it in the page's HTML
4. It is indeed in the HTML, many times over. I'm touched! It is probably packed in JSON, so paste it into a JSON viewer to have a look
5. When pasting, remove the leading "window.__playinfo__=" and you can see the complete JSON
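Steps 4 and 5 can also be done in code rather than by hand. The sketch below assumes the page embeds the JSON as "window.__playinfo__=" inside a script tag, as the script in this post does; the function name and the sample page are made up for illustration.

```python
import json
import re

# Matches the JSON object assigned to window.__playinfo__ inside a <script> tag.
PLAYINFO_RE = re.compile(r"window\.__playinfo__\s*=\s*(\{.*?\})\s*</script>", re.S)


def extract_playinfo(html):
    """Return the parsed playinfo dict, or None if the page does not contain one."""
    match = PLAYINFO_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


# Made-up miniature page, only to show the shape of the input.
sample_html = (
    '<html><script>window.__playinfo__={"data": {"dash": '
    '{"video": [{"backupUrl": ["https://example.com/v.m4s"]}], '
    '"audio": [{"backupUrl": ["https://example.com/a.m4s"]}]}}}</script></html>'
)
info = extract_playinfo(sample_html)
print(sorted(info["data"]["dash"].keys()))  # ['audio', 'video']
```

Returning None when nothing matches lets the caller report a clear error instead of failing on an empty list.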
OK! Find the "video" and "audio" keys and take a closer look
As you can see, the "video" section is very considerate: it offers several quality levels, so we can choose whichever one we need. Now let's look at the "audio" section
The audio part is here. OK! (Whether a given url still works can only be verified by trying them one by one.)
That's it for the page analysis!
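Trying the urls one by one can be sketched as a small helper. This is only an illustration: the function names are hypothetical, only the "backupUrl" key is taken from the JSON used in this post, and the validity check is passed in as a function so the sketch runs without network access.

```python
def first_working_url(urls, is_valid):
    """Return the first URL accepted by is_valid, or None if none pass."""
    for url in urls:
        try:
            if is_valid(url):
                return url
        except Exception:
            # Treat any error raised by the check as "this URL is not usable".
            continue
    return None


def candidate_urls(playinfo, kind, index=0):
    """List the backupUrl entries of one stream; kind is 'video' or 'audio'."""
    stream = playinfo["data"]["dash"][kind][index]
    return list(stream.get("backupUrl") or [])


# Made-up data in the same shape as the page's JSON.
playinfo = {"data": {"dash": {
    "video": [{"backupUrl": ["https://example.com/dead", "https://example.com/ok"]}],
    "audio": [{"backupUrl": ["https://example.com/audio"]}],
}}}

# Stand-in check; a real one would request the URL and inspect the status code.
chosen = first_working_url(candidate_urls(playinfo, "video"), lambda u: u.endswith("/ok"))
print(chosen)  # https://example.com/ok
```

The index argument selects one of the quality levels listed under "video" or "audio".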
2, Framework thinking
3, Code implementation
# -*- coding: utf-8 -*-
"""
Created on Tue Jun 23 16:23:36 2020

@author: Administrator
"""
import json
import re

import requests

url = input('Please enter the video address: ')

###### Request headers; it later turned out that only User-Agent and Referer matter ######
headers = {
    # 'Host': 'upos-sz-mirrorhw.bilivideo.com',
    # 'Connection': 'keep-alive',
    # 'Access-Control-Allow-Origin': 'https://www.bilibili.com',
    # 'Access-Control-Expose-Headers': 'Content-Length,Content-Range',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3970.5 Safari/537.36',
    # 'Accept': '*/*',
    # 'Sec-Fetch-Site': 'cross-site',
    # 'Sec-Fetch-Mode': 'cors',
    'Referer': 'https://www.bilibili.com/video/BV17J411h7NV?spm_id_from=333.851.b_62696c695f7265706f72745f67756f636875616e67.60',
    # 'Accept-Encoding': 'identity',
    # 'Accept-Language': 'zh-CN,zh;q=0.9',
}

###### Get the download URLs ######
page = requests.get(url, headers=headers)
html = page.text
# Video title, used as the base of the output file names
titles = re.findall(r'class="tit tr-fix">(.*?)</span>', html)
# JSON assigned to window.__playinfo__ in the page source
playinfo_matches = re.findall(r'window\.__playinfo__=(.*?)</script>', html)
playinfo = json.loads(playinfo_matches[0])
video_urls = playinfo['data']['dash']['video'][1]['backupUrl']  # video address
audio_urls = playinfo['data']['dash']['audio'][1]['backupUrl']  # audio address
print('Downloading, please wait a moment')

###### Download video ######
video_res = requests.get(video_urls[0], headers=headers)
print(video_res)
video_file = titles[0] + '.mp4'
# The with block closes the file, so no explicit flush() or close() is needed
with open(video_file, 'wb') as fp:
    fp.write(video_res.content)

###### Download audio ######
audio_res = requests.get(audio_urls[0], headers=headers)
print(audio_res)
audio_file = titles[0] + '_audio' + '.mp4'
with open(audio_file, 'wb') as fp:
    fp.write(audio_res.content)
print('Download successful')
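The script uses the page title directly as the file name, and a title containing characters such as "/", ":" or "?" will make open() fail on Windows. A minimal sketch of a cleanup step follows; the function name, the fallback name and the sample title are hypothetical and not part of the original script.

```python
import re

# Characters that Windows does not allow in file names, plus control characters.
_ILLEGAL = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback="video"):
    """Replace characters that are illegal in file names with underscores."""
    # Windows also rejects names ending in a space or a dot, so trim those.
    cleaned = _ILLEGAL.sub("_", title).strip(" .")
    return cleaned or fallback


print(safe_filename("How to edit: part 1/2?"))  # How to edit_ part 1_2_
```

The cleaned title can then be used in place of the raw title when building the two output file names.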
4, Copyright notice
Finally, while we enjoy the convenience that technology brings, we should also respect it. We must never take other people's copyrighted videos and use them for commercial purposes. Respect the work of the uploaders (UP), and we will all get more enjoyment out of it!
5, Previous articles
[zihi notes #1] Public opinion analysis of the microblog epidemic: crawling part
[zihi notes #2] Public opinion analysis of the microblog epidemic: public opinion analysis part
Analysis of Houlang's comments
[zihi notes #4] Baidu Library paid articles: web page analysis