Hello, everyone. My name is Dream. I'm a Python blogger and still a beginner, so please bear with me.
I'm a CSDN rising-star creator in the Python field and a sophomore. I'm happy to collaborate.
A note to start: this paradise is never short of geniuses, and hard work is your final admission ticket!
Finally, may we all shine where no one is watching and make progress together.
"Ten thousand sorrows, and there will still be Dream; I have been waiting for you in the warmest place." That's me in the song! Ha ha ha~
The requests library is similar to the urllib library, but urllib is a little dated, so requests is what people generally use. Let's learn about it.
1, Basic use
1. Documentation
Official documentation
http://cn.python-requests.org/zh_CN/latest/
Quickstart
http://cn.python-requests.org/zh_CN/latest/user/quickstart.html
2. Installation
pip install requests
When the installation succeeds, pip prints a "Successfully installed" message. If requests is already installed, pip prints "Requirement already satisfied" instead.
3. Response type and attributes
1. Type
```python
import requests

url = 'https://www.baidu.com/'
response = requests.get(url=url)

# One type and six attributes
# Response type
print(type(response))
```
<class 'requests.models.Response'>
2. Return the page source as a string

    # Return the page source as a string
    print(response.text)

3. Return the URL

    # Return the URL of the response
    print(response.url)

https://www.baidu.com/
4. Return binary data

    # Return the response body as bytes
    print(response.content)

5. Return the status code of the response

    # Return the status code of the response
    print(response.status_code)

200
6. Return the response headers

    # Return the response headers
    print(response.headers)

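To see how these attributes relate without depending on Baidu being reachable, here is a minimal sketch that queries a throwaway local server. `HelloHandler` and `fetch_from_local_server` are hypothetical names made up for this example; only the `requests` attributes come from the list above. It also prints `response.encoding`, which the list does not cover: that is the codec requests uses to turn `content` into `text`.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that always returns one tiny HTML page."""

    def do_GET(self):
        body = '<html><body>你好</body></html>'.encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/html; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def fetch_from_local_server():
    """Start the server on a free port, fetch '/', and return the Response."""
    server = HTTPServer(('127.0.0.1', 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    session = requests.Session()
    session.trust_env = False  # ignore proxy settings from the environment
    try:
        return session.get(f'http://127.0.0.1:{server.server_port}/', timeout=5)
    finally:
        session.close()
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    response = fetch_from_local_server()
    print(type(response))         # the Response type
    print(response.status_code)   # integer status code
    print(response.url)           # the URL that was requested
    print(response.headers)       # dictionary-like response headers
    print(response.content)       # raw bytes
    print(response.encoding)      # codec used to decode content
    print(response.text)          # content decoded to str
```

The point to take away is that `text` is simply `content` decoded with `encoding`, so garbled text usually means the encoding was guessed wrong.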
2, A brief comparison of urllib and requests
1. urllib

    # (1) One type and six methods
    # (2) GET requests
    # (3) POST requests (Baidu Translate)
    # (4) AJAX GET requests
    # (5) AJAX POST requests
    # (6) Logging in to Weibo with a cookie
    # (7) Proxies

2. requests

    # (1) One type and six attributes
    # (2) GET requests
    # (3) POST requests
    # (4) Proxies
    # (5) Cookies and verification codes

3, Using requests
1. GET requests with requests
(1) Requesting Baidu

    import requests

    url = 'https://www.baidu.com/s'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'
    }
    data = {
        'wd': 'Beijing'
    }

    # url: the request resource path
    # params: the query parameters
    # kwargs: additional keyword arguments (a dictionary)
    response = requests.get(url=url, params=data, headers=headers)
    content = response.text
    print(content)

(2) Summary of characteristics
1. Parameters are passed with params
2. Parameters do not need to be urlencoded by hand
3. No customized request object is needed
4. The trailing ? in the request resource path is optional
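The points above can be checked without sending anything over the network. This is a minimal sketch using `requests.Request(...).prepare()`, which builds the request that would be sent; `build_get_url` is a hypothetical helper name, and the Chinese keyword is only an illustration.

```python
from urllib.parse import urlencode

import requests


def build_get_url(url, params):
    """Return the final URL requests would send, without touching the network."""
    prepared = requests.Request('GET', url, params=params).prepare()
    return prepared.url


if __name__ == '__main__':
    params = {'wd': '北京'}
    # requests percent-encodes the parameters for us
    print(build_get_url('https://www.baidu.com/s', params))
    # a trailing ? in the path makes no difference
    print(build_get_url('https://www.baidu.com/s?', params))
    # with urllib we would have to encode and join by hand
    print('https://www.baidu.com/s?' + urlencode(params))
```

All three lines print the same URL, which is why no manual urlencode step is needed with requests.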
2. POST requests with requests
(1) Requesting Baidu Translate

    # -*- coding: utf-8 -*-
    # @Author: it's time. I love brother Xu
    # Ollie, do it!!!
    import json

    import requests

    url = 'https://fanyi.baidu.com/sug'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'
    }
    data = {
        'kw': 'eye'
    }

    # url: the request address
    # data: the request parameters
    # kwargs: additional keyword arguments (a dictionary)
    response = requests.post(url=url, data=data, headers=headers)
    content = response.text
    print(content)

    # json.loads() no longer accepts an encoding argument (removed in Python 3.9)
    obj = json.loads(content)
    print(obj)

(2) Summary of characteristics
1. POST requests need no manual encoding or decoding
2. The parameter of a POST request is data
3. No customized request object is needed
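As a rough illustration of the summary above, the sketch below prepares a POST request offline to show what requests does with `data`, then decodes a JSON string. `build_post_body` and `decode_json_text` are hypothetical helper names, and the sample JSON is invented for the example; it is not a recorded Baidu response.

```python
import json

import requests


def build_post_body(url, data):
    """Return the form-encoded body and Content-Type requests would send."""
    prepared = requests.Request('POST', url, data=data).prepare()
    return prepared.body, prepared.headers['Content-Type']


def decode_json_text(text):
    """Decode a JSON string; \\uXXXX escapes become readable characters."""
    return json.loads(text)


if __name__ == '__main__':
    body, content_type = build_post_body('https://fanyi.baidu.com/sug', {'kw': 'eye'})
    print(body)
    print(content_type)

    # Invented sample in the style of an escaped JSON reply (an assumption)
    sample = '{"errno": 0, "data": [{"k": "eye", "v": "n. \\u773c\\u775b"}]}'
    print(decode_json_text(sample))
```

When the server really does return JSON, `response.json()` does the same decoding in one call.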
3. Cookie login with requests
(1) Log in to the Gushiwen ancient poetry website
1. Open the ancient poetry website:
Gushiwen (Ancient Poetry Network)
2. Login page:

    import requests

    # Login page
    url = 'https://so.gushiwen.cn/user/login.aspx?from=http://so.gushiwen.cn/user/collect.aspx'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'
    }

3. Get the page source

    # Get the page source
    response = requests.get(url=url, headers=headers)
    content = response.text

4. Parse the page source and extract __VIEWSTATE and __VIEWSTATEGENERATOR

    # Parse the page source, then extract __VIEWSTATE and __VIEWSTATEGENERATOR
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(content, 'lxml')

    # Extract __VIEWSTATE
    viewstate = soup.select('#__VIEWSTATE')[0].attrs.get('value')

    # Extract __VIEWSTATEGENERATOR
    viestategener = soup.select('#__VIEWSTATEGENERATOR')[0].attrs.get('value')

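The extraction step can be tried on a small HTML string before pointing it at the real page. This is a minimal sketch: `SAMPLE_HTML` and its values are invented, `hidden_value` is a hypothetical helper, and it uses the built-in `html.parser` instead of `lxml` so that nothing extra needs installing.

```python
from bs4 import BeautifulSoup

# Invented stand-in for the login page; the real values differ on every load
SAMPLE_HTML = """
<form method="post">
  <input type="hidden" id="__VIEWSTATE" name="__VIEWSTATE" value="sample-viewstate" />
  <input type="hidden" id="__VIEWSTATEGENERATOR" name="__VIEWSTATEGENERATOR" value="sample-generator" />
</form>
"""


def hidden_value(soup, field_id):
    """Return the value attribute of the element with this id, or None if absent."""
    tag = soup.select_one(f'#{field_id}')
    return tag.get('value') if tag is not None else None


if __name__ == '__main__':
    soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
    print(hidden_value(soup, '__VIEWSTATE'))
    print(hidden_value(soup, '__VIEWSTATEGENERATOR'))
    print(hidden_value(soup, 'no_such_field'))
```

Returning None for a missing field avoids the IndexError that `select(...)[0]` raises when the page layout changes.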
5. Get the verification code image

    # Get the URL of the verification code image
    code = soup.select('#imgCode')[0].attrs.get('src')
    code_url = 'https://so.gushiwen.cn' + code

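String concatenation works when `src` is a root-relative path. As an alternative sketch, `urllib.parse.urljoin` from the standard library also copes with relative and absolute values; `absolute_url` is a hypothetical helper and `/RandCode.ashx` is an assumed example path, not necessarily what the site returns.

```python
from urllib.parse import urljoin


def absolute_url(page_url, src):
    """Resolve an image src against the URL of the page it appeared on."""
    return urljoin(page_url, src)


if __name__ == '__main__':
    page = 'https://so.gushiwen.cn/user/login.aspx'
    print(absolute_url(page, '/RandCode.ashx'))                    # root-relative
    print(absolute_url(page, 'RandCode.ashx'))                     # relative to the page
    print(absolute_url(page, 'https://example.com/RandCode.ashx')) # already absolute
```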
6. Save the verification code image locally, then look at it and type the code in.

    # Save the verification code image locally, then look at it and type the code in.
    # requests.session() returns a Session object. Requests made through it
    # share cookies, so the verification code and the login stay in one session.
    session = requests.session()

    # Fetch the verification code image
    response_code = session.get(code_url)

    # Note: use the binary content here, not text
    content_code = response_code.content

    # 'wb' mode writes binary data to a file
    with open('code.jpg', 'wb') as fp:
        fp.write(content_code)

    # No trailing comma here, otherwise code_name becomes a tuple
    code_name = input('Please enter your verification code:')

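Why the session matters can be shown with a toy server that hands out a cookie on one path and checks for it on another. This is only a sketch under stated assumptions: `CaptchaHandler`, the `/code` and `/login` paths, and the `sid` cookie are all invented and do not describe how Gushiwen works internally.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class CaptchaHandler(BaseHTTPRequestHandler):
    """Hypothetical site that ties a verification code to a session cookie."""

    def do_GET(self):
        if self.path == '/code':
            # Issue a session cookie together with the "verification code"
            self._reply('code issued', set_cookie='sid=abc123; Path=/')
        else:
            # Report whether the client sent the session cookie back
            cookie = self.headers.get('Cookie', '')
            self._reply('same session' if 'sid=abc123' in cookie else 'no session')

    def _reply(self, text, set_cookie=None):
        body = text.encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        if set_cookie:
            self.send_header('Set-Cookie', set_cookie)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def new_session():
    session = requests.Session()
    session.trust_env = False  # ignore proxy settings from the environment
    return session


def compare_clients():
    """Return the /login replies seen with and without a shared session."""
    server = HTTPServer(('127.0.0.1', 0), CaptchaHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f'http://127.0.0.1:{server.server_port}'
    try:
        # One session for both steps: the cookie from /code is sent to /login
        shared = new_session()
        shared.get(base + '/code', timeout=5)
        with_session = shared.get(base + '/login', timeout=5).text

        # A fresh session per step, which is what separate requests.get() calls amount to
        new_session().get(base + '/code', timeout=5)
        without_session = new_session().get(base + '/login', timeout=5).text
        return with_session, without_session
    finally:
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    print(compare_clients())
```

If the image were fetched with a plain `requests.get` and the form posted separately, the server would see two unrelated visitors and the typed code would not match.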
7. Submit the login form

    url_post = 'https://so.gushiwen.cn/user/login.aspx?from=http%3a%2f%2fso.gushiwen.cn%2fuser%2fcollect.aspx'
    data_post = {
        '__VIEWSTATE': viewstate,
        '__VIEWSTATEGENERATOR': viestategener,
        'from': 'http://so.gushiwen.cn/user/collect.aspx',
        'email': 'your_account',   # replace with your own account
        'pwd': 'your_password',    # replace with your own password
        'code': code_name,
        'denglu': '登录',           # label of the login button ("Sign in")
    }

    # Post through the same session that fetched the verification code
    response_post = session.post(url=url_post, headers=headers, data=data_post)
    content_post = response_post.text

    with open('gushiwen.html', 'w', encoding='utf-8') as fp:
        fp.write(content_post)

8. Get the dynamic verification code: open code.jpg and type the code at the prompt
9. Open the saved page, gushiwen.html:
It shows the logged-in page:
Success!
(2) Difficulties
1. Hidden fields
2. Verification code
4, Automatic verification code recognition
1. First, go to the Chaojiying website:
A usable account: username: action, password: action
2. Then find Python in the developer documentation:
Open it and download the Python demo.
3. Modify the code
Put the downloaded demo into our project folder and look at its code:
1. Replace the placeholders with our own username and password
2. Follow the prompt to replace the software ID:
3. Generate our own software ID:
4. Finally, add () after print, as Python 3 requires!
5. The call returns a dictionary, so we can look up the recognized verification code by its key, pic_str:
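The lookup in step 5 might be wrapped like this. It is a sketch only: `extract_code` is a hypothetical helper, the sample dictionaries are invented, and the `err_no` field is an assumption about the reply format; the only key confirmed by the demo code is `pic_str`.

```python
def extract_code(result):
    """Return the recognized text from a result dictionary, or None on failure."""
    # Assumption: a non-zero err_no signals an error
    if result.get('err_no', 0) != 0:
        return None
    # An empty or missing pic_str also counts as a failure
    return result.get('pic_str') or None


if __name__ == '__main__':
    ok = {'err_no': 0, 'err_str': 'OK', 'pic_str': '8vka'}    # invented sample
    failed = {'err_no': -1, 'err_str': 'error', 'pic_str': ''}  # invented sample
    print(extract_code(ok))
    print(extract_code(failed))
```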
4. Source code:

    #!/usr/bin/env python
    # coding:utf-8

    import requests
    from hashlib import md5


    class Chaojiying_Client(object):

        def __init__(self, username, password, soft_id):
            self.username = username
            # The API expects the MD5 hex digest of the password
            password = password.encode('utf8')
            self.password = md5(password).hexdigest()
            self.soft_id = soft_id
            self.base_params = {
                'user': self.username,
                'pass2': self.password,
                'softid': self.soft_id,
            }
            self.headers = {
                'Connection': 'Keep-Alive',
                'User-Agent': 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)',
            }

        def PostPic(self, im, codetype):
            """
            im: image bytes
            codetype: verification code type, see http://www.chaojiying.com/price.html
            """
            params = {
                'codetype': codetype,
            }
            params.update(self.base_params)
            files = {'userfile': ('ccc.jpg', im)}
            r = requests.post('http://upload.chaojiying.net/Upload/Processing.php', data=params, files=files, headers=self.headers)
            return r.json()

        def ReportError(self, im_id):
            """
            im_id: ID of the image whose result was wrong
            """
            params = {
                'id': im_id,
            }
            params.update(self.base_params)
            r = requests.post('http://upload.chaojiying.net/Upload/ReportError.php', data=params, headers=self.headers)
            return r.json()


    if __name__ == '__main__':
        # Generate a software ID under User Center >> Software ID and use it in place of 96001
        chaojiying = Chaojiying_Client('action', 'action', '925358')
        # Path of the local image file, in place of a.jpg; on Windows you may need //
        im = open('a.jpg', 'rb').read()
        # 1902 is the verification code type, see the price page on the official site.
        # Python 3.4+ requires parentheses after print.
        print(chaojiying.PostPic(im, 1902).get('pic_str'))

Well, that's all I want to share with you today.
If you liked this post, please like, bookmark, and comment~