Levels 1 and 2 of the Heibanke crawler exercises

The first level

Link: http://www.heibanke.com/lesson/crawler_ex00/

Each page displays a number. Appending that number to the URL leads to the next page, and this continues until the page says you may move on to the next level. The general idea is to use a Python regular expression to extract the number from the page, append it to the URL, and repeat until no number is found.

After manually entering a few numbers, we found that, apart from the first page, where the number xxxxx is entered for the first time, every page shows its number in the form XXXXX.

The following regular expression can be used to extract numbers from web pages

r'number[^\d]*(\d+)[\.<]'
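As a quick offline check of what this pattern captures, here is a minimal sketch. The page fragments are invented for illustration and are not copied from the site; extract_numbers is a hypothetical helper name.

```python
import re

# The pattern from this section: the word "number", any run of non-digits,
# the digits to capture, then either "." or "<".
NUMBER_RE = re.compile(r'number[^\d]*(\d+)[\.<]')

def extract_numbers(html):
    """Return every number the pattern finds in the page text."""
    return NUMBER_RE.findall(html)

# Hypothetical fragments (assumptions, not real page content)
first_page = '<h3>Please append the number 12345.</h3>'
later_page = '<h3>The next number is <b>67890</b></h3>'
last_page = '<h3>Congratulations, you reached the next level</h3>'

print(extract_numbers(first_page))   # ['12345']
print(extract_numbers(later_page))   # ['67890']
print(extract_numbers(last_page))    # []
```

An empty list on the last fragment is what lets a loop detect that the chain has ended.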

Then write the script using the re and requests modules:

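#coding=utf-8

import requests
import re

url = 'http://www.heibanke.com/lesson/crawler_ex00/'

# .text returns the decoded page, so the str pattern below can search it
r = requests.get(url).text
number = re.findall(r'number[^\d]*(\d+)[\.<]', r)

index = 1

# Follow the chain until a page no longer contains a number
while number:
    website = url + number[0]
    r = requests.get(website).text
    number = re.findall(r'number[^\d]*(\d+)[\.<]', r)
    print("%d: %s" % (index, website))
    index += 1
else:
    # Runs once the loop condition becomes false
    print("End")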
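The same loop can be sketched as a function that receives the page fetcher as an argument, which makes it possible to try the logic without touching the network. This is only an illustration: follow_numbers, max_steps, the example.invalid URL, and the fake pages are all assumptions introduced here, not part of the original script.

```python
import re

NUMBER_RE = re.compile(r'number[^\d]*(\d+)[\.<]')

def follow_numbers(base_url, fetch, max_steps=1000):
    """Follow the chain of numbers starting at base_url.

    fetch is any callable that maps a URL to the page text.
    max_steps is a safety cap (an assumption) against endless chains.
    Returns the list of URLs visited after base_url.
    """
    visited = []
    numbers = NUMBER_RE.findall(fetch(base_url))
    while numbers and len(visited) < max_steps:
        url = base_url + numbers[0]
        visited.append(url)
        numbers = NUMBER_RE.findall(fetch(url))
    return visited

# Offline stand-in for the site: a hypothetical three-page chain
BASE = 'http://example.invalid/ex00/'
FAKE_PAGES = {
    BASE: '<h3>Enter the number 111.</h3>',
    BASE + '111': '<h3>The next number is 222.</h3>',
    BASE + '222': '<h3>Done, go to the next level</h3>',
}

visited = follow_numbers(BASE, FAKE_PAGES.__getitem__)
print(visited)
```

Against the real site, fetch could be something like lambda u: requests.get(u, timeout=10).text, assuming the requests module is installed.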

The second level


The general idea is to use the requests module to POST the username and password. Because the password is a number no greater than 30, a for loop can try each candidate in turn.

#coding=utf-8

import requests

url = 'http://www.heibanke.com/lesson/crawler_ex01/'

# Text the page shows when the password is rejected. It must match the
# page's wording exactly, so adjust it if the live page words it differently.
wrongNotify = 'The password you entered is wrong, Please re-enter'

# The password is a number from 1 to 30, so try each one in turn
for index in range(1, 31):
    data = {'username': 'aha', 'password': index}
    html = requests.post(url, data).text
    if wrongNotify in html:
        print("Attempt %s: password %s is wrong" % (index, index))
    else:
        print("Attempt %s: the password is %s" % (index, index))
        break

Alternatively, keep incrementing the number in a while loop until the correct password is found:

#coding=utf-8
import requests

# Text shown when the password is rejected; it must match the page's wording exactly
wrongNotify = 'The password you entered is wrong, Please re-enter'

website = 'http://www.heibanke.com/lesson/crawler_ex01/'

index = 1

while True:
    data = {'username': 'Thare', 'password': index}
    html = requests.post(website, data).text

    # No failure message means this password was accepted
    if wrongNotify not in html:
        print("\nThe password is: %d" % index)
        break

    print("Attempt %d: password %d is wrong" % (index, index))
    index += 1
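Both variants share the same core: submit a candidate and check whether the failure message is absent. A minimal sketch of that core as a reusable function follows. The names find_password and fake_submit are hypothetical, and the fake form that accepts 17 is an invented stand-in so the example runs offline.

```python
def find_password(submit, candidates, wrong_notice):
    """Return the first candidate whose response lacks wrong_notice.

    submit is any callable that takes a password and returns the page text.
    Returns None if every candidate is rejected.
    """
    for candidate in candidates:
        if wrong_notice not in submit(candidate):
            return candidate
    return None

WRONG = 'The password you entered is wrong, Please re-enter'

def fake_submit(password):
    # Hypothetical stand-in for the form: only 17 is accepted
    return 'Congratulations' if password == 17 else WRONG

print(find_password(fake_submit, range(1, 31), WRONG))   # 17
```

Bounding the candidates with range(1, 31) also guarantees termination, which the open-ended while loop does not if the failure message never matches. Against the real form, submit could be something like lambda p: requests.post(website, data={'username': 'Thare', 'password': p}).text, assuming requests is installed.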

Tags: Python

Posted on Tue, 05 May 2020 18:46:13 -0400 by titeroy