Analysis of the ape man Science web crawler attack and defense competition, Question 18: jsvmp, insight into opportunities

1, Foreword

I haven't blogged for a long time. Today I finally found the time to solve the last problem, so my study of the ape man Science competition problems has basically come to an end. Most of the problems were very challenging for me, and I learned something new while solving each one. My current level is that even when I have no idea how to approach an encryption scheme, I can usually work through and reproduce the solutions shared by more experienced people. There is still a big gap between me and the experts, but it is real progress compared with where I started. In short, I have reached my goal for the competition problems. Enough chatter, let's get to the point.

2, Analysis process

First, open the home page. The content of this question is not much different from earlier ones: it is all about scraping numbers. However, this question clearly states that it uses jsvmp encryption. I searched online and could not find any good tutorial material. What can be said is that the encryption code runs inside a virtual machine environment, which makes it hard to debug and hard to crack.
Open the developer tools and observe the requests. The request for the first page carries no encryption parameters. Starting from the second page, the URL carries the t and v parameters, and v appears to be an encrypted string.
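The request pattern just described can be sketched roughly as follows. This is only an illustration: BASE_URL is a hypothetical placeholder (the real address is not given in this post), build_url is a helper name introduced for the example, and the t and v values are made up.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; the real endpoint is not given in this post.
BASE_URL = "https://example.invalid/data"


def build_url(page, t=None, v=None):
    """Build the data URL; t and v are attached only when both are supplied."""
    params = {"page": page}
    if t is not None and v is not None:
        params["t"] = t
        params["v"] = v
    # urlencode percent-encodes characters such as +, / and =
    return BASE_URL + "?" + urlencode(params)


first = build_url(1)                             # first page: no t or v
later = build_url(2, t=1635339533, v="abc+/=")   # made-up t and v values
print(first)
print(later)
```

If v is a base64 string, it can contain +, / and =, so it has to be percent-encoded in the query string. requests does this automatically when the values are passed through params.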
Next, look at the source of the home page. It contains a large block of hard-to-read js code, and the encryption parameters are presumably generated there.

After some debugging, it turns out that the response with the numbers comes back after a call ending in 'GET', location.href + 'data?page=' + page, true); is executed. That call actually goes through the obfuscated y__ function in the code, so the encryption parameters should be generated there.

The variable u__ is an array, and the word AES appears in its second element, so AES encryption is being used.
Although we now know which cipher is used, debugging here is really difficult. We cannot follow the call stack as with conventional encryption to see which functions are called before the AES step, because most of the code keeps jumping back into this same function during debugging. Here I borrowed an idea shared by an expert (see the references): hook the AES encryption function and print the relevant information whenever encryption happens.
Once the encryption function is hooked, print the arguments passed to it. The text to be encrypted turns out to be the page number followed by a string with a particular pattern, and the key and iv values are identical.

So how are these parameters generated? Look at the call stack and go up one level, into the y__ function.
Click into it and you land on the line where the hooked encryption function is called.
Even with the encryption location found, it is still not easy to see how the parameters are passed in, so set a log breakpoint here to print the values we want to observe each time the code reaches this line.

Then, when the mouse moves over the home page, the console keeps printing new content.
Combined with the mice object found in the u__ array during earlier debugging, we can conclude that the array holds the mouse movement track. The track is random, so you can pick any captured group and combine it with the requested page number to form the text to be encrypted.

With the text settled, what remains is how the key and iv values are generated. A simple approach is to look through what the log breakpoint prints during debugging, where some suspicious values show up. For example, a string that is clearly a timestamp is printed, and after some processing of that timestamp the final key and iv values are obtained.

It can be seen that the key and iv values are a single string repeated twice, and that string is derived from the timestamp. After checking, it is simply the timestamp converted to hexadecimal with the leading 0x removed. With that, the key and iv values are determined and the encryption point of this problem is found.
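The derivation described above can be sketched in a few lines. The helper name derive_key_iv is introduced here for illustration, and the sketch assumes a timestamp in whole seconds, as in the code in section 3.

```python
import time


def derive_key_iv(timestamp):
    """Return the string used as both key and iv for a given timestamp."""
    hex_part = hex(timestamp)[2:]   # hexadecimal form without the leading "0x"
    return hex_part * 2             # the same string repeated twice


t = int(time.time())                # current time in whole seconds
key = iv = derive_key_iv(t)
print(t, key)

# A fixed timestamp makes the result easy to check by hand.
sample = derive_key_iv(1635339533)
print(sample)                       # 61794d0d61794d0d
```

A present-day timestamp in seconds has 8 hexadecimal digits, so the doubled string is 16 characters long, which matches the 16-byte key and iv length of AES-128.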

At first glance the analysis does not look difficult, but the hard part is coming up with the approach. For example, would you think of hooking the AES function at the start, and then of using log breakpoints to observe the encryption parameters? Neither is obvious if you are not familiar with the debugging tools. If you rely only on the conventional call stack to find the encryption logic, you can tear your hair out and still not figure it out. Of course, this applies to beginners like me; experts may well have a better solution.

3, Code implementation

Once the encryption logic is understood, the code is relatively simple. You can use a Python AES library directly (the Crypto package, as provided by PyCryptodome):

from Crypto.Util.Padding import pad
from Crypto.Cipher import AES
import time
import base64
import requests

# The site address is left blank in this post; fill it in before running.
url = ''

headers = {
    'authority': '',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'sec-ch-ua': '"Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99"',
    'sec-ch-ua-mobile': '?0',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
    'sec-ch-ua-platform': '"Windows"',
    'accept': '*/*',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'cors',
    'sec-fetch-dest': 'empty',
    'referer': '',
    'accept-language': 'zh-CN,zh;q=0.9',
    'cookie': 'Hm_lvt_c99546cf032aaa5a679230de9a95c7db=1635170677,1635258075,1635304521,1635306736; no-alert3=true; Hm_lpvt_c99546cf032aaa5a679230de9a95c7db=1635339533',
}

# Placeholder: paste a track string captured from the log breakpoint here.
# The exact value (including any separator after the page number) is not shown in this post.
mouse_track = ''

for i in range(1, 6):
    # Build the text to encrypt from the page number and a fixed mouse track
    text = str(i) + mouse_track
    byte_text = bytes(text, encoding="utf-8")  # convert to bytes

    # Convert the timestamp to key and iv: hex without "0x", repeated twice
    t = int(time.time())
    key = iv = hex(t)[2:] * 2
    byte_key = bytes(key, encoding="utf-8")
    byte_iv = bytes(iv, encoding="utf-8")

    # Initialize the encryptor and use CBC mode for encryption.
    # PKCS7 padding and base64 output are assumed here, as the imports suggest.
    cryptor = AES.new(byte_key, AES.MODE_CBC, byte_iv)
    enc_result = base64.b64encode(cryptor.encrypt(pad(byte_text, AES.block_size)))

    params = (
        ('page', str(i)),
        ('t', str(t)),
        ('v', str(enc_result, 'utf-8')),
    )

    response = requests.get(url, headers=headers, params=params)
    print(response.json())

The request output results are as follows:

{'status': '1', 'state': 'success', 'data': [{'value': 8944}, {'value': 4564}, {'value': 7199}, {'value': 8411}, {'value': 1811}, {'value': 2058}, {'value': 131}, {'value': 3398}, {'value': 115}, {'value': 3819}]}
{'status': '1', 'state': 'success', 'data': [{'value': 5183}, {'value': 7979}, {'value': 5907}, {'value': 5889}, {'value': 7532}, {'value': 5075}, {'value': 3963}, {'value': 9235}, {'value': 4401}, {'value': 2151}]}
{'status': '1', 'state': 'success', 'data': [{'value': 6036}, {'value': 8475}, {'value': 8476}, {'value': 9654}, {'value': 2602}, {'value': 9780}, {'value': 3552}, {'value': 8938}, {'value': 8192}, {'value': 350}]}
{'status': '1', 'state': 'success', 'data': [{'value': 9945}, {'value': 7245}, {'value': 6307}, {'value': 2850}, {'value': 8055}, {'value': 1846}, {'value': 3398}, {'value': 6228}, {'value': 2306}, {'value': 2003}]}
{'status': '1', 'state': 'success', 'data': [{'value': 8147}, {'value': 220}, {'value': 2509}, {'value': 9079}, {'value': 2371}, {'value': 6805}, {'value': 6272}, {'value': 5763}, {'value': 7494}, {'value': 7466}]}
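To work with the numbers rather than the raw responses, the values can be pulled out of each payload. This is a minimal sketch that assumes the response shape shown above; extract_values is a name introduced for illustration, and the sample payload is a shortened copy of the first response.

```python
def extract_values(payload):
    """Return the integers stored under data[*]['value'] in one response."""
    return [item["value"] for item in payload["data"]]


# Shortened copy of the first response shown above (first three entries only).
page_payload = {
    "status": "1",
    "state": "success",
    "data": [{"value": 8944}, {"value": 4564}, {"value": 7199}],
}

values = extract_values(page_payload)
print(values)
```

In the request loop, the same helper could be applied to response.json() for each page and the resulting lists collected into one.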

4, References

1. Solution to problem 18 of ape Anthropology (jsvmp)
2. vvv big man's jsvm (APE anthropology 18 and new version)

Tags: Javascript Front-end crawler

Posted on Wed, 27 Oct 2021 09:38:40 -0400 by Chezshire