JS Reverse Engineering with Hooks: Eating Hot Pot and Singing Songs, Suddenly Robbed by Bandits!

What is a Hook?

A hook is a system mechanism that Windows provides as a replacement for "interrupts" under DOS. The concept is very common in Windows desktop software development, especially in event-driven mechanisms. Once a specific system event has been hooked, the program that hooked it is notified by the system whenever the event occurs, and can respond to the event immediately. In a program it may be easier to think of a hook as "hijacking": with hook techniques we can hijack an object, pull out one of its routines and replace it with our own rewritten code, or modify parameters or replace the return value, and so control how it interacts with other objects.

Put simply, a hook is a roadside robbery. Ma Bangde was leaving the city with his wife, eating hot pot and singing songs, when he was suddenly robbed by bandits. Zhang Mazi (Pockmarked Zhang) hijacked Ma Bangde's train, passed himself off as the county magistrate, and rode to Echeng with his men to take office. Hooking is exactly this process of Zhang Mazi replacing Ma Bangde.

Hooks in JS reverse engineering

In JavaScript reverse engineering, the process of replacing an original function can be called hooking. The following simple code illustrates the process:

function a() {
  console.log("I'm a.");
}

a = function b() {
  console.log("I'm b.");
};

a()  // I'm b.

Directly overwriting the original function is the simplest approach. The code above reassigns a, so calling a() again prints I'm b. If you still want to be able to run the original function a, store it in an intermediate variable first:

function a() {
  console.log("I'm a.");
}

var c = a;

a = function b() {
  console.log("I'm b.");
};

a()  // I'm b.
c()  // I'm a.

Now calling a() prints I'm b. and calling c() prints I'm a.

Overwriting the original function directly like this is usually only used for temporary debugging and is not very practical, but it helps in understanding how hooking works. In real JS reverse engineering we use more advanced methods, such as Object.defineProperty().
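Building on the saved reference, a hook can also call the original from inside the replacement, which is what makes it possible to inspect or change parameters and return values. The following is only a minimal sketch: sign is a made-up function standing in for some routine on a page, and calls and originalSign are names introduced purely for illustration.

```javascript
// Hypothetical target function, standing in for a routine on a real page.
function sign(text) {
  return 'sig:' + text;
}

var calls = [];           // records every argument the hook observed
var originalSign = sign;  // keep a reference to the original function

sign = function hookedSign(text) {
  calls.push(text);                                   // inspect the parameter
  var result = originalSign.call(this, text.trim());  // pass a modified parameter on
  return result.toUpperCase();                        // replace the return value
};

var output = sign('  abc ');
console.log(output); // SIG:ABC
console.log(calls);  // [ '  abc ' ]
```

Callers still call sign as before; only the hook knows that the original now lives in originalSign.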

Object.defineProperty()

Basic syntax: Object.defineProperty(obj, prop, descriptor). It defines a new property directly on an object, or modifies an existing property. The three parameters are:

obj: the object on which to define the property;

prop: the name of the property to define or modify;

descriptor: the property descriptor, which can contain the following keys:

Key          | Default   | Meaning
get          | undefined | Accessor descriptor: function that returns the value of the property
set          | undefined | Accessor descriptor: function that sets the value of the property
value        | undefined | Data descriptor: the value of the property
writable     | false     | Data descriptor: whether the value of the property can be overwritten
enumerable   | false     | Whether the property can be enumerated
configurable | false     | Whether the property can be deleted or its descriptor modified again

Normally, properties on an object are defined and assigned like this:

var people = {}
people.name = "Bob"
people["age"] = "18"

console.log(people)
// { name: 'Bob', age: '18' }

Use the Object.defineProperty() method:

var people = {}

Object.defineProperty(people, 'name', {
   value: 'Bob',
   writable: true  // Can it be overridden
})

console.log(people.name)  // 'Bob'

people.name = "Tom"
console.log(people.name)  // 'Tom'
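The default values matter in practice: a property defined with only value is neither writable nor enumerable. The sketch below checks this with Object.getOwnPropertyDescriptor(); tryAssign is a helper name invented for this example, and it uses strict mode so that a rejected assignment throws instead of being silently ignored.

```javascript
// Hypothetical helper: attempt an assignment in strict mode and
// return the error it throws, or null if it succeeded.
function tryAssign(obj, key, value) {
  'use strict';
  try {
    obj[key] = value;
    return null;
  } catch (err) {
    return err;
  }
}

var person = {};
Object.defineProperty(person, 'name', { value: 'Bob' }); // no writable, enumerable, configurable

var descriptor = Object.getOwnPropertyDescriptor(person, 'name');
console.log(descriptor.writable, descriptor.enumerable, descriptor.configurable);

var assignmentError = tryAssign(person, 'name', 'Tom');
console.log(assignmentError instanceof TypeError); // true
console.log(person.name);                          // Bob
console.log(Object.keys(person).length);           // 0, because name is not enumerable
```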

In hooking, the accessor descriptors get and set are used most often.

get: the getter function of the property, or undefined if there is no getter. It is called when the property is accessed. No arguments are passed, but this is bound to the object through which the property was accessed (because of inheritance, this is not necessarily the object on which the property was defined). Its return value is used as the value of the property.

set: the setter function of the property, or undefined if there is no setter. It is called when the property is assigned. It receives one argument, the new value being assigned, and this is bound to the object through which the property was assigned.

Use an example to demonstrate:

var people = {
  name: 'Bob',
};
var count = 18;

// Define age: reading it returns the variable count, writing it stores val + 1 in count
Object.defineProperty(people, 'age', {
  get: function () {
    console.log('Get value!');
    return count;
  },
  set: function (val) {
    console.log('Set value!');
    count = val + 1;
  },
});

console.log(people.age);
people.age = 20;
console.log(people.age);

Output:

Get value!
18
Set value!
Get value!
21

In this way we can add code that runs whenever a value is set, such as a debugger; statement, so that execution pauses there, and then use the call stack to trace back to where the parameter is encrypted or generated. Note that when the site loads, our hook code must run first, before the site's own code; otherwise the breakpoint will not be hit. Getting the hook code to run first is called hook injection. Several mainstream injection methods are introduced below.
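The same idea can be wrapped in a small reusable helper. This is a sketch under stated assumptions rather than a standard API: hookProperty, params, and generateToken are all hypothetical names, and a stack trace captured with new Error().stack stands in for the debugger; statement so the example can run outside the browser.

```javascript
// Hypothetical helper: call onSet(value, stack) every time obj[prop] is assigned.
function hookProperty(obj, prop, onSet) {
  var current = obj[prop]; // preserve whatever value is already there
  Object.defineProperty(obj, prop, {
    configurable: true,
    enumerable: true,
    get: function () {
      return current;
    },
    set: function (val) {
      onSet(val, new Error('hook trace').stack); // who assigned it?
      current = val;
    },
  });
}

// Made-up object and function standing in for a site's own code.
var params = { token: '' };
function generateToken() {
  params.token = 'abc123';
}

var captured = [];
hookProperty(params, 'token', function (val, stack) {
  captured.push({ value: val, stack: stack });
  // debugger;  // in the browser, pause here and read the Call Stack panel
});

generateToken();
console.log(captured[0].value); // abc123
console.log(captured[0].stack); // a stack trace leading back to the assignment
```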

Several methods of Hook injection

The following uses the __dfp value in the cookie of a certain Qiyi site as an example to demonstrate how to inject a hook.

1. Fiddler plug-in injection

On the home page of a certain Qiyi site, you can see a __dfp value in its cookie.

If a direct search does not turn up where it comes from, we can use a hook to pause execution at the moment the __dfp value is generated. The following self-executing function does this:

(function () {
  'use strict';
  var cookieTemp = '';
  Object.defineProperty(document, 'cookie', {
    set: function (val) {
      if (val.indexOf('__dfp') != -1) {
        debugger;
      }
      console.log('Hook Capture cookie set up->', val);
      cookieTemp = val;
      return val;
    },
    get: function () {
      return cookieTemp;
    },
  });
})();

if (val.indexOf('__dfp') != -1) { debugger; } searches the string for the position where __dfp first appears. A result of -1 means the substring does not appear; any other result means it does, in which case the debugger statement pauses execution. Note that it cannot be written as if (val == '__dfp') { debugger; }, because the value passed in val looks like __dfp=xxxxxxxxx, so the comparison would never be true and the breakpoint would never be hit.

With the code written, how is it used? That is, how do we inject the hook? The Fiddler packet-capture tool is recommended here, together with the plug-in from the official account Programming Cat; the plug-in can be obtained by sending the keyword [Fiddler plug-in] to that official account. Its principle can be understood as intercept > process > release: Fiddler is used to replace the response. After Fiddler intercepts the data, it inserts the hook code on the first line of the source; because the hook is a self-executing function, it is bound to run first as soon as the page loads. After installation, turn on packet capture and click to enable hook injection.

After clearing the browser's cookies and re-entering the Qiyi page, you can see that execution pauses successfully, and the console shows the cookie values that were captured. At this point val holds the __dfp value. The Call Stack panel on the right then shows the chain of function calls; follow it back step by step to find where __dfp is first generated.
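The difference between the two conditions is easy to check on a sample string. The cookie string below is made up for illustration; real values will differ.

```javascript
// Made-up example of what a page might assign to document.cookie.
var val = '__dfp=a0b1c2d3; path=/';

var containsDfp = val.indexOf('__dfp') != -1;  // substring search
var equalsDfp = val == '__dfp';                // whole-string comparison
var missing = 'lang=en; path=/'.indexOf('__dfp');

console.log(containsDfp); // true
console.log(equalsDfp);   // false
console.log(missing);     // -1
```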
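The intercept > process > release idea can be sketched in a few lines. This is not Fiddler's or the plug-in's actual API; injectHook, originalHtml, and hookSource are hypothetical names, and the sketch only shows the "process" step of placing the hook ahead of everything else in an intercepted HTML response.

```javascript
// Hypothetical helper: put the hook in a <script> tag at the very top of an
// intercepted HTML response so it runs before any of the page's own scripts.
function injectHook(html, hookSource) {
  return '<script>' + hookSource + '</script>\n' + html;
}

// Made-up response body and a shortened hook, for illustration only.
var originalHtml = '<html><head><script src="/site.js"></script></head></html>';
var hookSource = '(function () { /* hook code goes here */ })();';

var patchedHtml = injectHook(originalHtml, hookSource);
console.log(patchedHtml);
```

Because the hook text comes first in the document, the browser executes it before it reaches the site's own script tags.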

2. TamperMonkey injection

TamperMonkey, commonly known as the "oil monkey" plug-in, is a free browser extension and the most popular userscript manager. It supports many mainstream browsers, including Chrome, Microsoft Edge, Safari, Opera, Firefox, UC Browser, 360 Browser, and QQ Browser, so a script is essentially written once and runs everywhere; in that sense browser-based applications are truly cross-platform. Users can install scripts published by others on platforms such as GreasyFork and OpenUserJS, which offer many powerful features such as video parsing and ad removal.

We again use the Qiyi cookie as an example to demonstrate how to write a TamperMonkey script. First install TamperMonkey from the browser's extension store; the installation process is not covered here. Then click the icon and add a new script, or open the dashboard and click the plus sign, and write the following code:

// ==UserScript==
// @name         Cookie Hook
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  Cookie hook script example
// @author       K brother reptile
// @match        *
// @icon         https://www.kuaidaili.com/img/favicon.ico
// @grant        none
// @run-at       document-start
// ==/UserScript==

(function () {
  'use strict';
  var cookieTemp = '';
  Object.defineProperty(document, 'cookie', {
    set: function (val) {
      if (val.indexOf('__dfp') != -1) {
        debugger;
      }
      console.log('Hook Capture cookie set up->', val);
      cookieTemp = val;
      return val;
    },
    get: function () {
      return cookieTemp;
    },
  });
})();

The self-executing JavaScript function in the body is the same as before. Note that the comment block at the top is meaningful; for all available options, see the official TamperMonkey documentation. Some common and important options are listed below (pay particular attention to @match and @run-at):

Option     | Meaning
@name      | The name of the script
@namespace | Namespace, used to distinguish scripts with the same name. Usually the author's name or a web address
@version   | Script version. Script updates are detected by reading this version number
@description | Describes what the script is for
@author    | The name of the author who wrote the script
@match     | The script runs only on URLs that match, e.g. * matches everything and *.baidu.* matches Baidu. Multiple instances are allowed
@icon      | Icon for the script
@grant     | Specifies the permissions the script needs. With the corresponding permission, the script can call the APIs provided by the TamperMonkey extension to interact with the browser. If set to none, no sandbox is used and the script runs directly in the page environment, where most of the extension's APIs are unavailable. If not specified, TamperMonkey adds several of the most commonly used APIs by default
@require   | If the script depends on other JS libraries, use @require to load them before the script runs
@run-at    | Script injection timing, which is the key to whether the hook works. There are five possible values: document-start: when the page starts loading; document-body: when the body appears; document-end: when or after the DOM has loaded; document-idle: after loading has finished (the default); context-menu: when the script is clicked in the browser context menu. For hooking it is generally set to document-start
@include   | Similar to @match; see the official TamperMonkey documentation

Clear the cookies, enable the TamperMonkey script, and open the Qiyi home page again. Execution pauses successfully here too, and you can likewise follow the call stack to trace the source of the __dfp value.
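Why the injection timing matters can be shown with a small simulation. Everything here is a stand-in: installHook mimics the cookie hook, siteScript mimics the site's own code, and the plain objects play the role of document. Installing the hook first corresponds to document-start; installing it afterwards corresponds to a later setting such as document-idle.

```javascript
// Stand-in for the cookie hook: record every assignment to doc.cookie.
function installHook(doc, captured) {
  var cookieTemp = doc.cookie;
  Object.defineProperty(doc, 'cookie', {
    configurable: true,
    get: function () {
      return cookieTemp;
    },
    set: function (val) {
      captured.push(val);
      cookieTemp = val;
    },
  });
}

// Stand-in for the site's own script, which writes the cookie once.
function siteScript(doc) {
  doc.cookie = '__dfp=made-up-value';
}

// Hook installed before the site script runs: the write is captured.
var earlyCaptured = [];
var earlyDoc = { cookie: '' };
installHook(earlyDoc, earlyCaptured);
siteScript(earlyDoc);

// Hook installed after the site script has run: the write is missed.
var lateCaptured = [];
var lateDoc = { cookie: '' };
siteScript(lateDoc);
installHook(lateDoc, lateCaptured);

console.log(earlyCaptured.length); // 1
console.log(lateCaptured.length);  // 0
```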

3. Browser plug-in injection

The official name for a browser plug-in is a browser extension. An extension can enhance the browser's functionality, and it can also help us hook. Writing one is not complicated. Taking a Chrome extension as an example, all that is required is a manifest.json file in the project, which holds all of the extension's configuration and must be placed in the root directory. Of its keys, manifest_version, name, and version are required. To learn more, refer to Xiao Ming's blog and Google's official documentation. Note that Firefox extensions can only run in Firefox, while Chrome extensions can also run in the domestic browsers that use the WebKit kernel, such as 360 Speed Browser, 360 Security Browser, Sogou Browser, and QQ Browser. We again use the Qiyi cookie to demonstrate how to write a hook extension for Chrome.

Create a new manifest.json file:

{
    "name": "Cookie Hook",          // Extension name
    "version": "1.0",               // Extension version
    "description": "Cookie Hook",   // Extension description
    "manifest_version": 2,          // Manifest version, must be 2 or 3
    "permissions": ["tabs"],        // Requested permissions (a top-level key); tabs means browser tabs
    "content_scripts": [{
        "matches": ["<all_urls>"],  // Match all addresses
        "js": ["cookie_hook.js"],   // Injected script files; multiple files are injected in order
        "all_frames": true,         // Also inject the content script into all frames of the page
        "run_at": "document_start"  // When the code is injected
    }]
}

Create a new cookie_hook.js file:

var hook = function() {
    'use strict';
    var cookieTemp = '';
    Object.defineProperty(document, 'cookie', {
        set: function(val) {
            if (val.indexOf('__dfp') != -1) {
                debugger;
            }
            console.log('Hook Capture cookie set up->', val);
            cookieTemp = val;
            return val;
        },
        get: function() {
            return cookieTemp;
        },
    });
}
var script = document.createElement('script');
script.textContent = '(' + hook + ')()';
(document.head || document.documentElement).appendChild(script);
script.parentNode.removeChild(script);
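The content script above does not install the hook itself. Content scripts run in an isolated world that shares the DOM with the page but not its JavaScript objects, so the hook is turned into text and handed to the page through a script element. The sketch below shows only the text-conversion step: concatenating a function with strings yields its source code. An indirect eval stands in for the script element so the example can run outside the browser, and the hook body is a placeholder.

```javascript
// Placeholder hook body, for illustration only.
var hook = function () {
  return 'hook ran in page context';
};

// String concatenation calls hook.toString(), which yields its source text,
// so this builds the text of a self-executing function.
var source = '(' + hook + ')()';
console.log(source);

// Stand-in for the <script> element: evaluate the text as a fresh program.
var result = (0, eval)(source);
console.log(result); // hook ran in page context
```

Because only the source text crosses over, the hook function cannot rely on variables from the content script's scope; anything it needs has to be defined inside it.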

Put the two files in the same folder, open Chrome's extensions page, turn on developer mode, click "Load unpacked", and select the folder you created.

Open the Qiyi page, clear the cookies, and reload. Execution pauses successfully here as well, and tracing the call stack leads to where the value is generated.

Other Hook code references

Besides Object.defineProperty(), you can also grab the relevant interface directly and rewrite it. Common hook snippets are listed below. Note that these are only the key hook code; how they are injected differs from case to case, so adapt them as needed.

Cookie Hook

A cookie hook is used to locate where a key parameter in the cookie is generated. The following code inserts a breakpoint when the cookie being set contains the __dfp keyword:

(function () {
    'use strict';
    // Look up the native accessors that document inherits for 'cookie'
    var orgSetter = document.__lookupSetter__('cookie');
    var orgGetter = document.__lookupGetter__('cookie');
    document.__defineSetter__('cookie', function (cookie) {
        if (cookie.indexOf('__dfp') != -1) {
            debugger;
        }
        // Forward to the native setter so the cookie is still written
        orgSetter.call(this, cookie);
    });
    document.__defineGetter__('cookie', function () {
        return orgGetter.call(this);
    });
})();
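The __lookupSetter__ and __defineSetter__ family is deprecated, so a variant using Object.getOwnPropertyDescriptor() and Object.defineProperty() may be preferable. The sketch below assumes, as in browsers, that cookie is an accessor on the prototype rather than on the object itself. FakeDocument is a stand-in invented so the example can run outside a browser; in a page, Document.prototype and document would take its place.

```javascript
// Stand-in for the browser: cookie is an accessor on the prototype,
// just as document.cookie is an accessor on Document.prototype.
function FakeDocument() {
  this.jar = [];
}
Object.defineProperty(FakeDocument.prototype, 'cookie', {
  configurable: true,
  get: function () {
    return this.jar.join('; ');
  },
  set: function (val) {
    this.jar.push(val.split(';')[0]); // keep only the name=value part
  },
});

var doc = new FakeDocument();
var hits = [];

// Fetch the original accessor pair, then shadow it on the instance.
var original = Object.getOwnPropertyDescriptor(FakeDocument.prototype, 'cookie');
Object.defineProperty(doc, 'cookie', {
  configurable: true,
  get: function () {
    return original.get.call(this);
  },
  set: function (val) {
    if (val.indexOf('__dfp') != -1) {
      hits.push(val); // in the browser, put `debugger;` here
    }
    original.set.call(this, val); // forward so the cookie is still stored
  },
});

doc.cookie = 'lang=en; path=/';
doc.cookie = '__dfp=made-up-value; path=/';
console.log(doc.cookie);  // lang=en; __dfp=made-up-value
console.log(hits.length); // 1
```

Forwarding to the original accessors keeps reads and writes behaving as the page expects, so the hook only observes and does not change what the site sees.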

Header Hook

A header hook is used to locate where a key parameter in the request headers is generated. The following code inserts a breakpoint when the Authorization header is set:

(function () {
    var org = window.XMLHttpRequest.prototype.setRequestHeader;
    window.XMLHttpRequest.prototype.setRequestHeader = function (key, value) {
        if (key == 'Authorization') {
            debugger;
        }
        return org.apply(this, arguments);
    };
})();
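HTTP header names are case-insensitive, so a site may well call setRequestHeader('authorization', ...) in lower case, which a strict comparison against 'Authorization' would miss. The sketch below normalizes the name before comparing. FakeXHR is a stand-in for XMLHttpRequest so the example can run outside a browser, and the token is made up.

```javascript
// Stand-in for XMLHttpRequest so the sketch runs outside a browser.
function FakeXHR() {
  this.headers = {};
}
FakeXHR.prototype.setRequestHeader = function (key, value) {
  this.headers[key] = value;
};

var seen = [];
var org = FakeXHR.prototype.setRequestHeader;
FakeXHR.prototype.setRequestHeader = function (key, value) {
  // Header names are case-insensitive, so normalize before comparing.
  if (String(key).toLowerCase() === 'authorization') {
    seen.push(value); // in the browser, put `debugger;` here
  }
  return org.apply(this, arguments);
};

var xhr = new FakeXHR();
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.setRequestHeader('authorization', 'Bearer made-up-token');
console.log(seen);        // [ 'Bearer made-up-token' ]
console.log(xhr.headers); // both headers are still set by the original method
```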

URL Hook

A URL hook is used to locate where a key parameter in the request URL is generated. The following code inserts a breakpoint when the request URL contains the login keyword:

(function () {
    var open = window.XMLHttpRequest.prototype.open;
    window.XMLHttpRequest.prototype.open = function (method, url, async) {
        if (url.indexOf("login") != -1) {
            debugger;
        }
        return open.apply(this, arguments);
    };
})();
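The header and URL hooks share one pattern: save the original method, install a wrapper that inspects the arguments, then forward the call. That pattern can be factored into a helper. This is a sketch, not an established API: hookMethod and unhook are hypothetical names, FakeXHR again stands in for XMLHttpRequest, and the URLs are examples.

```javascript
// Hypothetical helper: wrap target[name] and return a function that undoes it.
function hookMethod(target, name, before) {
  var original = target[name];
  target[name] = function () {
    before.apply(this, arguments);          // inspect the call first
    return original.apply(this, arguments); // then forward it unchanged
  };
  return function unhook() {
    target[name] = original;
  };
}

// Stand-in for XMLHttpRequest so the sketch runs outside a browser.
function FakeXHR() {}
FakeXHR.prototype.open = function (method, url) {
  this.opened = method + ' ' + url;
};

var matched = [];
var unhook = hookMethod(FakeXHR.prototype, 'open', function (method, url) {
  // String(url) also covers callers that pass a URL object instead of a string.
  if (String(url).indexOf('login') != -1) {
    matched.push(String(url)); // in the browser, put `debugger;` here
  }
});

var xhr = new FakeXHR();
xhr.open('POST', 'https://example.com/api/login');
xhr.open('GET', 'https://example.com/api/list');
unhook(); // restore the original method
xhr.open('POST', 'https://example.com/api/login');

console.log(matched); // [ 'https://example.com/api/login' ]
```

Returning an unhook function makes it easy to remove the hook once the call site has been found.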

Tags: Python Javascript crawler hook

Posted on Fri, 10 Sep 2021 03:35:29 -0400 by PHPcoder25