Fetching Aladdin Statistics with Python

Background

Our projects are currently mobile-first, and WeChat applets are the first choice. We need to collect and store the access data of each project's applet to support later statistical analysis. Although the Aladdin backend also offers trend analysis, fetching and analyzing the data one applet at a time is painful. Converting the data to SQL and persisting it to a database provides the basis for later analysis and presentation.

Implementation approach

Aladdin's products are split into two lines: the open platform and the statistics platform. At present the open platform has an API with supporting documentation, while the statistics platform API is paid and expensive. Since there is no ready-made free API for the data, let's try scraping the page data with Python, which is well suited to this.

Data retrieval flow
1. First, log in to the Aladdin statistics platform, as shown below

The key metrics we need are "Number of new users", "Number of visitors", "Number of visits", "Opens", "Average length of stay per session", "Jump-out rate", and "Accumulated users". Yesterday's figures are already available on this page, so we can crawl it once a day and store yesterday's data in the database.
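As a rough sketch, the seven metrics can be tied to the JSON field names that aldwx.py reads further down. The pairing is inferred from the names and count of those fields, so treat it as an assumption rather than something Aladdin documents; `METRIC_FIELDS` and `label_metrics` are hypothetical names introduced only for illustration.

```python
# Assumed pairing of dashboard labels with the JSON keys read in aldwx.py.
METRIC_FIELDS = {
    "Number of new users": "new_comer_count",
    "Number of visitors": "visitor_count",
    "Number of visits": "total_page_count",
    "Opens": "open_count",
    "Average length of stay per session": "secondary_avg_stay_time",
    "Jump-out rate": "bounce_rate",
    "Accumulated users": "total_visitor_count",
}


def label_metrics(record):
    """Return a dict keyed by dashboard label; missing fields become None."""
    return {label: record.get(field) for label, field in METRIC_FIELDS.items()}
```

Keeping the pairing in one place makes it easier to adjust if a field turns out to mean something else.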

2. Open the browser developer tools (F12) and find that the data on this page comes from the following request

The sensitive token is hidden here. By default the page displays 20 records per page. So where does the token come from? Presumably it is generated after the user logs in, and with a valid token you can fetch the data.
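The request can be pictured as a form payload that carries the token together with the paging fields. This is a minimal sketch using the field names that aldwx.py sends (`currentpage`, `total`, `token`, `appkey`, `is_demo`); that `total` controls how many records come back is an assumption, and `build_payload` is a hypothetical helper.

```python
def build_payload(token, page=1, total=80):
    """Build the form fields for the applet_homepage request.

    The keys mirror the payload used in aldwx.py; treating `total`
    as the number of records to return is an assumption.
    """
    if not token:
        # Without a token the request cannot succeed, so fail early.
        raise ValueError("a login token is required")
    return {
        "currentpage": str(page),
        "total": str(total),
        "token": token,
        "appkey": "",
        "is_demo": "0",
    }
```

Failing early on a missing token keeps a failed login from turning into a confusing empty response later on.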

The preview tab shows that 50 records were actually returned

Expand the detailed data; the sensitive app_key and app_name values are hidden

Expand the first record to see its detailed statistics. Our goal now is to fetch that information with Python.
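To make the target structure concrete, here is a small sketch that walks a response body and picks out each applet's record for yesterday. The sample JSON is made up and only mimics the shape that aldwx.py relies on (a `data` list whose items carry `app_name` and `countList`); `yesterday_rows` is a hypothetical helper, and taking index 1 as yesterday follows the script rather than any documented contract.

```python
import json

# Made-up sample shaped like the structure aldwx.py reads; values are placeholders.
SAMPLE = """
{"data": [
  {"app_name": "demo-applet",
   "countList": [
     {"day": "today", "visitor_count": 3},
     {"day": "yesterday", "visitor_count": 12}
   ]}
]}
"""


def yesterday_rows(response_text):
    """Return (app_name, yesterday_record) pairs from a response body."""
    data = json.loads(response_text).get("data") or []
    rows = []
    for app in data:
        count_list = app.get("countList") or []
        # Index 1 is taken to be yesterday, as in aldwx.py; skip short lists.
        if len(count_list) > 1:
            rows.append((app.get("app_name"), count_list[1]))
    return rows


rows = yesterday_rows(SAMPLE)
print(rows[0][0], rows[0][1]["visitor_count"])
```

Guarding against a missing `data` key or a short `countList` avoids an exception when an applet has no history yet.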

Login process

Let's look at the login process. Log out in the browser, sign in again (using password login), open the developer tools (F12), and find the actual login request URL as follows

Note the Content-Type and User-Agent headers. The payload carries the sensitive fields username, password, and secretKey.

Whatever the details, requesting this address with Postman does return a token.

The payload of the login request above contains a code field, but testing with Postman shows that it can be omitted. The developer tools (F12) show that the verification code URL is built by concatenation, as shown in the following figure.

If the verification code were mandatory, we could build the URL, download the image, and recognize it with pytesseract first. Accuracy might be a problem, but since the code is not required at the moment, let's ignore it.
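Should the verification code ever become mandatory, the recognition step might look roughly like this. It is a sketch only: it assumes the captcha consists of letters and digits, it takes the image as raw bytes because the captcha URL is not reproduced here, and `clean_captcha_text` and `recognize_captcha` are hypothetical names. It needs Pillow, pytesseract, and the Tesseract binary installed.

```python
import io
import re


def clean_captcha_text(raw):
    """Keep only letters and digits from raw OCR output."""
    return re.sub(r"[^0-9A-Za-z]", "", raw)


def recognize_captcha(image_bytes):
    """OCR a captcha image given as bytes.

    Pillow and pytesseract are imported here so that the helper above
    stays usable even when they are not installed.
    """
    from PIL import Image
    import pytesseract

    # Grayscale conversion is a common first step before OCR.
    image = Image.open(io.BytesIO(image_bytes)).convert("L")
    return clean_captcha_text(pytesseract.image_to_string(image))
```

Because OCR can misread characters, a real login flow would probably need to retry with a fresh captcha when the server rejects the code.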


Code implementation

1. login.py

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import requests
import json

# DingTalk alert function
def dingtalk(content):
    # Replace YOUR_DINGTALK_TOKEN with the access token of your DingTalk robot
    dingtalk_url = 'https://oapi.dingtalk.com/robot/send?access_token=YOUR_DINGTALK_TOKEN'
    dingtalk_header = {"Content-Type": "application/json"}
    dingtalk_payload = {"msgtype": "text", "text": {"content": "%s" % content}}
    requests.post(dingtalk_url, data=json.dumps(dingtalk_payload), headers=dingtalk_header)

# Fetch the secretKey that the login request needs
def get_secretkey():
    token_url = 'http://betaapi.aldwx.com/m/Login_reg/Login/token'
    # The header value must not repeat the "User-Agent:" name itself
    header = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
              "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"}
    req = requests.post(token_url, headers=header).text
    return json.loads(req).get("secretKey")

# Log in and return the token
def get_token(secretkey):
    s = requests.Session()
    login_url = 'https://betaapi.aldwx.com/Main/action/Login_reg/Login/login'
    header = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
              "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"}

    # Replace "User name" and "Password" with the real account credentials
    payload = {"phone": "User name",
               "password": "Password",
               "source": "0",
               "plan": "0",
               "creative": "0",
               "keyword1": "0",
               "secretKey": secretkey}
    req = s.post(login_url, data=payload, headers=header).text
    result = json.loads(req)  # parse the response body once
    if result.get("code") == 200:
        return result.get("data").get("token")
    else:
        dingtalk("Failed to get the Aladdin login token, please check!")
        return None

2. aldwx.py

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import requests
import json
from common.mysql_conn import DBAPI
from conf import settings
from login import get_token,get_secretkey,dingtalk

# Execute a SQL statement against the database
def exec_sql(sql):
    my_conn = DBAPI(settings.Params['host'], settings.Params['user'], settings.Params['password'], int(settings.Params['port']), settings.Params['database'], settings.Params['charset'])
    my_conn.conn_dml(sql)

# Fetch the statistics and build a list of SQL value tuples
def get_data(token):
    header = {"Content-Type": "application/x-www-form-urlencoded",
              "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"}
    url = 'http://betaapi.aldwx.com/upgrade/api/applet_homepage'
    payload = {'currentpage': '1', 'total': '80', 'token': token, 'appkey': '', 'is_demo': '0'}
    try:
        req = requests.post(url, data=payload, headers=header).text
        data = json.loads(req).get("data")
        sql_value = []
        for i in data:
            app_name = i.get('app_name')
            # countList[1] holds yesterday's figures
            yesterday_data = i.get('countList')[1]
            create_time = yesterday_data.get('day')
            new_comer_count = yesterday_data.get('new_comer_count')
            visitor_count = yesterday_data.get('visitor_count')
            open_count = yesterday_data.get('open_count')
            total_page_count = yesterday_data.get('total_page_count')
            secondary_avg_stay_time = yesterday_data.get('secondary_avg_stay_time')
            bounce_rate = yesterday_data.get('bounce_rate')
            total_visitor_count = yesterday_data.get('total_visitor_count')
            value = "(\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")" % (app_name, create_time, new_comer_count, visitor_count, open_count, total_page_count, secondary_avg_stay_time, bounce_rate, total_visitor_count)
            sql_value.append(value)
        # An empty list means no statistics came back
        if sql_value:
            return sql_value
        else:
            dingtalk("Failed to get WeChat applet statistics, token: %s" % token)
    except Exception as e:
        # The message takes exactly one argument, the exception itself
        dingtalk("Failed to get WeChat applet statistics. %s" % e)

if __name__ == '__main__':
    secretkey = get_secretkey()
    token = get_token(secretkey)
    sql = """INSERT INTO operations_db.aldwx_stat (APP_NAME,CREATE_TIME,NEW_COMER_COUNT,VISITOR_COUNT,OPEN_COUNT,TOTAL_PAGE_COUNT,SECONDARY_AVG_STAY_TIME,BOUNCE_RATE,TOTAL_VISITOR_COUNT) VALUES"""
    value = get_data(token)
    # get_data returns None on failure, so only write when rows exist
    if value:
        # Join the per-applet value tuples into one multi-row INSERT
        sql = "%s%s;" % (sql, ",".join(value))
        print(sql)
        exec_sql(sql)
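The script above builds the INSERT statement by string formatting, which breaks if an applet name contains a quote character. As an alternative sketch, not the author's method, the rows could be passed as parameters. The example uses the standard-library sqlite3 module and an in-memory table purely so that it runs anywhere; the column names come from the script, while `insert_rows` and the sample row are made up. MySQL drivers such as PyMySQL use `%s` as the placeholder instead of `?`.

```python
import sqlite3

COLUMNS = ("APP_NAME", "CREATE_TIME", "NEW_COMER_COUNT", "VISITOR_COUNT",
           "OPEN_COUNT", "TOTAL_PAGE_COUNT", "SECONDARY_AVG_STAY_TIME",
           "BOUNCE_RATE", "TOTAL_VISITOR_COUNT")


def insert_rows(conn, rows):
    """Insert row tuples using placeholders instead of string formatting."""
    placeholders = ",".join("?" for _ in COLUMNS)
    sql = "INSERT INTO aldwx_stat (%s) VALUES (%s)" % (",".join(COLUMNS), placeholders)
    with conn:  # commits on success, rolls back on error
        conn.executemany(sql, rows)


# In-memory stand-in for the real table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aldwx_stat (%s)" % ",".join(COLUMNS))
# Made-up row; the quotes in the name would break hand-built SQL.
insert_rows(conn, [('demo "applet"', "2020-05-08", 1, 2, 3, 4, "10", "0.5", 100)])
```

With parameters, the driver handles quoting, so values never need to be escaped by hand.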

3. Stored result


Posted on Sat, 09 May 2020 14:55:27 -0400 by fiddlehead_cons