After crawling data with Python and analyzing it with FineBI, I found out just how cheap mobile phones on Taobao are

Recently I have been wanting to buy a new mobile phone. Domestic (Chinese) mobile phones are no longer what they used to be: in both sales volume and influence they now carry real weight worldwide. With Huawei in Europe, Xiaomi in India and OnePlus in the United States, domestic brands are taking over the global mobile phone market at a rapid pace. As a loyal user who has always supported domestic phones, and based on what I know about them, I plan to choose among Huawei, Xiaomi, OPPO and VIVO.

To show the real sales volume, prices and other market facts for these four brands with data, and because Python is simple and convenient, I am going to use Python to crawl mobile phone data from Taobao.

For visual analysis, third-party Python libraries such as numpy, pandas and matplotlib can compute and process the data and produce the required reports, but the resulting charts lack dynamic interaction, chart styles and properties are cumbersome to configure, and in-depth multidimensional OLAP analysis is inconvenient. So here I use the FineBI tool directly to analyze and summarize the phone data crawled from Taobao.

Principle introduction


The requirement this time is very simple: see the sales ranking and prices of domestic phones.

At the data layer, Python first fetches the web page data, then parses the crawled data and stores it in a MySQL data warehouse. Data processing, calculation and statistics, and chart visualization and presentation at the application layer are then all done in FineBI.

Operation steps

1. Import the required Python libraries and write the MySQL storage function

First, create a new Python project and import the four packages needed to crawl the web data and write to the MySQL database: pandas, re, requests and pymysql.

import pandas
import re
import requests
import pymysql

def ExecuteSQL(title, price, sales):  # Write one row to the MySQL database
    # Replace the xxxx placeholders with the settings of your own database
    conn = pymysql.connect(host='', port=xxxx, user='xxxx', passwd='xxxx', db='xxxx', charset='utf8')
    cursor = conn.cursor()
    # cursor.execute('CREATE TABLE MOBILE_DATA (brand VARCHAR(100), price DOUBLE, sales INT)')
    # cursor.execute('DROP TABLE MOBILE_DATA')
    # Parameterized insert: the driver escapes the values, so quotes in a title cannot break the SQL
    cursor.execute("INSERT INTO MOBILE_DATA (brand, price, sales) VALUES (%s, %s, %s)", (title, price, sales))
    print('Data inserted successfully!')
    conn.commit()   # Commit the transaction
    cursor.close()  # Release the cursor object
    conn.close()    # Release the database connection
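Building SQL by string formatting breaks as soon as a product title contains a quote, and opening a connection per row is slow. As a minimal sketch (not the code used in this post), the function below batches rows through a parameterized query on any DB-API connection. The name insert_rows and the demo rows are hypothetical, and the demo runs against an in-memory SQLite database as a stand-in for MySQL; with pymysql the placeholder would be %s instead of ?.

```python
import sqlite3

def insert_rows(conn, rows, placeholder="%s"):
    """Insert (brand, price, sales) tuples in one batch with a parameterized query."""
    sql = "INSERT INTO MOBILE_DATA (brand, price, sales) VALUES ({0}, {0}, {0})".format(placeholder)
    cursor = conn.cursor()
    try:
        cursor.executemany(sql, rows)  # the driver escapes every value
        conn.commit()
    except Exception:
        conn.rollback()  # leave the table unchanged if any row fails
        raise
    finally:
        cursor.close()
    return len(rows)

# Demo on an in-memory SQLite database (assumption: stand-in for the MySQL warehouse)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE MOBILE_DATA (brand VARCHAR(100), price DOUBLE, sales INT)")
demo_rows = [("Demo phone A", 1999.0, 120), ("Demo 'quoted' phone", 2999.0, 45)]
inserted = insert_rows(demo_conn, demo_rows, placeholder="?")
stored = demo_conn.execute("SELECT brand, price, sales FROM MOBILE_DATA ORDER BY sales").fetchall()
```

The title containing single quotes is stored unchanged, which is the case the string-formatted version would fail on.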

2. Fetch the web page data

Next, define the request header values that simulate a browser visit, then use requests to fetch the search result pages from Taobao:

for page in range(1, 8):  # 7 pages in total

    url = ''

    header = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',

'cookie':'thw=cn; t=be73ea5ec1ffbeb254d0a3535dd00415; cna=HqWrEpIZeG4CAbYSAEIb6bav; hng=CN%7Czh-CN%7CCNY%7C156; miid=596160490770762658; lgc=%5Cu5815%5Cu843D%5Cu4E4B%5Cu6CEAa; tracknick=%5Cu5815%5Cu843D%5Cu4E4B%5Cu6CEAa; tg=0;; x=e%3D1%26p%3D*%26s%3D0%26c%3D0%26f%3D0%26g%3D0%26t%3D0%26__ll%3D-1%26_ato%3D0; uc3=sg2=VWxidJMT8gLCYBc%2BxP5FJdYe9%2FXfUvq2%2Byf0cFWq90Q%3D&nk2=1RSXayUHM0Sl&id2=UUpkvTJ9k5HsSA%3D%3D&vt3=F8dBzLbVzPYkPml1NZk%3D&lg2=W5iHLLyFOGW7aA%3D%3D; uss=VvioJOfdaT365u5YugXSKrRnG47jUQQG9UQvstfUu5fjcHD0zxGQLEmn; _cc_=VFC%2FuZ9ajQ%3D%3D; mt=ci=67_1; tk_trace=oTRxOWSBNwn9dPy4KVJVbutfzK5InlkjwbWpxHegXyGxPdWTLVRjn23RuZzZtB1ZgD6Khe0jl%2BAoo68rryovRBE2Yp933GccTPwH%2FTbWVnqEfudSt0ozZPG%2BkA1iKeVv2L5C1tkul3c1pEAfoOzBoBsNsJySQJwqIKz2kX83uPP5e4iE9t1ZpHdHZkk218jfUuTKISIEGrGMtBctY%2B2vMCmzCRVhIqleLIl%2BRRQHs4ekW3wNcZhDfwkkQzp9RF7kjYiNbNLTbo2mRCr3Wf97aW%2FfC72uuEf9Tcc6cNT9QCiB0y7NxqzS4M5NvMkxl5KoKbA%2BorLqu5Y9jpCfT31RlA%3D%3D; cookie2=1c16eb46ef00c015dd101f731c258d77; _tb_token_=8de4c4560b63; v=0;;; swfstore=107855; JSESSIONID=ED726367865542B7BA84D801D1C72812; isg=AhcXOlKpAS4SKIXa0x_6AhsZpovNTcSrwSKOp2lEKOZNmDfacSx7DtWyjg59; uc1=cookie14=UoTdf1DFLRnICg%3D%3D',


'path':'/search?q=%E6%89%8B%E6%9C%BA&imgfile=&commend=all&ssid=s5-e&search_type=item&sourceId=tb.index&spm=a21bo.2017.201856-taobao-item.1&ie=utf8&initiative_id=tbindexz_20170306'}  # Header values that simulate a browser visit

    html = requests.request('GET', url, headers=header)  # Send the GET request and keep the response
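The post leaves url empty, so how the page number enters the request is not shown. The sketch below is one plausible way to build a URL per page with the standard library. Everything specific in it is an assumption: BASE_URL is a placeholder domain, the offset parameter name s and the page size of 44 are guesses, and build_page_url is a hypothetical helper. Only the keyword matches the post, since its encoded form is the q= value in the path header.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.com/search"  # placeholder; the real search URL is not given in the post
PAGE_SIZE = 44                           # assumed number of items per result page

def build_page_url(page, keyword="手机"):
    """Return the search URL for a 1-based page number."""
    params = {
        "q": keyword,                   # search keyword ("mobile phone")
        "s": (page - 1) * PAGE_SIZE,    # assumed offset-style paging parameter
    }
    return BASE_URL + "?" + urlencode(params)

# One URL for each of the 7 result pages
page_urls = [build_page_url(page) for page in range(1, 8)]
```

Each URL would then be passed to requests together with the header dictionary; adding a timeout and checking the response status before parsing is a sensible precaution.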

3. HTML tag parsing (Script format)

Now we can look at the result pages for Huawei, Xiaomi, OPPO and VIVO phones. Inspecting the page source with the browser's F12 developer tools shows that Taobao actually stores the product data in Script variables.


Next, we only need to use re to search the whole page for the specified field pattern and store the matches in data:

ren = re.compile('"title":"(.*?)","pic_url":"(.*?)","price":"(.*?)","trace":"(.*?)","month_sales":"(.*?)"')

data = re.findall(ren, html.text)
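To make the result of findall concrete, here is a small sketch that runs the same pattern on a made-up script fragment. The sample HTML, the variable name page_data and the product values are invented for illustration; only the regular expression comes from the post.

```python
import re

# Same pattern as in the post, split over two lines for readability
ITEM_PATTERN = re.compile(
    r'"title":"(.*?)","pic_url":"(.*?)","price":"(.*?)",'
    r'"trace":"(.*?)","month_sales":"(.*?)"'
)

# Invented sample that mimics product data embedded in a <script> block
sample_html = (
    '<script>var page_data = {"items":['
    '{"title":"Demo Phone A","pic_url":"//img.example.com/a.jpg",'
    '"price":"1999.00","trace":"t1","month_sales":"120"},'
    '{"title":"Demo Phone B","pic_url":"//img.example.com/b.jpg",'
    '"price":"2999.00","trace":"t2","month_sales":"45"}'
    ']};</script>'
)

# findall returns one tuple per product, one element per capture group
items = ITEM_PATTERN.findall(sample_html)
for title, pic_url, price, trace, month_sales in items:
    print(title, price, month_sales)
```

Every captured value is a string, which is why the price and sales columns have to be converted before they are written to MySQL. The pattern also depends on the fields appearing in exactly this order, so a change in the page layout would silently yield an empty list.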

4. MySQL data warehousing

After the data is parsed, write the parsed data to the MySQL database:

data2 = pandas.DataFrame(data)  # Convert data to a DataFrame for convenient row access
count = 0  # Counter; initialize it once, before the page loop
for rows in range(data2.shape[0]):  # Loop through all rows in the DataFrame
    ExecuteSQL(data2.values[rows][0], float(data2.values[rows][2]), int(data2.values[rows][4]))
    count = count + 1  # Increment the counter
print('Congratulations, all data has been crawled, a total of %d records!' % count)
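The float and int conversions above raise an exception on the first value that is not a clean number, which would stop the whole crawl. As a hedged sketch, the helper below (to_records is a hypothetical name, and the sample tuples are invented) converts the regex tuples up front and counts malformed rows instead of failing.

```python
def to_records(matches):
    """Turn regex tuples into (title, price, sales) rows, skipping malformed ones."""
    records = []
    skipped = 0
    for match in matches:
        try:
            # Positions follow the pattern: 0 = title, 2 = price, 4 = month_sales
            records.append((match[0], float(match[2]), int(match[4])))
        except (ValueError, IndexError):
            skipped += 1  # non-numeric price/sales or a tuple that is too short
    return records, skipped

# Invented sample: the second row has an unparseable price
sample_matches = [
    ("Demo Phone A", "//img.example.com/a.jpg", "1999.00", "t1", "120"),
    ("Demo Phone B", "//img.example.com/b.jpg", "n/a", "t2", "45"),
]
records, skipped = to_records(sample_matches)
```

The resulting records list has exactly the three columns of the MOBILE_DATA table, so it can be written row by row or in one batch.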

Looping through the 7 pages of Taobao listings for the four brands Huawei, Xiaomi, OPPO and VIVO yields 282 records in total.


5. Verify the stored data

Add an SQL dataset (or add the database table directly) through FineBI's data configuration function to check whether the data just crawled has been successfully stored in MySQL.

As shown in the figure below, Python has lived up to its mission: the Taobao data for Huawei, Xiaomi, OPPO and VIVO that I wanted has been successfully written to my MySQL database.


6. Visual analysis

The analysis covers the following dimensions:

  • Overall sales of the four domestic phone brands
  • Sales ranking of domestic mobile phones

The indicators involved are fairly simple. Most of them can be visualized just by dragging data fields in FineBI.

The animated chart below, which builds a word cloud of sales for the four domestic brands, gives a simple demonstration of the visualization process.

(To aggregate by major phone brand, you can group the brand field directly in FineBI.)
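Because the brand column actually holds the full listing title, grouping by major brand means mapping each title to a brand name. The post does this with FineBI's grouping feature; purely for illustration, the same idea can be sketched in Python. The keyword table and the classify_brand helper are assumptions, not the rules used in FineBI.

```python
# Assumed keyword table: lower-case Latin names plus Chinese names where common
BRAND_KEYWORDS = {
    "Huawei": ("huawei", "华为"),
    "Xiaomi": ("xiaomi", "小米"),
    "OPPO": ("oppo",),
    "VIVO": ("vivo",),
}

def classify_brand(title):
    """Return the first brand whose keyword appears in the listing title."""
    lowered = title.lower()
    for brand, keywords in BRAND_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return brand
    return "Other"  # listings that match none of the four brands

print(classify_brand("Huawei Mate 10"))
print(classify_brand("小米8 全面屏"))
```

Matching on the first keyword found is a simplification; a title that mentions two brands, for example an accessory listing, would be assigned to whichever brand comes first in the table.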


It took me 10 minutes to put together the basic analysis framework and another 30 minutes to polish it with visual elements. The result presents the Taobao sales information for Huawei, Xiaomi, OPPO and VIVO visually: average price and total sales rankings of the four brands, price and sales word clouds for each brand, the top 10 by price and by sales for each brand, a sales distribution map, and so on.


Analysis result

1. On Taobao, the total sales volume of Huawei, Xiaomi, OPPO and VIVO phones is 7.51 million units, with total sales revenue of 14.297 billion yuan. Huawei accounts for 44.40% of total sales, with Taobao sales revenue of 6.184 billion yuan; Xiaomi, VIVO and OPPO account for 28.98%, 17.90% and 8.72% respectively.

2. By average price, VIVO and Huawei rank first and second at 2167 yuan and 2021 yuan, while OPPO and Xiaomi rank third and fourth at 1979 yuan and 1502 yuan. Xiaomi phones are relatively cheap, yet Xiaomi's market share is second only to Huawei's. VIVO and OPPO have long been nicknamed "factory girl phones" by many users. However, with celebrity endorsers such as Li Yifeng, Peng Yuyan and Lu Han and promotion across all kinds of media channels, they have attracted countless young fans and carved out their own territory in the domestic mobile phone market.

3. Now the price statistics for the four brands. The Porsche Design Huawei Mate RS ranks first at 9406 yuan. It is strictly for big spenders, and nothing comes close to that price. It is worth mentioning, though, that the Huawei Mate 10 has dropped to 3033 yuan on Taobao! I remember paying more than 4000 yuan for a Mate 9 the year before last. Phone prices really do change a lot over time. In terms of sales, the Xiaomi 8 is the best seller, with 770,000 units sold on Taobao in total (chart linkage shows its price is 2352 yuan), so it is clearly still popular. But given the Huawei Mate 10's high-end configuration at 3033 yuan, I cannot resist spending the money. That is the one!
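The totals, shares and average prices in points 1 and 2 come from FineBI, and the post does not spell out the formulas. A plausible reading, sketched below with invented rows, is that revenue is price times units sold, share is a brand's revenue over the total, and average price is the plain mean of listing prices. All three definitions and the summarize helper are assumptions.

```python
from collections import defaultdict

def summarize(rows):
    """rows: iterable of (brand, price, sales). Returns per-brand statistics."""
    units = defaultdict(int)
    revenue = defaultdict(float)
    price_sum = defaultdict(float)
    listings = defaultdict(int)
    for brand, price, sales in rows:
        units[brand] += sales
        revenue[brand] += price * sales  # assumed: revenue = price x units sold
        price_sum[brand] += price
        listings[brand] += 1
    total_revenue = sum(revenue.values())
    return {
        brand: {
            "units": units[brand],
            "revenue": revenue[brand],
            "avg_price": price_sum[brand] / listings[brand],  # unweighted mean of listing prices
            "revenue_share": revenue[brand] / total_revenue if total_revenue else 0.0,
        }
        for brand in units
    }

# Invented demo rows, not data from the crawl
demo_rows = [("A", 1000.0, 3), ("A", 2000.0, 1), ("B", 500.0, 10)]
stats = summarize(demo_rows)
```

Whether the average price is weighted by units sold changes the ranking noticeably, so it is worth checking which aggregation the BI tool applies to the price field.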


For web crawling and data capture, Python is second to none. But for data statistics and visual presentation, FineBI, with its simple drag-and-drop operation, is definitely the best choice for beginners getting into data analysis.

With Python handling the fetching, parsing and storage of web page data, combined with FineBI's powerful data visualization and presentation, I fully met my statistics and analysis needs for the four domestic phone brands on Taobao. I recommend it in all good conscience.

The above summarizes some of my experience on the road of data visualization. You are welcome to learn and exchange ideas together.

Tags: Mobile Python MySQL Database

Posted on Wed, 06 Nov 2019 21:50:46 -0500 by venkyphp