Hongxing Erke has been showing up on the hot-search lists a lot lately!
Today, let's use a Python crawler to find out how many Hongxing Erke stores there are in China.
Autumn and winter are the season for buying new clothes, so consider giving the brand some support. If you like this article, please give it a like; details of the technical exchange group are at the end of the article.
Requirement analysis
First, we open Baidu Map and search for "Hongxing Erke":
Press F12 to open the browser's developer tools and find the following request link.
Paste the link into the browser and you will see that it returns a data set in JSON format. The provinces and cities we need, along with the store count for each, are all in there.
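To see what the request link carries, it can help to decode its query string. The sketch below uses a shortened copy of the link that keeps only a handful of its parameters (a simplification for readability; the real link has many more) and shows that `wd` holds the URL-encoded search keyword.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the request link: only a few of its parameters are kept.
short_url = (
    "https://map.baidu.com/?newmap=1&qt=s"
    "&wd=%E9%B8%BF%E6%98%9F%E5%B0%94%E5%85%8B&c=1&pn=0&l=5"
)

# parse_qs percent-decodes each value and returns a dict of lists;
# every key appears once here, so we keep the first value of each list.
query = urlsplit(short_url).query
params = {key: values[0] for key, values in parse_qs(query).items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

The `wd` value decodes to 鸿星尔克, the brand's Chinese name, so changing that one parameter is presumably enough to search for a different keyword.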

Send request
We first send a request that imitates the browser to obtain the JSON data set, and then extract the number of Hongxing Erke stores in each city from it.
import requests

# Request link copied from the browser's developer tools
url = 'https://map.baidu.com/?newmap=1&reqflag=pcmap&biz=1&from=webmap&da_par=baidu&pcevaname=pc4.1&qt=s&da_src=searchBox.button&wd=%E9%B8%BF%E6%98%9F%E5%B0%94%E5%85%8B&c=1&src=0&wd2=&pn=0&sug=0&l=5&b=(7854419.220000001,831323.8799999999;15358291.22,8507227.879999999)&from=webmap&biz_forward={%22scaler%22:1,%22styles%22:%22pl%22}&sug_forward=&auth=yER4N%40Rwcw0cBSVCeS%3DdQBAfLdF6agFfuxLzNBVHVHRtxZhQxjh%40wWvvYgP1PcGCgYvjPuVtvYgPMGvgWv%40uVtvYgPPxRYuVtvYgP%40vYZcvWPCuVtvYgP%40ZPcPPuVtvYgPhPPyheuVtcvY1SGpuRtDpnSCE%40%40By1uVtCGYuVt1GgvPUDZYOYIZuVt1cv3uVtGccZcuVtPWv3GuBtR9KxXwPYIUvhgMZSguxzBEHLNRTVtcEWe1GD8zv7u%40ZPuVtc3CuVteuEthjzgjyBODQEYHUHBxfiKKvMuxcc%40AJ&seckey=cde6ebb241c3d75c675c8688828640edba33c570fc006f6ccdee864f2e95d88033fc19e794fee19c2417a6953ba260f3e91efa7e82cbc9c45b5854aec79ce924b08cce22526301f3a8c80710ebb635e73f5eccb560ee1dc38add2dfc793843279646449563fa4547850c144c3838de6fb1efaab7253aa6e99c1de56b4ddbad3905f480e4d46e5414c519465f08bedee98acac8fc7d2f84f413b041287538b09a811ee347b66a4c2c948f2ffa2f6e7674e0c5cb2b6407b610181af9064f870280fd7053482a91caa7cb762068ea41c4bb7bd2f7899f81a2ba5ab3fde28503a6fdc54b0fdee52cc2d02da76e1a4f1b4745&device_ratio=1&tn=B_NORMAL_MAP&nn=0&ie=utf-8&t=1627305062813'

# Headers copied from the same browser request so that it looks like a normal page visit
headers = {
    'Cookie': 'BIDUPSID=5FDDBE7E96E9CA6D71998093E123403A; PSTM=1627225875; BAIDUID=F934E08738623DF508F108DEF391CFB9:FG=1; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; BCLID_BFESS=8512773460870798959; BDSFRCVID_BFESS=5UPOJeC62l07libepqHRKmSPxe5rbsOTH6aoyt6boQjiS8lguPwkEG0PHf8g0Ku-S2EqogKKy2OTH9DF_2uxOjjg8UtVJeC6EG0Ptf8g0M5; H_BDCLCKID_SF_BFESS=tJk8_DPbJK-3fP36q4cBb-4WhmT22-us3g7W2hcH0b61EnR_XRQcbJ8LQ-Qi2lJTMITiaKJjBMb1DbRMLfjN5TODKf-DKb3pWDTm_q5TtUJMeCnTDMRh-l04XNbyKMnitIv9-pPKWhQrh459XP68bTkA5bjZKxtq3mkjbPbDfn028DKuDj-WDjJ0DGRf-b-X-I6b0nRH-njfebRNq4nKbICShG4tLlO9WDTm_DostI3SjJoNKbQ10xPD0n3OK6QHKj79-pPKKR7BfKQPhpQ8MqJbhMJtQnbW3mkjbpnDfn02OPKz0T5pKt4syPR8JfRnWn5RKfA-b4ncjRcTehoM3xI8LNj405OTbIFO0KJzJCcjqR8ZDTuBj55P; __yjs_duid=1_695635cb727c238e28cd4254a28a7a0e1627258379781; BAIDUID_BFESS=F934E08738623DF508F108DEF391CFB9:FG=1; __yjs_st=2_NDRiODllYWQzMjBiMzFhYTlmYWVjZTE4NjFkZTM5MmMwODhlZDE0MjVkYWVmMjIzMzc3MWI2Y2RlOTNkMWJkNDBhNmE2YTIyMTJlZjg0ODJiNzk0NDY2NTYxY2NkOGY5YjM5ODViMDAyZjAwY2E0MThjODUyMGM0N2JiMmEyZGEyMTA4ODdkNjViYjcwNDEwODhjNDkzNDg4YjQyMWNjYTI4ZjAzZDllYTg3YjE3ZDRiYWNlMmJkMzc3YjE1OGU5NWU4NjM3YWQxMjkwNDVkMmMyZTM1YTQ5ODgxNTA4ZjE3MDk2YTYwODg5MmY5ZTZlMmYxZGQ5ZTU1OTdkZGYxZV83X2VhYjhlOWZi; H_PS_PSSID=34300_34100_33969_34272_31254_33848_34282_26350_22158; delPer=0; PSINO=3; BA_HECTOR=002h218g2ka58g0lhq1gftcs10r; ab_sr=1.0.1_ZWRlNDJiMzk0ZWQ3YzZmYzgxMmQzOTIyZDBlN2FjZTIxNjIzODliZWE4MzZjZGEwZTBiMTIzNGRmNDhiYmM2NTJhZjI0ZjBkNTFlMjg4MWYxYmY3ZDMzMGVkNmQ1NTNhMDVkN2I1ZGViMDY2ZjBlNWJmOTk4NTBhZGIwOGU4OTg5YzNiM2QwZjVhMTFkYmQ0ODU2NTJkYzNkZmI0ZjI1MA==; PMS_JT=%28%7B%22s%22%3A1627305057015%2C%22r%22%3A%22https%3A//map.baidu.com/@11606355.22%2C4669275.88%2C5.4z%22%7D%29',
    'Referer': 'https://map.baidu.com/search/%E9%B8%BF%E6%98%9F%E5%B0%94%E5%85%8B/@11606355.22,4669275.88,5z?querytype=s&da_src=shareurl&wd=%E9%B8%BF%E6%98%9F%E5%B0%94%E5%85%8B&c=1&src=0&pn=0&sug=0&l=5&b=(6569474.192744261,1360353.0162781863;12256345.744431017,7177600.4441499)&from=webmap&biz_forward=%7B%22scaler%22:1,%22styles%22:%22pl%22%7D&seckey=cde6ebb241c3d75c675c8688828640edba33c570fc006f6ccdee864f2e95d88033fc19e794fee19c2417a6953ba260f3e91efa7e82cbc9c45b5854aec79ce924b08cce22526301f3a8c80710ebb635e73f5eccb560ee1dc38add2dfc793843279646449563fa4547850c144c3838de6fb1efaab7253aa6e99c1de56b4ddbad3905f480e4d46e5414c519465f08bedee98acac8fc7d2f84f413b041287538b09a811ee347b66a4c2c948f2ffa2f6e7674e0c5cb2b6407b610181af9064f870280fd7053482a91caa7cb762068ea41c4bb7bd2f7899f81a2ba5ab3fde28503a6fdc54b0fdee52cc2d02da76e1a4f1b4745&device_ratio=1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4573.0 Safari/537.36',
}

# Send the request and print the JSON body if it succeeded
resp = requests.get(url, headers=headers)
if resp.status_code == requests.codes.ok:
    print(resp.json())
The response looks like this:
Next, let's get each province and its store count. The response keeps the municipalities directly under the central government in a different place from the provinces, so we need to collect the data in two steps.
China has 34 provincial-level administrative regions, including 23 provinces, 5 autonomous regions, 4 municipalities directly under the central government and 2 special administrative regions. The 23 provinces are: Hebei, Shanxi, Liaoning, Jilin, Heilongjiang, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong, Henan, Hubei, Hunan, Guangdong, Hainan, Sichuan, Guizhou, Yunnan, Shaanxi, Gansu, Qinghai and Taiwan.
The five autonomous regions are Inner Mongolia Autonomous Region, Guangxi Zhuang Autonomous Region, Tibet Autonomous Region, Ningxia Hui Autonomous Region and Xinjiang Uygur Autonomous Region. The four municipalities directly under the central government are Beijing, Tianjin, Shanghai and Chongqing. The two special administrative regions are: Hong Kong Special Administrative Region and Macao Special Administrative Region.
In the JSON, the four municipalities directly under the central government are stored separately from the other cities: they appear in the hot_city list, as shown below:
We save the province-level information to an Excel file with the following code:
import pandas as pd
from icecream import ic

prov = []
value = []
data = resp.json()

# Get the four municipalities from the hot_city list ("name|count" strings).
# The names are written as rendered in this article; they must match the names in the response.
municipalities = ('Beijing', 'Shanghai', 'Tianjin', 'Chongqing City')
hot_city = data['hot_city']
for i in hot_city:
    pv = i.split('|')
    if any(m in pv[0] for m in municipalities):
        prov.append(pv[0])
        value.append(pv[1])

# Collect every province from the more_city list
city_list = data['more_city']
for item in city_list:
    # Province name
    province = item['province']
    prov.append(province)
    # Number of Hongxing Erke stores in that province
    prov_num = item['num']
    value.append(prov_num)

# Save the result to Excel
pd_data = pd.DataFrame({
    'province': prov,
    'quantity': value,
})
pd_data.to_excel('province.xlsx')
ic('Province information printing completed!')
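The same collection logic can be tried offline against a small hand-made payload. The sketch below assumes the structure implied by the code above: `hot_city` is a list of "name|count" strings and `more_city` is a list of dicts with `province` and `num` keys. The sample values are made up for illustration, and `collect_provinces` is a hypothetical helper name. It also converts every count to `int`, because the counts produced by splitting a string are themselves strings.

```python
MUNICIPALITIES = ("Beijing", "Shanghai", "Tianjin", "Chongqing")


def collect_provinces(payload):
    """Return (name, count) pairs for the municipalities and the provinces."""
    rows = []
    # Municipalities sit in hot_city as "name|count" strings.
    for entry in payload.get("hot_city", []):
        name, _, count = entry.partition("|")
        if any(m in name for m in MUNICIPALITIES):
            rows.append((name, int(count)))
    # Provinces sit in more_city as dicts.
    for item in payload.get("more_city", []):
        rows.append((item["province"], int(item["num"])))
    return rows


# Made-up sample data, not real store counts.
sample = {
    "hot_city": ["Beijing|12", "Guangzhou City|30", "Chongqing City|8"],
    "more_city": [
        {
            "province": "Guangdong Province",
            "num": 100,
            "city": [{"name": "Guangzhou City", "num": 30}],
        },
    ],
}

print(collect_provinces(sample))
```

Keeping the parsing in a function that takes a plain dict means it can be checked without sending a request each time.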
The Excel file contains the following data:
Similarly, we can obtain the number of Hongxing Erke stores in the individual cities of each province.
The full city list is made up of the popular cities (hot_city) plus the remaining cities (more_city).
city = []
value = []
data = resp.json()

# Get the popular cities from the hot_city list ("name|count" strings).
# The names are written as rendered in this article; they must match the names in the response.
popular = ('Guangzhou City', 'Chengdu', 'Nanjing City', 'Hangzhou', 'Wuhan', 'Shenzhen City')
hot_city = data['hot_city']
for i in hot_city:
    pv = i.split('|')
    if any(p in pv[0] for p in popular):
        city.append(pv[0])
        value.append(pv[1])

# Collect every city listed under each province in more_city
city_list = data['more_city']
for item in city_list:
    cities = item['city']
    for i in cities:
        # City name
        cit = i['name']
        city.append(cit)
        # Number of Hongxing Erke stores in that city
        city_num = i['num']
        value.append(city_num)

# Save the result to Excel
pd_data = pd.DataFrame({
    'city': city,
    'quantity': value,
})
pd_data.to_excel('city.xlsx')
ic('City information printing completed!')
The city data stored in Excel looks like this:
Data processing
We first use pandas to read and clean the data.
The main task is to strip suffixes such as "province" and "autonomous region" from the province names.
import pandas as pd
from icecream import ic

# Read the province file
pd_data = pd.read_excel('province.xlsx')
prov = pd_data['province'].tolist()
prov_num = pd_data['quantity'].tolist()

# Strip the suffixes to get short names.
# NOTE: the literals below are English renderings of the names. The [:2] slice in the
# last branch presumably relies on two-character Chinese short names in the raw data,
# so adapt the literals to the names your data actually contains.
name = []
for i in prov:
    if "province" in i:
        name.append(i.replace('province', ''))
    elif 'Inner Mongolia Autonomous Region' in i:
        name.append(i.replace('Autonomous Region', ''))
    else:
        name.append(i[:2])
ic(name)
ic(prov)
'''
2021-07-27 20:50:50.752477|name: ['Beijing','Shanghai','Tianjin','Chongqing','Guangdong','Zhejiang','Shandong','Jiangsu','Hebei','Anhui','Hunan','Sichuan','Fujian','Henan','Inner Mongolia','Shanxi','Guangxi','Guizhou','Heilongjiang','Hubei','Yunnan','Gansu','Liaoning','Shaanxi','Jiangxi','Jilin','Shanghai','Xinjiang','Tianjin','Ningxia','Hainan','Tibet','Qinghai']
2021-07-27 20:50:50.752477|prov: ['Beijing','Shanghai','Tianjin','Chongqing City','Guangdong Province','Zhejiang Province','Shandong Province','Jiangsu Province','Hebei Province','Anhui Province','Hunan Province','Sichuan Province','Fujian Province','Henan Province','Inner Mongolia Autonomous Region','Shanxi Province','Guangxi Zhuang Autonomous Region','Guizhou Province','Heilongjiang Province','Hubei province','Yunnan Province','Gansu Province','Liaoning Province','Shaanxi Province','Jiangxi Province','Jilin Province','Shanghai','Xinjiang Uygur Autonomous Region','Tianjin','Ningxia Hui Autonomous Region','Hainan ','Tibet Autonomous Region','Qinghai Province']
'''
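The `[:2]` slice in the last branch only makes sense for Chinese names, where the short form of most regions is two characters long. The sketch below restates the cleaning step under the assumption that the raw data carries Chinese region names such as 广东省 or 内蒙古自治区 (the listing above shows them in translation). `short_name` is a hypothetical helper, not part of the original code.

```python
def short_name(full_name):
    """Reduce a Chinese region name to its short form."""
    if "省" in full_name:
        # 广东省 -> 广东, 黑龙江省 -> 黑龙江
        return full_name.replace("省", "")
    if "内蒙古" in full_name:
        # Inner Mongolia is the one autonomous region with a three-character short name.
        return full_name.replace("自治区", "")
    # Municipalities and the other autonomous regions: keep the first two characters.
    return full_name[:2]


for region in ["广东省", "黑龙江省", "内蒙古自治区", "广西壮族自治区", "北京市"]:
    print(region, "->", short_name(region))
```

The three branches mirror the three branches of the loop above, which is why provinces are handled by replacement while the remaining regions are handled by slicing.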
Next, we use pyecharts to visualize the cleaned data:
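from pyecharts import options as opts
from pyecharts.charts import Map

# Pair each cleaned province name with its store count and draw the China map.
# The chart is stored in china_map so that the built-in map() is not shadowed.
china_map = (
    Map()
    .add("Quantity distribution", [list(z) for z in zip(name, prov_num)], "china")
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Distribution map of Hongxing Erke stores nationwide"),
        visualmap_opts=opts.VisualMapOpts(max_=500, is_piecewise=True),
    )
)
china_map.render('province.html')
ic('The provincial distribution map has been drawn!')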
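The `max_=500` ceiling in the map code is hard-coded. As one possible alternative, the upper bound can be derived from the data itself. The sketch below is plain Python with made-up counts; `visual_max` is a hypothetical helper that rounds the largest count up to the next multiple of a step, and its result could be passed as `max_`.

```python
import math


def visual_max(counts, step=100):
    """Round the largest count up to the next multiple of step (at least one step)."""
    largest = max((int(c) for c in counts), default=0)
    return max(step, math.ceil(largest / step) * step)


# Made-up [name, count] pairs in the same shape as the data passed to Map().add().
pairs = [["Guangdong", 430], ["Zhejiang", 215], ["Beijing", "12"]]
print(visual_max(count for _, count in pairs))
```

The `int()` conversion is there because counts read from mixed sources may arrive as strings.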
The rendered map looks like this:
City-level maps for individual provinces are drawn in the same way. We take Guangdong, which has the most stores, as an example; you can pick any other province as well.
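Selecting the cities of a single province can be sketched as follows, again assuming that each `more_city` entry has a `province` name and a `city` list of `name`/`num` dicts, as the city code earlier in the article implies. The payload is made-up sample data and `cities_of` is a hypothetical helper.

```python
def cities_of(payload, province):
    """Return [name, count] pairs for every city listed under one province."""
    for item in payload.get("more_city", []):
        if item["province"] == province:
            return [[c["name"], int(c["num"])] for c in item["city"]]
    return []


# Made-up sample data, not real store counts.
sample = {
    "more_city": [
        {
            "province": "Guangdong Province",
            "num": 70,
            "city": [
                {"name": "Dongguan City", "num": 40},
                {"name": "Foshan City", "num": 30},
            ],
        },
        {
            "province": "Zhejiang Province",
            "num": 20,
            "city": [{"name": "Ningbo City", "num": 20}],
        },
    ],
}

guangdong = cities_of(sample, "Guangdong Province")
print(sorted(guangdong, key=lambda pair: pair[1], reverse=True))
```

The resulting pairs have the same shape as the province data, so they could be handed to `Map().add()` with the province's map name in place of "china". Cities that the response lists only under `hot_city` would need to be merged in separately.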
Fetch data -> store data -> process data -> visualize data. The final result is as follows:
Once the data is visualized, the distribution is clear at a glance, which is far easier on the eye than reading through an Excel sheet and gets the point across with much less effort.
Summary
1. This article walks through how to use Python to fetch store data from Baidu Map, store and process it, and finally visualize it. Interested readers can try it out for themselves.
2. This article is for learning purposes only and must not be used for anything else!
3. If you like this article, please give it a like.
Original text: https://mp.weixin.qq.com/s/_cCY-aNtKLuEDWqKzbOAhQ
Technical exchange
You are welcome to repost, bookmark, like, and support this article!
A technical exchange group is now open, with more than 2,000 members. When you ask to join, the best note to include is: source + area of interest, which makes it easier to find like-minded friends.
- Method ①: send the picture below to WeChat, long-press it to scan, and reply "add group" to the official account;
- Method ②: add the WeChat ID dkl88191 with the note: from CSDN;
- Method ③: search WeChat for the official account "Python learning and data mining" and reply "add group".