Pyecharts data visualization

pyecharts connects Python with the ECharts charting library, making it easy to generate charts directly from data in Python. With pyecharts you can generate standalone web pages, or integrate the charts into Flask and Django applications.

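Installing pyecharts is simple. The examples in this post use the pyecharts 0.5.x API (from pyecharts import Bar), so pin a 0.5.x release; from version 1.0 onward the chart classes are imported from pyecharts.charts and configured differently:

pip install pyecharts==0.5.11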

To export charts as image files, also install pyecharts_snapshot:

pip install pyecharts_snapshot

pyecharts works in the Jupyter Notebook environment under both Python 2 and Python 3. All charts render normally there, with the same interactive experience as in the browser, which makes it very powerful.

The three most commonly used charts in data analysis are the column chart, the line chart and the scatter chart. Let's start with examples of drawing these three in pyecharts.

1. Column chart

A column chart is suitable for comparing several groups of data.

from pyecharts import Bar

x = ["shirt", "cardigan", "Chiffon shirt", "trousers", "high-heeled shoes"]
y1 = [5, 20, 36, 10, 75]
y2 = [10, 25, 8, 60, 20]

bar = Bar(title = "Monthly sales of products",width = 600,height = 420)
bar.add(name = "business A", x_axis = x, y_axis = y1)
bar.add(name = "business B", x_axis = x, y_axis = y2,is_xaxis_boundarygap =True)

# Export drawing html file, which can be opened directly with browser
bar.render('Column diagram demonstration.html')
bar


2. Line chart

A line chart is suitable for describing the functional relationship between two variables.

from pyecharts import Line

x = ['2018-{:0>2d}'.format(s) for s in range(1,13)]
y1 = [5,10,26,30,35,30,20,26,40,46,40,50]
y2 = [8,20,24,36,40,36,40,45,50,53,48,58]

line = Line(title = "Total monthly sales",width = 600,height = 420)

line.add(name = "business A", x_axis = x, y_axis = y1,
         line_width = 3,line_color = 'red')
line.add(name = "business B", x_axis = x, y_axis = y2,
         yaxis_min = 0,yaxis_max = 100,is_xaxis_boundarygap = False,
         is_datazoom_show =True,line_width = 2,line_color = 'cyan')

line.render('Line chart demonstration.html')
line
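The x-axis labels in the line chart above come from the format specification {:0>2d}, which right-aligns the month number in a field of width 2 and pads it with zeros. A minimal sketch of that one step, using only the standard library (the helper name month_labels is introduced here for illustration):

```python
# Build zero-padded "year-month" labels like those used on the x-axis above.
def month_labels(year):
    # {:0>2d} formats an integer right-aligned in width 2, padded with "0".
    return ['{}-{:0>2d}'.format(year, month) for month in range(1, 13)]

labels = month_labels(2018)
print(labels[0], labels[-1])
```

Zero-padding keeps the labels the same width, so they also sort correctly as plain strings.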

3. Scatter chart

A scatter chart is suitable for showing how a large number of samples are distributed across several attributes.

from pyecharts import Scatter
import pandas as pd 

dfboy = pd.DataFrame()
dfboy['weight'] = [56,67,65,70,57,60,80,85,76,64]
dfboy['height'] = [162,170,168,172,168,172,180,176,178,170]

dfgirl = pd.DataFrame()
dfgirl['weight'] = [50,62,60,70,57,45,62,65,70,56]
dfgirl['height'] = [155,162,165,170,166,158,160,170,172,165]

scatter = Scatter(title = "Physical data",width = 600,height = 420)
scatter.add(name = "boy", x_axis = dfboy['weight'], y_axis = dfboy['height'])
scatter.add(name = "girl", x_axis = dfgirl['weight'], y_axis = dfgirl['height'],
           yaxis_min = 130,yaxis_max = 200,xaxis_min = 30,xaxis_max = 100)

scatter.render("Scatter diagram demonstration.html")
scatter

When the samples have more than two attributes, a scatter chart can use the color or size of the points to express the additional dimensions. The following example uses point size to represent the third dimension.

from pyecharts import Scatter
import pandas as pd 

def custom_formatter(params):
    return (params.value[3] + ':' +
             str(params.value[0]) +','
             +str(params.value[1]) + ','
             +str(params.value[2]))

df = pd.DataFrame()
df['country'] = ["China",'U.S.A','Germany','France','Britain','Japan','Russia','India','Australia','Canada']
df['life-expectancy'] = [76.9,79.1,81.1,81.9,81.4,83.5,73.13,66.8,81.8,81.7]
df['capita-gdp'] = [13334,53354,44053,37599,38225,36162,23038,5903,44056,43294]
df['population'] = [1376048943,321773631,80688545,64395345,64715810,126573481,143456918,
                    1311050527,23968973,35939927]

scatter = Scatter(title = "Development level of each country",width = 600,height = 420)
scatter.add(name = '',
            x_axis = df['capita-gdp'],  # params.value[0]
            y_axis = df['life-expectancy'], # params.value[1]
            extra_data = df['population'].values.tolist(), # params.value[2]
            extra_name = df['country'].values.tolist(), # params.value[3]
            tooltip_formatter=custom_formatter,  # custom tooltip content
            is_visualmap=True, 
            visual_orient="horizontal",
            visual_type = 'size',  # can be 'size' or 'color'
            visual_dimension=2,  # point size encodes params.value[2] (population)
            visual_range=[20000000, 1500000000],
           )
scatter
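With visual_type = 'size', the point size grows with the value in the chosen dimension, and visual_range gives the data values that correspond to the smallest and largest symbols. The sketch below illustrates the idea with a plain linear mapping. The pixel bounds (10 and 50), the clamping of out-of-range values and the function name value_to_size are assumptions made for illustration; the actual scaling is done by ECharts in the browser and may differ.

```python
# Illustrative linear mapping from a data value to a symbol size.
# size_range is a hypothetical pair of pixel bounds, not a pyecharts default.
def value_to_size(value, visual_range, size_range=(10, 50)):
    low, high = visual_range
    size_min, size_max = size_range
    # Clamp values that fall outside the visual range (a simplification).
    clamped = min(max(value, low), high)
    fraction = (clamped - low) / (high - low)
    return size_min + fraction * (size_max - size_min)

visual_range = [20000000, 1500000000]
for population in (23968973, 321773631, 1376048943):
    print(population, round(value_to_size(population, visual_range), 1))
```

Under this mapping a country with a population near the lower bound gets a symbol close to the minimum size, which is why visual_range should roughly bracket the data.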

4. Box plot

A box plot is suitable for showing the statistical distribution of a group of data: its maximum, minimum, median, and upper and lower quartiles. A more advanced variant is the violin plot, which also shows an estimated density curve of the data and can be drawn with seaborn.

from pyecharts import Boxplot
x =['1 class','2 class','3 class','4 class']
y1=[78, 98, 56, 78, 90.0, 45, 78, 20, 87, 86, 74, 89, 94]
y2=[89, 82, 45, 67, 68, 78.0, 79, 98, 71, 56, 78, 81, 80]
y3=[90, 80, 60, 89, 76, 73.0, 72, 92, 89, 87, 65, 66, 76]
y4=[82, 72, 55, 100, 90.0, 78, 69, 67, 87, 66, 78, 71, 82]
box = Boxplot(title = 'Test result box chart',width = 600,height = 420)
# prepare_data computes the minimum, lower quartile, median, upper quartile and maximum of each group
y_prepared = box.prepare_data([y1,y2,y3,y4]) 
box.add(name = '',x_axis = x,y_axis = y_prepared)
box
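To make the preprocessing step concrete, here is a minimal sketch of a five-number summary computed with the standard library. It is not the pyecharts implementation: the helper name five_number_summary is hypothetical, and quartile conventions differ between libraries, so the quartiles may not match the output of prepare_data exactly.

```python
import statistics

# Five-number summary: minimum, lower quartile, median, upper quartile, maximum.
def five_number_summary(values):
    # method='exclusive' is one common quartile convention (Python 3.8+).
    q1, median, q3 = statistics.quantiles(values, n=4, method='exclusive')
    return [min(values), q1, median, q3, max(values)]

y1 = [78, 98, 56, 78, 90.0, 45, 78, 20, 87, 86, 74, 89, 94]
summary = five_number_summary(y1)
print(summary)
```

Each group of scores collapses to five numbers, which is all a box plot needs to draw one box with its whiskers.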

Appendix: drawing a violin plot with seaborn

import pandas as pd
import seaborn as sns
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# Set the plot style
sns.set(style="white", context="notebook")
# Use a font that can display Chinese characters
sns.set_style({'font.sans-serif':['simhei', 'Arial']}) 

dfdata = pd.DataFrame()
dfdata['score'] = y1 + y2 + y3 + y4
dfdata['class'] = ['1 class']*len(y1)+['2 class']*len(y2)+['3 class']*len(y3)+['4 class']*len(y4)

ax = sns.violinplot(x = 'class', y = 'score', data = dfdata,
            palette = 'hls',  # color palette
            inner = 'box'  # inner display type: "box", "quartile", "point", "stick" or None
           )
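The outline of a violin plot is a kernel density estimate of the data. The sketch below shows the idea with a Gaussian kernel written in plain Python. The fixed bandwidth of 8.0 and the function name gaussian_kde are assumptions for illustration; seaborn chooses its bandwidth automatically, so its curve will not be identical.

```python
import math

# Gaussian kernel density estimate: an average of bell curves centred on each sample.
def gaussian_kde(samples, bandwidth):
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

scores = [78, 98, 56, 78, 90.0, 45, 78, 20, 87, 86, 74, 89, 94]
density = gaussian_kde(scores, bandwidth=8.0)  # bandwidth chosen by hand
for x in (20, 50, 80):
    print(x, round(density(x), 4))
```

A smaller bandwidth gives a bumpier curve that follows individual scores, while a larger one smooths the distribution out.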

5. Word cloud

from pyecharts import WordCloud

words = ['python','jupyter','numpy','pandas','matplotlib','sklearn',
        'xgboost','lightGBM','simpy','keras','tensorflow',
         'hive','hadoop','spark']
counts = [100,90,65,95,50,60,70,70,20,70,80,80,60,60]

cloud = WordCloud(title = 'Common tools of data algorithm',width = 600,height = 420)
cloud.add(name = 'utils',attr = words,value = counts,
          shape = "circle",word_size_range = (10,70))
cloud
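The word and count lists above are written by hand. When starting from raw text, they could be derived with collections.Counter from the standard library. The sample text and variable names below are made up for illustration:

```python
from collections import Counter

# Hypothetical input text; in practice this would come from a document or log.
text = "python pandas python numpy pandas python"

# Count each word, then split the (word, count) pairs into two parallel lists.
pairs = Counter(text.split()).most_common()
words = [word for word, _ in pairs]
counts = [count for _, count in pairs]
print(words, counts)
```

The two lists have the same shape as the attr and value arguments passed to cloud.add above.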

6. Geographic coordinate system map

A geographic coordinate map is suitable for showing how data is distributed across countries, provinces, cities, or longitude and latitude positions.

In pyecharts, Geo presents data associated with cities, and Map presents data associated with countries and provinces.

# Install the map add-on packages
!pip install echarts-countries-pypkg
!pip install echarts-china-provinces-pypkg
!pip install echarts-china-cities-pypkg

Example: national city map

from pyecharts import Geo

# City names are kept in Chinese because Geo looks up coordinates by Chinese name;
# the comment above each row gives the English names.
data = [
    # Haimen, Ordos, Zhaoyuan, Zhoushan, Qiqihar, Yancheng
    ("海门", 9),("鄂尔多斯", 12),("招远", 12),("舟山", 12),("齐齐哈尔", 14),("盐城", 15),
    # Huizhou, Jiangyin, Penglai, Shaoguan, Jiayuguan, Guangzhou
    ("惠州", 37),("江阴", 37),("蓬莱", 37),("韶关", 38),("嘉峪关", 38),("广州", 38),
    # Zhangjiagang, Sanmenxia, Jinzhou, Nanchang, Liuzhou, Sanya
    ("张家港", 52),("三门峡", 53),("锦州", 54),("南昌", 54),("柳州", 54),("三亚", 54),
    # Hohhot, Chengdu, Datong, Zhenjiang, Guilin, Zhangjiajie
    ("呼和浩特", 58),("成都", 58),("大同", 58),("镇江", 59),("桂林", 59),("张家界", 59),
    # Beijing, Xuzhou, Hengshui, Baotou, Mianyang, Urumqi; Heze, Hefei, Wuhan, Daqing
    ("北京", 79),("徐州", 79),("衡水", 80),("包头", 80),("绵阳", 80),("乌鲁木齐", 84),
    ("菏泽", 194),("合肥", 229),("武汉", 273),("大庆", 279)]

geo = Geo(
    "Air quality in some cities in China",
    title_color="#fff",
    title_pos="center",
    width=800,
    height=600,
    background_color="#404a59",
)
attr, value = geo.cast(data)
geo.add(
    "",
    attr,
    value,
    visual_range=[0, 200],
    visual_text_color="#fff",
    symbol_size=15,
    is_visualmap=True,
)
geo
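In the example above, geo.cast(data) splits the list of (city, value) pairs into a list of names and a list of values. A rough standard-library equivalent for a list of pairs is sketched below. The helper name split_pairs is hypothetical, and the cast method in pyecharts may accept other input shapes as well.

```python
# Split [(name, value), ...] into two parallel lists.
def split_pairs(pairs):
    if not pairs:
        return [], []
    # zip(*pairs) transposes the pairs into a tuple of names and a tuple of values.
    names, values = zip(*pairs)
    return list(names), list(values)

attr, value = split_pairs([("Beijing", 79), ("Wuhan", 273)])
print(attr, value)
```

Keeping the data as pairs makes it easy to read and edit, while the chart API receives the two parallel lists it expects.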

Example: national province map

from pyecharts import Map
value = [155, 10, 66, 78, 44, 38, 88, 50, 20]
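# Province names stay in Chinese so that they match the region names of the china map:
# Fujian, Shandong, Beijing, Shanghai, Jiangxi, Xinjiang, Inner Mongolia, Yunnan, Chongqing
attr = ["福建","山东","北京","上海","江西","新疆","内蒙古","云南","重庆"]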
m = Map("National Province Map", width=600, height=400)
m.add("", attr, value, maptype='china',
        is_visualmap=True, 
        is_piecewise=True,
        visual_text_color="#000",
        visual_range_text=["", ""],
        pieces=[
            {"max": 160, "min": 81, "label": "high"},
            {"max": 80, "min": 51, "label": "medium"},
            {"max": 50, "min": 0, "label": "low"},
        ])
m
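The pieces option in the example above assigns each value to a labelled interval. The sketch below reproduces that classification in plain Python to show which provinces end up in which class. The function name piece_label and the inclusive bounds are assumptions for illustration, and the labels are illustrative; the actual colouring is handled by ECharts.

```python
# Intervals in the same shape as the pieces option.
pieces = [
    {"max": 160, "min": 81, "label": "high"},
    {"max": 80, "min": 51, "label": "medium"},
    {"max": 50, "min": 0, "label": "low"},
]

def piece_label(value, pieces):
    # Return the label of the first interval containing value, else None.
    for piece in pieces:
        if piece["min"] <= value <= piece["max"]:
            return piece["label"]
    return None

values = [155, 10, 66, 78, 44, 38, 88, 50, 20]
labels = [piece_label(v, pieces) for v in values]
print(labels)
```

Note that integer bounds such as 80 and 81 leave a gap for fractional values, which this sketch reports as None.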

Example: world map

from pyecharts import Map
countries= ["China", "Canada", "India", "Russia", "United States","Japan"]
capita_gdp = [13334, 43294, 5903, 23038, 53354,36162]
population = [1376048943, 35939927, 1311050527, 143456918, 321773631,126573481]
life_expectancy = [76.9,81.7,66.8,73.13,79.1,83.5]

m = Map("World economic development level", width=800, height=500)
m.add(
    "per capita GDP",
    attr = countries,
    value = capita_gdp,
    maptype="world",
    is_visualmap=True,
    visual_range = [5000,60000],
    visual_text_color="#000",
    is_map_symbol_show=False,
    visual_orient="horizontal"
)
m

These are the basic chart types. Overall, pyecharts is a very powerful visualization library!

Posted on Mon, 22 Nov 2021 17:00:16 -0500 by Brian Swan