Python data analysis - drawing-2-Seaborn advanced drawing-2-diagram

1, Scatter plot: scatterplot

Function: seaborn.scatterplot

Common parameters:

x, y: array, str, or Series. The input variables. A string must be a column name in data; passing a Series puts its name on the axis label.
data: DataFrame. The dataset used for plotting.
hue: a variable name in data. Groups the points by color.
size: a variable name in data. Groups the points by marker size; the variable can be categorical or numeric.
sizes: list, dict, or tuple. Sets the marker sizes. A list or dict maps each level to a size, while a (min, max) tuple sets the range that the sizes are scaled into.
style: a variable name in data. Groups the points by marker shape.
markers: bool, list, or dict. Sets the marker used for each level of style.
alpha: float. The transparency of the points. Older seaborn releases also accepted "auto", which was the default there.
legend: "brief", "full", or False. How the legend is drawn. The default was "brief" in older releases; newer releases default to "auto".
palette: a palette name, list, or dict. Changes the colors used for the levels of hue.
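The two forms of sizes described above can be sketched as follows. This is a minimal illustration on a made-up DataFrame (the column names x, y, group, and weight are hypothetical, not part of the tips dataset), and it assumes a non-interactive matplotlib backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for a real dataset
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 1, 4, 3, 6, 5],
    "group": ["a", "a", "b", "b", "c", "c"],
    "weight": [1, 2, 3, 4, 5, 6],
})

fig, (ax_range, ax_dict) = plt.subplots(1, 2, figsize=(8, 3))

# sizes as a (min, max) tuple: the numeric values of weight are scaled into this range
sns.scatterplot(x="x", y="y", size="weight", sizes=(20, 200), data=df, ax=ax_range)

# sizes as a dict: each level of the categorical variable gets an explicit size
sns.scatterplot(x="x", y="y", size="group",
                sizes={"a": 30, "b": 80, "c": 150}, data=df, ax=ax_dict)
```

The tuple form suits numeric variables, where only the overall range matters; the dict form gives full control when there are just a few categories.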

1. Basic mapping of two variables

import seaborn as sns
from matplotlib import pyplot as plt
tips=sns.load_dataset('tips')
#Note: load_dataset downloads the data, and the download can fail if the server rejects the request.
#In that case, download the dataset from https://github.com/mwaskom/seaborn-data into the seaborn-data folder and run the statement again.

#Configure matplotlib so that Chinese characters render correctly
plt.rcParams['font.sans-serif']='SimHei'
plt.rcParams['axes.unicode_minus']=False
#Inspect the data
tips.head()
>

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

#Draw the scatter plot
ax=sns.scatterplot(x='total_bill',y='tip',data=tips)
ax.set_title('Scatter plot of total bill vs. tip')

2. Group the points by adding a third variable

(1) Color the points and vary the markers to show a grouping variable

sns.scatterplot(x='total_bill',y='tip',hue='time',style='time',data=tips)

(2) Show a quantitative variable by varying the point size and using a continuous color scale

sns.scatterplot(x='total_bill',y='tip',size='size',hue='size',data=tips)

(3) Use custom markers for each group

markers={'Lunch':"o","Dinner":"X"}
sns.scatterplot(x="total_bill",y="tip",style="time",hue="time",markers=markers,data=tips,palette='Set2')

 

(4) You can also pass matplotlib keyword arguments to scatterplot to control the plot elements

sns.scatterplot(x="total_bill",y="tip",data=tips,color='red')
plt.title('Scatter plot of total bill vs. tip')

2, Line chart: lineplot  

Function: seaborn.lineplot

Common parameters:

x, y: the input variables, as in scatterplot.
data: DataFrame. The dataset used for plotting.
dashes: bool, list, or dict. Sets the dash pattern used for each level of style.
estimator: the name of a pandas method, a callable, or None. How to aggregate multiple y values at the same x level; None draws every observation.
ci: int, "sd", or None. The size of the confidence interval drawn around the estimate; "sd" draws the standard deviation of the data instead. seaborn 0.12 and later deprecate ci in favor of errorbar.
n_boot: int. The number of bootstrap resamples used to compute the confidence interval.
sort: bool. If True, the data are sorted by the x and y variables; otherwise the line connects the points in the order they appear.
err_style: "band" or "bars". Whether to draw the confidence interval as a translucent error band or as discrete error bars. The default is "band".
err_kws: dict. Additional keyword arguments that control the appearance of the error band or bars.

Use the fmri dataset to draw a line chart in which color and line style show a categorical variable.

fmri=sns.load_dataset('fmri')
sns.lineplot(x="timepoint",y="signal",hue="event",style="event",data=fmri)

  Mark each data point on the lines with a marker:

sns.lineplot(x="timepoint",y="signal",hue="event",style="event",markers=True,data=fmri)

Change how the error is drawn and the size of the confidence interval:

#ci is deprecated in seaborn 0.12 and later; use errorbar=('ci',80) there
sns.lineplot(x="timepoint",y="signal",hue="event",style="event",err_style="bars",ci=80,data=fmri)

 

Use the dots dataset to draw a line chart in which the size variable controls the line width, then set the sizes parameter to change the range of widths.

dots=sns.load_dataset('dots')
dots.head()
>
  align choice  time  coherence  firing_rate
0  dots     T1   -80        0.0    33.189967
1  dots     T1   -80        3.2    31.691726
2  dots     T1   -80        6.4    34.279840
3  dots     T1   -80       12.8    32.631874
4  dots     T1   -80       25.6    35.060487

ax1=sns.lineplot(x="time",y="firing_rate",size="coherence",hue="choice",style="align",data=dots,palette="Set1")
ax1.set_title("Default lineweight")

 

  Custom lineweight:

ax2=sns.lineplot(x="time",y="firing_rate",size="coherence",sizes=(0.5,1.5),hue="choice",style="align",data=dots,palette="Set1")
ax2.set_title("Custom lineweight")

  3, Faceted drawing: relplot

relplot is a figure-level interface to scatterplot and lineplot: it draws either kind of relational plot across a grid of several subplots (facets).

Function: seaborn.relplot

Common parameters:

x, y: the input variables, as in scatterplot.
data: DataFrame. The dataset used for plotting.
row, col: variable names in data. Categorical variables that determine the rows and columns of the facet grid.
row_order, col_order: list. The order in which the levels of row and col are laid out.
kind: "scatter" or "line". Selects the underlying plotting function. The default is "scatter".
height: scalar. The height of each facet, in inches. The default is 5.
aspect: scalar. The aspect ratio of each facet, so that aspect * height gives the width of each facet. The default is 1.
facet_kws: dict. Other keyword arguments passed to FacetGrid. The default is None.
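How row, col, height, and aspect combine can be sketched on a small made-up DataFrame (the columns x, y, r, and c are hypothetical). The sketch leaves out hue so that no legend is added, and it assumes a seaborn release recent enough to expose the figure attribute on the returned grid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import pandas as pd
import seaborn as sns

# Hypothetical toy data with two row levels and two column levels
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [2, 3, 1, 5, 4, 7, 6, 8],
    "r": ["r1", "r1", "r1", "r1", "r2", "r2", "r2", "r2"],
    "c": ["c1", "c1", "c2", "c2", "c1", "c1", "c2", "c2"],
})

# One facet per (row level, column level) pair; each facet is
# height inches tall and height * aspect inches wide
g = sns.relplot(x="x", y="y", row="r", col="c", height=2, aspect=1.5, data=df)

grid_shape = g.axes.shape                      # 2 rows by 2 columns of axes
fig_width, fig_height = g.figure.get_size_inches()
print(grid_shape, fig_width, fig_height)
```

relplot returns a FacetGrid rather than a single Axes, so titles and labels are set through the grid (for example with g.set_axis_labels) instead of through ax.set_title.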

Using the tips dataset, first draw a scatter plot with a single facet:

sns.set(style='ticks')
sns.relplot(x="total_bill",y="tip",hue="day",data=tips)

  To draw a grid of plots, pass the categorical variables time and smoker to row and col.

sns.set(style='ticks')
sns.relplot(x="total_bill",y="tip",hue="sex",row="time",col="smoker",data=tips)

The col_wrap parameter sets the maximum number of columns before the facets wrap onto a new row:

sns.set(style='ticks')
sns.relplot(x="total_bill",y="tip",hue="sex",col="smoker",col_wrap=1,data=tips)

 

  Draw a grid of line charts using the fmri dataset:

sns.relplot(x="timepoint",y="signal",col="event",data=fmri,kind="line")

 

Tags: Python Data Analysis data visualization

Posted on Fri, 03 Dec 2021 02:08:24 -0500 by dotbands