Make visual Kanban of sales data with Python to display and analyze in one step!

Hello, I'm little F ~

In the data age, the importance of sales data analysis is needless to say.

Only through accurate analysis of sales data can we find out the causes of data changes (growth or decline).

Then it will be possible to solve problems and find new growth points!

Today, I'd like to introduce a method of making a large sales data screen in Python.

It is mainly built using Python's Streamlit library, plot library and Pandas library.

Pandas processes data, Plotly makes visual charts, and Streamlit builds visual pages.

For the above three libraries, Streamlit library may be unfamiliar to you. Let me briefly introduce it.

Streamlit is a completely free open source application framework. It can help you quickly create a cool Web page without understanding complex HTML, CSS and other front-end technologies.

Document address

https://docs.streamlit.io/

Specifically, you can go to see the document to learn and practice more.

Now let's explain how to build a sales data Kanban ~

01. Data

The data used is fictitious data. There are 1000 orders in 2021 sales order data of a supermarket.

There are three cities, namely Beijing, Shanghai and Hangzhou. There are two types of customers, members and ordinary customers. Customers are male and female.

The rest also includes order number, commodity type, unit price, quantity, total price, date, time, payment method, cost, gross profit margin, total revenue, score and other information.

Universal Pandas read_excel method to read data.

Skip the first three rows, select columns B to R, and 1000 rows of data.

def get_data_from_excel():
    df = pd.read_excel(
        io="supermarkt_sales.xlsx",
        engine="openpyxl",
        sheet_name="Sales",
        skiprows=3,
        usecols="B:R",
        nrows=1000,
    )
    # Add hour column data
    df["hour"] = pd.to_datetime(df["time"], format="%H:%M:%S").dt.hour
    return df

df = get_data_from_excel()
print(df)

The data was successfully read and the results are as follows.

Now you can write the page.

02. Page title and Icon

We all know that when a browser opens a web page, there will be a title and icon.

Therefore, we need to set the name, icon and layout of this web page first.

This is also the first Streamlit command used to build a page using Streamlit, and can only be set once.

# Set Web page information 
st.set_page_config(page_title="Sales data screen", page_icon=":bar_chart:", layout="wide")

Where page_ The icon parameter can use emoticon codes to display icons.

The short code can be obtained from the link below.

https://www.webfx.com/tools/emoji-cheat-sheet/

Proper emoticon code!

03. Sidebar and multi selection box

St.sidebar (sidebar), each element passed to st.sidebar will be fixed on the left, so that users can focus on the content of the home page.

Multiselect is an interactive component through which data can be filtered.

# sidebar 
st.sidebar.header("Please filter here:")
city = st.sidebar.multiselect(
    "Select city:",
    options=df["city"].unique(),
    default=df["city"].unique()
)

customer_type = st.sidebar.multiselect(
    "Select customer type:",
    options=df["Customer type"].unique(),
    default=df["Customer type"].unique(),
)

gender = st.sidebar.multiselect(
    "Choose gender:",
    options=df["Gender"].unique(),
    default=df["Gender"].unique()
)

df_selection = df.query(
    "city == @city & Customer type ==@customer_type & Gender == @gender"
)

Combined with the query of Pandas, the data can be filtered.

Through the above code, it is built successfully, as shown on the left side of the figure below.

Click the close symbol in the upper right corner of the sidebar to hide the sidebar.

The page will display the main page.

04. Main page information

Next, write the main page information, including the main page title, total sales, average score and average sales information.

Like the icons of web pages, it is realized through emoticon code.

# Main page
st.title(":bar_chart: Sales data screen")
st.markdown("##")

# Core indicators, total sales, average score, star rating and average sales data
total_sales = int(df_selection["Total price"].sum())
average_rating = round(df_selection["score"].mean(), 1)
star_rating = ":star:" * int(round(average_rating, 0))
average_sale_by_transaction = round(df_selection["Total price"].mean(), 2)


# 3-column layout
left_column, middle_column, right_column = st.columns(3)

# Add relevant information
with left_column:
    st.subheader("Total sales:")
    st.subheader(f"RMB {total_sales:,}")
with middle_column:
    st.subheader("Average score:")
    st.subheader(f"{average_rating} {star_rating}")
with right_column:
    st.subheader("Average sales:")
    st.subheader(f"RMB {average_sale_by_transaction}")

# Separator
st.markdown("""---""")

Complete the processing of core index data and display it in layout.

05. Main page chart

It contains two charts, one is the hourly sales and the other is the total sales of various commodities. Complete the drawing of the chart through plot express.

Plotly Express is a new high-level Python visualization library. It is a high-level package of Plotly.py. It provides a simple syntax for complex diagrams.

Document address

https://plot.ly/python/plotly-express/

Inspired by Seaborn and ggplot2, it is specially designed to have a concise, consistent and easy to learn API. With just one import, you can create rich interactive drawings in a function call.

# Sales of various commodities (histogram)
sales_by_product_line = (
    df_selection.groupby(by=["Commodity type"]).sum()[["Total price"]].sort_values(by="Total price")
)
fig_product_sales = px.bar(
    sales_by_product_line,
    x="Total price",
    y=sales_by_product_line.index,
    orientation="h",
    title="<b>Total sales of each commodity</b>",
    color_discrete_sequence=["#0083B8"] * len(sales_by_product_line),
    template="plotly_white",
)
fig_product_sales.update_layout(
    plot_bgcolor="rgba(0,0,0,0)",
    xaxis=(dict(showgrid=False))
)

# Hourly sales (histogram)
sales_by_hour = df_selection.groupby(by=["hour"]).sum()[["Total price"]]
print(sales_by_hour.index)
fig_hourly_sales = px.bar(
    sales_by_hour,
    x=sales_by_hour.index,
    y="Total price",
    title="<b>Total sales per hour</b>",
    color_discrete_sequence=["#0083B8"] * len(sales_by_hour),
    template="plotly_white",
)
fig_hourly_sales.update_layout(
    xaxis=dict(tickmode="linear"),
    plot_bgcolor="rgba(0,0,0,0)",
    yaxis=(dict(showgrid=False)),
)


left_column, right_column = st.columns(2)
left_column.plotly_chart(fig_hourly_sales, use_container_width=True)
right_column.plotly_chart(fig_product_sales, use_container_width=True)

Add data, set chart configuration, and web page layout.

The results are as follows.

06. Hidden parts

When we build an interface through Streamlit, there will be a red line, menu and "Make with Streamlit" at the end by default.

For the sake of beauty, they can be hidden here.

# Hide streamlit default format information
hide_st_style = """
            <style>
            #MainMenu {visibility: hidden;}
            footer {visibility: hidden;}
            header {visibility: hidden;}
            </style>
            """

st.markdown(hide_st_style, unsafe_allow_html=True)

Such an interactive sales data Kanban is completed!

# Install dependent Libraries
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple plotly==4.14.3
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pandas==1.1.0
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple streamlit==0.86.0
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple openpyxl==3.0.6

# function
streamlit run app.py

Install the relevant dependencies and run the program from the command line terminal.

Posted on Tue, 23 Nov 2021 04:58:31 -0500 by Techissue2008