Excel automation tutorial: integrating excel with Word through python to seamlessly generate automatic reports

1, Demand objectives

There is no doubt that Microsoft Excel and Word are the two most widely used software in business and non business circles. They are actually synonymous with the Word "work" itself. Usually, there is no week that doesn't let us ignite the combination of the two in one way or another and put their kindness into practice. Although generally speaking, automation is not required, sometimes automation is necessary. That is, when you want to generate a large number of charts, numbers, tables, and reports, it may become very boring if you choose a manual route. Well, it doesn't have to be like that. In fact, there is a way to create a pipeline in Python where you can seamlessly integrate the two, generate a spreadsheet in Excel, and then transfer the results to Word to generate reports almost immediately.

2, Implementation tools

1. Openpyxl

Get to know openpyxl, which is probably one of the most common bindings in Python, which makes the interface with Excel walk in the park. With it, you can read and write all current and legacy excel formats, namely xlsx and xls. Openpyxl allows you to populate rows and columns, execute formulas, create 2D and 3D charts, mark axes and titles, and a number of other functions that may come in handy. Most importantly, however, this package allows you to iterate over endless rows and columns in Excel, solving all the annoying number rattling and drawing you have to do before.

2. Python-docx

Then came Python docx, which is just like Openpyxl to Excel for Word. If you haven't studied their documentation yet, you probably should. Python docx is one of the simplest and most self-evident toolkits I've used since I started using Python itself. It allows you to automatically generate documents without any overhead by inserting text, filling out tables, and automatically rendering images into reports.
No more nonsense, let's create our own automation pipeline. Continue to start Anaconda (or any other IDE of your choice) and install the following packages:

pip install openpyxl
pip install python-docx

Microsoft Excel automation

Initially, we will load the created Excel Workbook (as shown below):

workbook = xl.load_workbook('Book1.xlsx')
sheet_1 = workbook['Sheet1']

Then, we will iterate over all the rows in the spreadsheet to calculate and insert the power value by multiplying the current by the voltage:

for row in range(2, sheet_1.max_row + 1):
    current = sheet_1.cell(row, 2)
    voltage = sheet_1.cell(row, 3)
    power = float(current.value) * float(voltage.value)
    power_cell = sheet_1.cell(row, 1)
    power_cell.value = power

After completion, we will use the calculated power value to generate a line graph, which will be inserted into the specified cell, as shown below:

values = Reference(sheet_1, min_row = 2, max_row = sheet_1.max_row, min_col = 1, max_col = 1)
chart = LineChart()
chart.y_axis.title = 'Power'
chart.x_axis.title = 'Index'
chart.add_data(values)
sheet_1.add_chart(chart, 'e2') 
workbook.save('Book1.xlsx')

Extract chart

Now that we have generated the chart, we need to extract it as an image so that we can use it in Word reports. First, we will declare the exact location of the Excel file and the location where the output chart image should be saved:

input_file = "C:/Users/.../Book1.xlsx"
output_image = "C:/Users/.../chart.png"

Then access the spreadsheet using the following methods:

operation = win32com.client.Dispatch("Excel.Application")
operation.Visible = 0
operation.DisplayAlerts = 0
workbook_2 = operation.Workbooks.Open(input_file)
sheet_2 = operation.Sheets(1)

You can then iterate over all the chart objects (if there are multiple) in the spreadsheet and save them in the specified location, as follows:

for x, chart in enumerate(sheet_2.Shapes):
    chart.Copy()
    image = ImageGrab.grabclipboard()
    image.save(output_image, 'png')
    pass
workbook_2.Close(True)
operation.Quit()

Microsoft Word automation

Now that we have generated the chart image, we must create a template document, which is basically a normal Microsoft Word document (. docx) completely formatted according to the appearance we want to report, including font, font size, format and page structure. Then, we just need to create placeholders for our automated content (that is, table values and images) and declare them with variable names, as shown below.


Any automated content can be declared in a pair of double curly braces {variable_name}}, including text and images. For a table, you need to create a table containing template rows of all columns, and then you need to attach a row above, add a row below, and use the following symbols:

first row

{%tr for item in variable_name %}

Last line

{%tr endfor %}

In the figure above, the variable name is

  • Python dictionary table_contents will store our table data
  • Dictionary key index (first column)
  • Power, current and voltage of dictionary values (second, third and fourth columns)
    Then, we import the template document into Python and create a dictionary to store the values of the table:
template = DocxTemplate('template.docx')
table_contents = []
for i in range(2, sheet_1.max_row + 1):
    table_contents.append({
        'Index': i-1,
        'Power': sheet_1.cell(i, 1).value,
        'Current': sheet_1.cell(i, 2).value,
        'Voltage': sheet_1.cell(i, 3).value
        })

Next, we will import the chart image previously generated by Excel and create another dictionary to instantiate all placeholder variables declared in the template document:

image = InlineImage(template,'chart.png',Cm(10))
context = {
    'title': 'Automated Report',
    'day': datetime.datetime.now().strftime('%d'),
    'month': datetime.datetime.now().strftime('%b'),
    'year': datetime.datetime.now().strftime('%Y'),
    'table_contents': table_contents,
    'image': image
    }

Finally, we will present the report using our value table and chart image:

template.render(context)
template.save('Automated_report.docx')

result

That's it, an automatically generated Microsoft Word report containing numbers and charts created in Microsoft Excel. With it, you have a fully automated pipeline for creating as many tables, charts, and documents as possible.

Project source code

import openpyxl as xl
from openpyxl.chart import LineChart, Reference

import win32com.client
import PIL
from PIL import ImageGrab, Image
import os
import sys

from docx.shared import Cm
from docxtpl import DocxTemplate, InlineImage
from docx.shared import Cm, Inches, Mm, Emu
import random
import datetime
import matplotlib.pyplot as plt


######## Generate automated excel workbook ########

workbook = xl.load_workbook('Book1.xlsx')
sheet_1 = workbook['Sheet1']
  
for row in range(2, sheet_1.max_row + 1):
    current = sheet_1.cell(row, 2)
    voltage = sheet_1.cell(row, 3)
    power = float(current.value) * float(voltage.value)
    power_cell = sheet_1.cell(row, 1)
    power_cell.value = power
  
values = Reference(sheet_1, min_row = 2, max_row = sheet_1.max_row, min_col = 1, max_col = 1)
chart = LineChart()
chart.y_axis.title = 'Power'
chart.x_axis.title = 'Index'
chart.add_data(values)
sheet_1.add_chart(chart, 'e2')
  
workbook.save('Book1.xlsx')


######## Extract chart image from Excel workbook ########

input_file = "C:/Users/.../Book1.xlsx"
output_image = "C:/Users/.../chart.png"

operation = win32com.client.Dispatch("Excel.Application")
operation.Visible = 0
operation.DisplayAlerts = 0
    
workbook_2 = operation.Workbooks.Open(input_file)
sheet_2 = operation.Sheets(1)
    
for x, chart in enumerate(sheet_2.Shapes):
    chart.Copy()
    image = ImageGrab.grabclipboard()
    image.save(output_image, 'png')
    pass

workbook_2.Close(True)
operation.Quit()


######## Generating automated word document ########

template = DocxTemplate('template.docx')

#Generate list of random values
table_contents = []

for i in range(2, sheet_1.max_row + 1):
    
    table_contents.append({
        'Index': i-1,
        'Power': sheet_1.cell(i, 1).value,
        'Current': sheet_1.cell(i, 2).value,
        'Voltage': sheet_1.cell(i, 3).value
        })

#Import saved figure
image = InlineImage(template,'chart.png',Cm(10))

#Declare template variables
context = {
    'title': 'Automated Report',
    'day': datetime.datetime.now().strftime('%d'),
    'month': datetime.datetime.now().strftime('%b'),
    'year': datetime.datetime.now().strftime('%Y'),
    'table_contents': table_contents,
    'image': image
    }

#Render automated report
template.render(context)
template.save('Automated_report.docx')

Tags: Python Excel

Posted on Tue, 02 Nov 2021 01:08:36 -0400 by raptoni