python deals with word and excel files

python processing word and excel table data

In the process of learning, we encounter the problem of filling excel table data into word in batches. We are not familiar with the use of Excel functions. python has a library for processing word and excel, so we learn to use python to realize the batch processing of data. First of all, we need to understand the specific operation of word and excel.

python to edit word table

The third-party library Python docx needs to be installed

Python docx API website: https://python-docx.readthedocs.io/en/latest/#api-documentation

Basic operations on tables

# -*- coding: utf-8 -*-
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.oxml.ns import qn
from docx.shared import Pt

path=r'C:\Users\hxy\Desktop\test.docx'#File path
doc=Document(path)#Read in file
#Set table font
doc.styles['Normal'].font.name = u'Imitation of Song Dynasty'
doc.styles['Normal']._element.rPr.rFonts.set(qn('w:eastAsia'), u'Imitation of Song Dynasty')

#Get all tables and return to the list
tables=doc.tables
print(tables)

#Read the contents of the first table, get all the contents of the table and add a cycle
table_1=tables[0]#First table
tb_rows=table_1.rows#Get table rows
#Read each line s
for i in range(len(tb_rows)):
    row_data=[]
    row_cells=tb_rows[i].cells
    #Read cell contents of each row
    for cell in row_cells:
        #Cell content
        row_data.append(cell.text)
    print(row_data)
#There are two ways to fill data in the table
table_1.cell(0,1).text="Test use"#Cell direct assignment
table_1.cell(0,3).paragraphs[0].add_run(u'whoami')#Append write data
#Cell property modification
run=table_1.cell(0,5).paragraphs[0].add_run("computer")
run.bold=True#Bold font
run.font.size = Pt(9)  # Font size: small five-9
table_1.cell(0,5).paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.CENTER  # Center

doc.save(path)#file save

python's reading and writing of excel

There are xlrd (can only read in, can't write), xlwd in python, and openpyxl module can read and write excel.

Both xlrd and xlwt are for Excel97-2003 operations, which are xls terminated files. It is obvious that the current versions are basically Excel2007 and above, with xlsx as the suffix. To operate this type of Excel file, openpyxl is used. The module can not only "read" or "write" operations, but also modify existing files.

xlrd website: https://pypi.org/project/xlrd/

Reading excel form by xlrd module

# -*- coding: utf-8 -*-

import xlrd
excelDir=r'C:\Users\hxy\Desktop\test.xlsx'
# Open table
workbook=xlrd.open_workbook(excelDir)
# Get table name
worksheet_name=workbook.sheet_names()
print(worksheet_name)
# Open table by sequence number
worksheet=workbook.sheet_by_index(0)
# Get table name, row, column
print(worksheet.name,worksheet.nrows,worksheet.ncols)
# Get the first line
rows=worksheet.row_values(0)
print(rows)
# Get cell value
data_1=worksheet.cell_value(1,7)
data_2=worksheet.cell(1,0).value
data_3=worksheet.row(1)[2].value
print(data_1,data_2,data_3)
# print(worksheet.cell(1,2))

#Read the table contents of the specified worksheet
for i in range(worksheet.nrows):
    row=worksheet.row_values(i)
    print(row)

openpyxl website: https://bitbucket.org/openpyxl/openpyxl/src/default/

Using openpyxl to read and write excel

import openpyxl

excelDir=r'C:\Users\hxy\Desktop\test.xlsx'
#Open workbook object
workbook=openpyxl.load_workbook(excelDir)
#Get all tables in Workbook
sheetnames=workbook.sheetnames
print(sheetnames)

#Get sheet object
# sheetname=workbook[sheetnames[0]]
sheetname=workbook.worksheets[0]
print(sheetname)

#Get table properties
name=sheetname.title
print(name)
rows,cols=sheetname.max_row,sheetname.max_column
print(rows,cols)

#Get data in a table by row or column
for row in sheetname.rows:
    for cell in row:
        print(cell.value,end=" ")
    print()
print("=="*30)
for col in sheetname.columns:
    for cell in col:
        print(cell.value,end=" ")
    print()
#Get data of specified units
data1=sheetname.cell_value(0,0)
data2=sheetname.cell(0,1).value
print(data1,data2)


# Insert a column
sheetname.insert_cols(1)
for index,row in enumerate(sheetname.rows):
    if index==0:
        row[0].value="number"
    else:
        row[0].value=index

#Can insert list data
data=[10,"llc",'whoami','hellowordl']
sheetname.append(data)
#preservation
workbook.save(filename=excelDir)
print('finish')

Batch operation data

Because the data format is the same, but the data filled in is different, you can use the basic operations of docx and xlrd on the table to realize the batch processing of data. First, use xlrd to read the data into the list, and then fill the data of the list into the specified table.

Here is only the basic operation of word and excel, which can complete the current needs. In addition, more operations such as setting cell format and data format need to be learned.

Tags: Python Excel

Posted on Sun, 17 May 2020 11:56:15 -0400 by quiksilver