Python basics 07: file operations

1 Introduction

1.1 text and binary files

Text file:
A text file stores ordinary character text and can be opened with a program such as Notepad. Python 3 strings are Unicode, and a text file object encodes them to bytes (and decodes them back) using a character encoding such as UTF-8 or GBK. Documents edited with Word are not text files.
Binary file:
Binary files store their content as raw bytes and cannot be read meaningfully in Notepad; software that understands the format is needed to decode them. Common examples are MP4 video files, MP3 audio files, JPG pictures and doc documents.

1.2 modules related to document operation

2 file operation

2.1 operation of text file

The steps are:
1 create file object
2 write data
3 close file object

2.1.1 create file object

open(filename[, mode])

The opening methods are as follows:

Creation of text file objects and binary file objects:
If the mode does not contain "b", open() creates a text file object, whose basic unit of processing is the character. With "b" in the mode, it creates a binary file object, whose basic unit of processing is the byte.
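To make the character/byte distinction concrete, here is a minimal sketch. The scratch file name and the sample string are assumptions chosen only for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "mode_demo.txt")

# Text mode: write() takes a str and encodes it for us.
with open(path, "w", encoding="utf-8") as f:
    f.write("héllo")

# Text mode: read() returns a str, counted in characters.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: read() returns bytes, counted in bytes.
with open(path, "rb") as f:
    raw = f.read()

print(type(text).__name__, len(text))  # str 5
print(type(raw).__name__, len(raw))    # bytes 6 ("é" takes two bytes in UTF-8)
```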

2.1.2 write data

Relationship between common codes:

Garbled Chinese text:
On Chinese-language Windows the default encoding is GBK, while on Linux it is usually UTF-8. If open() is called without an encoding argument, Python uses the platform default, so a UTF-8 file read on such a Windows system is decoded as GBK and comes out garbled.

The problem is solved by specifying the file encoding explicitly:

f = open(r"b.txt","w",encoding="utf-8")
f.write("Shang Xuetang\n Baizhan programmer\n")
f.close()
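The garbling can be reproduced without touching any file, by encoding a string with one codec and decoding it with another. This is only a sketch; the sample string is an assumption.

```python
text = "中文"

utf8_bytes = text.encode("utf-8")  # three bytes per character for this string
gbk_bytes = text.encode("gbk")     # two bytes per character for this string
print(len(utf8_bytes), len(gbk_bytes))  # 6 4

# Decoding with the codec that was used for encoding round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes as GBK gives wrong characters, or raises for some inputs.
try:
    garbled = utf8_bytes.decode("gbk")
except UnicodeDecodeError:
    garbled = None

print(round_trip == text)  # True
print(garbled == text)     # False
```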

write()/writelines() write data:
write(a): writes the string a to the file
writelines(b): writes a list of strings to the file without adding line breaks

s = ["Gao Qi\n","Gao Laosan\n","Gao Laosi\n"]
f = open(r"b.txt","w",encoding="utf-8")
f.writelines(s)
f.close()

2.1.3 close() closes the file stream

The underlying file is managed by the operating system, so a file object we open should be closed explicitly with close(). When close() is called, any buffered data is first written to the file (flush() does this on its own without closing), and then the file is closed and its resources are released.

To make sure an open file object is always closed, close() is usually combined with the finally clause of the exception mechanism or with the with keyword, so that the file is closed whatever happens.

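finally close:
f = None
try:
    f = open(r'my.txt','a')
    text = 'xxx' #Avoid the name str, which shadows the built-in type
    f.write(text)
except Exception as e:
    print(e)
finally:
    if f is not None: #open() may have failed, leaving nothing to close
        f.close()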

The with keyword (context manager) manages resources automatically. However the with block is left, whether normally or through an exception, the file is closed correctly.

with close
s = ["Gao Qi\n","Gao Laosan\n","Gao Laowu\n"]
with open(r"d:\bb.txt","w") as f:
	f.writelines(s)
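A small sketch of that guarantee: the file is closed even when the with block is left through an exception. The scratch path and the simulated error are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("partial")
        raise ValueError("simulated failure inside the with block")
except ValueError:
    pass

# The name f is still bound after the block, but the file is already closed.
closed_after_error = f.closed
print(closed_after_error)  # True
```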

2.1.4 file reading

There are generally three methods:
1 read([size])
Reads size characters from the file and returns them. If size is omitted, the entire file is read. At the end of the file it returns an empty string.
2 readline()
Reads one line and returns it. At the end of the file it returns an empty string.
3 readlines()
Reads the text file into a list in which each line is one string, and returns the list.
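The three methods can be compared side by side on one small file. This is a sketch; the scratch file and its three-line content are assumptions.

```python
import os
import tempfile

# Hypothetical scratch file with three short lines.
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

with open(path, "r", encoding="utf-8") as f:
    first_two = f.read(2)        # the first two characters
    rest_of_line = f.readline()  # the remainder of the current line
    remaining = f.readlines()    # every line that is left, as a list
    at_eof = f.read()            # nothing left, so an empty string

print(repr(first_two))     # 'on'
print(repr(rest_of_line))  # 'e\n'
print(remaining)           # ['two\n', 'three\n']
print(repr(at_eof))        # ''
```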

[Operation] the file is small, and the file content is read into the program at one time
with open(r"d:\bb.txt","r") as f:
	print(f.read())
[Operation] read a file by line
with open(r"bb.txt","r") as f:
	while True:
		fragment = f.readline()
		if not fragment: #If it is empty, the loop will jump out
			break
		else:
			print(fragment,end="")
[Operation] use the iterator (return one line at a time) to read the text file
with open(r"d:\bb.txt","r") as f:
	for a in f:
		print(a,end="")

2.1.5 exercise: add a line number to the end of each line of the text file

My approach:
with open(r'123.txt','r+') as f:
    txt = []
    for i in f:
        txt.append(i)
        print(i,end='')
for i in range(len(txt)):
    txt[i] = txt[i][:-1]+' #'+ str(i) +txt[i][-1:]
with open(r'123.txt','w') as f:
    f.writelines(txt)

Reference answer:

with open(r'123.txt','r') as f:
    lines = f.readlines()
#rstrip() removes trailing whitespace, including the newline
lines = [line.rstrip() + ' #' + str(index) + '\n' for index,line in enumerate(lines,start=1)]
with open(r'123.txt','w') as f:
    f.writelines(lines)

2.2 binary file operation

f = open(r'a.jpg','wb') #Writable binary object; overwrites existing content
f = open(r"a.jpg", 'ab') #Writable, append mode binary object
f = open(r"a.jpg", 'rb') #Readable binary object

copy picture

with open('a.gif','rb') as f:
	with open('b.gif','wb') as w:
		for line in f.readlines():
			w.write(line)
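Binary data has no real notion of lines, so for large files an alternative is to copy in fixed-size chunks. The sketch below assumes a hypothetical helper name copy_binary, an arbitrary chunk size, and generated test data in place of a real picture.

```python
import os
import tempfile

def copy_binary(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in chunks so the whole file never sits in memory."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            fout.write(chunk)

# Hypothetical source file filled with generated bytes.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "a.bin")
dst = os.path.join(workdir, "b.bin")
with open(src, "wb") as f:
    f.write(bytes(range(256)) * 1000)

copy_binary(src, dst)

with open(src, "rb") as f1, open(dst, "rb") as f2:
    same = f1.read() == f2.read()
print(same)  # True
```

The standard library's shutil.copyfile does the same job in one call; the loop is shown only to make the byte-level reading visible.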

2.3 common attributes and methods of file objects

Attribute    Description
name         Returns the name of the file
mode         Returns the open mode of the file
closed       Returns True if the file is closed

Method                    Description
read([size])              Reads size bytes or characters from the file and returns them. If size is omitted, reads to the end of the file, that is, the whole file at once
readline()                Reads one line from a text file
readlines()               Treats each line of the text file as a separate string and returns the strings in a list
write(str)                Writes the string str to the file
writelines(s)             Writes the list of strings s to the file without adding line breaks
seek(offset[, whence])    Moves the file pointer to a new position. offset is the number of bytes to move relative to whence: positive moves toward the end of the file, negative toward the start. whence is 0 to count from the start of the file (default), 1 from the current position, 2 from the end of the file
tell()                    Returns the current position of the file pointer
truncate([size])          Cuts the file down to its first size bytes and deletes the rest, wherever the pointer is; if size is omitted, the file is cut at the current pointer position
flush()                   Writes the contents of the buffer to the file without closing the file
close()                   Writes the contents of the buffer to the file, closes the file and releases the resources held by the file object

with open("e.txt","r",encoding="utf-8") as f:
	print("The file name is:{0}".format(f.name))
	print(f.tell())
	print("Read content:{0}".format(str(f.readline())))
	print(f.tell())
	f.seek(0,0)
	print("Read content:{0}".format(str(f.readline())))

Note: when a text file is opened without the b mode option, positions may only be calculated relative to the start of the file. Seeking relative to the end, such as seek(-2,2), raises an exception with the message "can't do nonzero end-relative seeks". To seek from the end, open the file in binary mode, for example 'rb' instead of 'r+'.
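The difference between the two modes can be checked with a short sketch. The scratch file and its ten-byte content are assumptions for illustration.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding ten ASCII digits.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")
with open(path, "wb") as f:
    f.write(b"0123456789")

# Binary mode: all three whence values are allowed.
with open(path, "rb") as f:
    f.seek(-2, os.SEEK_END)  # two bytes before the end
    tail = f.read()
    f.seek(3)                # whence defaults to the start of the file
    f.seek(2, os.SEEK_CUR)   # two bytes forward from the current position
    pos = f.tell()

# Text mode: a nonzero seek relative to the end is refused.
try:
    with open(path, "r", encoding="ascii") as f:
        f.seek(-2, os.SEEK_END)
    text_seek_failed = False
except io.UnsupportedOperation:
    text_seek_failed = True

print(tail, pos, text_seek_failed)  # b'89' 5 True
```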

2.4 using pickle serialization

In Python everything is an object, and an object is essentially a block of memory that stores data. Sometimes we need to save that block of data to disk or send it to another computer over the network. This is where serialization and deserialization of objects come in. Object serialization is widely used in distributed and parallel systems.
Serialization means converting an object into a serialized form of data that can be stored on disk or transmitted over the network. Deserialization is the reverse process: converting serialized data that has been read back into an object.
The functions in the pickle module implement both.

pickle.dump(obj, file): obj is the object to serialize, file is the file to store it in
pickle.load(file): reads data from file and deserializes it into an object

import pickle
#Serialize objects into a file
with open(r'234.dat','wb') as f:
    a1 = 'x'
    a2 = 234
    a3 = [20,30]
    pickle.dump(a1,f)
    pickle.dump(a2,f)
    pickle.dump(a3,f)
#Deserialize the stored data back into objects, in the order they were written
with open(r'234.dat','rb') as f:
    a1 = pickle.load(f)
    a2 = pickle.load(f)
    a3 = pickle.load(f)
    print(a1)
    print(a2)
    print(a3)
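For sending an object over a network rather than to a file, pickle also has in-memory variants, dumps() and loads(), which work on bytes. A minimal sketch, assuming a made-up sample record:

```python
import pickle

# Hypothetical sample object for illustration.
record = {"name": "x", "count": 234, "scores": [20, 30]}

blob = pickle.dumps(record)    # serialize to a bytes object
restored = pickle.loads(blob)  # rebuild an equal object from those bytes

print(type(blob).__name__)  # bytes
print(restored == record)   # True: same content
print(restored is record)   # False: a new, separate object
```

Unpickling can execute arbitrary code, so pickle.load() and pickle.loads() should only be used on data from a trusted source.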

2.5 csv files

read

import csv
with open(r"d:\a.csv",newline="") as a:
    a_csv = csv.reader(a) #Create a reader object that yields each row as a list of strings
    headers = next(a_csv) #Get the header row as a list
    print(headers)
    for row in a_csv: #Loop over the remaining rows
        print(row)

write

import csv
headers = ["Employee ID","Name","Age","Address","Monthly salary"]
rows = [("1001","Gao Qi",18,"Xisanqi No. 1 hospital","50000"),("1002","Gao Ba",19,"Xisanqi No. 1 hospital","30000")]
with open(r"d:\b.csv","w",newline="") as b: #newline="" avoids blank lines between rows on Windows
    b_csv = csv.writer(b) #Create a csv writer object
    b_csv.writerow(headers) #Write one row (header)
    b_csv.writerows(rows) #Write multiple rows (data)
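One detail worth seeing in a round trip is that csv.reader returns every field as a string, even if a number was written. The sketch below uses a temporary file and shortened sample rows, both assumptions for illustration.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and sample data.
path = os.path.join(tempfile.mkdtemp(), "people.csv")
headers = ["id", "name", "age"]
rows = [["1001", "Gao Qi", 18], ["1002", "Gao Ba", 19]]

with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)

with open(path, newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    read_headers = next(reader)  # the first row
    read_rows = list(reader)     # the remaining rows

print(read_headers)  # ['id', 'name', 'age']
print(read_rows)     # [['1001', 'Gao Qi', '18'], ['1002', 'Gao Ba', '19']]
```

The ages come back as '18' and '19', so numeric columns need an explicit int() or float() conversion after reading.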

3 OS module

The os module lets us work with the operating system directly: we can call the operating system's executable files and commands, and operate directly on files and directories.

3.1 calling operating system commands

os.system lets us call system commands directly.
[operation] os.system calls Windows Notepad

import os
os.system('notepad.exe')

[operation] os.system calls ping command in windows system

import os
os.system("ping www.baidu.com")

[operation] run the installed WeChat (os.startfile is available on Windows only)

import os
os.startfile(r"C:\Program Files (x86)\Tencent\WeChat\WeChat.exe")

3.2 documents and directories

Common file operation methods of os module:

Operation method of directory:
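As a rough illustration of the kind of file and directory calls the os module offers, here is a sketch run inside a temporary directory. The particular functions shown (os.makedirs, os.rename, os.listdir, os.remove, os.rmdir) are my own selection, not a list taken from this section, and the file and folder names are hypothetical.

```python
import os
import tempfile

base = tempfile.mkdtemp()

# Create nested directories in one call.
nested = os.path.join(base, "a", "b")
os.makedirs(nested)

# Create a file, then rename it.
old = os.path.join(nested, "old.txt")
new = os.path.join(nested, "new.txt")
with open(old, "w", encoding="utf-8") as f:
    f.write("data")
os.rename(old, new)

listing = os.listdir(nested)  # names of the entries in the directory
print(listing)                # ['new.txt']

# Delete the file, then the now-empty directory.
os.remove(new)
os.rmdir(nested)
exists_after = os.path.exists(nested)
print(exists_after)           # False
```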

3.3 os.path module

It provides path-related operations (path tests, path splitting, path joining, directory traversal)

Method                Description
isabs(path)           Determines whether the path is an absolute path
isdir(path)           Determines whether the path is a directory
isfile(path)          Determines whether the path is a file
exists(path)          Determines whether the specified path exists
getsize(filename)     Returns the size of the file in bytes
abspath(path)         Returns the absolute path
dirname(path)         Returns the directory part of the path
getatime(filename)    Returns the last access time of the file
getmtime(filename)    Returns the last modification time of the file
walk(top,func,arg)    Traverses directories recursively (Python 2 only; in Python 3 use os.walk(top) instead)
join(path,*paths)     Joins multiple path components
split(path)           Splits the path and returns a (head, tail) tuple
splitext(path)        Splits the file extension from the path and returns a (root, extension) tuple
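Several of these functions can be tried together on a throwaway file. This is a sketch; the temporary directory and the file name report.txt are assumptions.

```python
import os
import tempfile

# Hypothetical scratch directory containing one small file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "report.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello")

print(os.path.isabs(path))    # True
print(os.path.isfile(path))   # True
print(os.path.isdir(workdir)) # True
print(os.path.getsize(path))  # 5

head, tail = os.path.split(path)    # directory part and final component
root, ext = os.path.splitext(tail)  # name without extension, and extension
print(tail, root, ext)              # report.txt report .txt

# Recursive traversal in Python 3 uses os.walk rather than os.path.walk.
found = []
for dirpath, dirnames, filenames in os.walk(workdir):
    for name in filenames:
        found.append(os.path.join(dirpath, name))
print(found == [path])  # True
```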

3.4 shutil module (copy and compression)

The shutil module in the Python standard library is mainly used to copy, move and delete files and folders; it can also compress and decompress them.
The os module provides general operations on files and directories. The shutil module supplements it with moving, copying, compressing and decompressing, which the os module does not provide.
[operation] copy files

import shutil
#copy file content
shutil.copyfile('1.txt','1_copy.txt')

[operation] copy folder contents recursively (using shutil module)

import shutil
#The destination folder "music" must not exist yet
shutil.copytree("film/study","music",ignore=shutil.ignore_patterns("*.html","*.htm"))

[operation] compress all contents of the folder (using the shutil module)

import shutil
import zipfile
#Compress everything under the "movie/learning" folder into movie.zip inside the "music 2" folder
shutil.make_archive("music 2/movie","zip","movie/learning")
#Compress: pack several specified files into one zip file
z = zipfile.ZipFile("a.zip","w")
z.write("1.txt")
z.write("2.txt")
z.close()

[operation] decompress the archive to the specified folder (using the zipfile module)

import zipfile
#Decompression:
z2 = zipfile.ZipFile("a.zip","r")
z2.extractall("d:/") #Set the extraction folder
z2.close()
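Both directions can also be done with shutil alone, using make_archive to compress and unpack_archive to extract. The sketch below runs in a temporary directory; the folder names src and out and the archive name backup are assumptions.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()

# Hypothetical source folder containing a single file.
src_dir = os.path.join(workdir, "src")
os.makedirs(src_dir)
with open(os.path.join(src_dir, "1.txt"), "w", encoding="utf-8") as f:
    f.write("hello")

# make_archive appends the extension itself and returns the archive path.
archive = shutil.make_archive(os.path.join(workdir, "backup"), "zip", src_dir)

# unpack_archive picks the format from the file extension.
out_dir = os.path.join(workdir, "out")
shutil.unpack_archive(archive, out_dir)

extracted = os.listdir(out_dir)
print(os.path.basename(archive))  # backup.zip
print(extracted)                  # ['1.txt']
```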

Tags: Python

Posted on Fri, 08 Oct 2021 20:42:28 -0400 by mayanktalwar1988