Article catalog
- character string
- Basic characteristics
- code
- Representation (creation)
- Escape character
- String splicing
- String copy
- Common operations
- 1. Get length
- 2. Search content
- 3. Judgment
- 4. Calculate the number of occurrences
- 5. Replacement
- 6. Cut string
- 7. Modify case
- 8. Space handling
- 9. String splicing
- 10. Encryption and decryption (mapping replacement)
- 11. Fill 0 before string
- slice
- String resident mechanism
- Variable string
character string
Basic characteristics
A string is a sequence of characters. Python strings are immutable.
Python has no separate single-character type; a single character is simply a string of length 1.
code
- Python 3 strings are Unicode, so they can represent characters from any written language in the world. Each character is a Unicode code point (not a fixed 16-bit unit), and ASCII is a subset of Unicode.
- Use the built-in function ord() to get the encoding for the character.
- Use the built-in function chr() to get the corresponding characters according to the encoding.
>>> ord('马')
39532
>>> chr(39532)
'马'
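As a small illustration of the two functions above, the sketch below round-trips a few characters through ord() and chr(). The sample characters are arbitrary choices, not taken from the original article.

```python
# Round-trip a few sample characters through ord() and chr().
samples = ['A', '€', '马', '😀']

code_points = [ord(ch) for ch in samples]
print(code_points)  # [65, 8364, 39532, 128512]

# chr() reverses ord() for every valid code point.
restored = [chr(cp) for cp in code_points]
print(restored == samples)  # True

# A character above U+FFFF still counts as one character in Python 3.
print(len('😀'))  # 1
```

The last line shows that string length is measured in code points, not in 16-bit units.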
Representation (creation)
a = "I'm Tom"  # a pair of double quotes
b = 'Tom said:"I am Tom"'  # a pair of single quotes
c = '''Tom said:"I'm Tom"'''  # a pair of triple single quotes
d = """Tom said:"I'm Tom" """  # a pair of triple double quotes
Escape character
Use \ to introduce an escape sequence. A \ at the end of a line acts as a line-continuation character.
- \n for a newline
- \t for a tab
- \' for a literal single quotation mark
- \" for a literal double quotation mark
- \\ for a literal backslash
- \r for a carriage return
- \b for a backspace
Note: in Python, prefix the string with r to create a raw string, in which backslashes are not treated as escapes
k = r'good mor\ning'
print(k)  # good mor\ning
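To make the difference between an ordinary and a raw string concrete, here is a minimal sketch. The path-like text is a made-up example.

```python
# In an ordinary string, \n and \t are each interpreted as one character.
escaped = 'C:\new\table.txt'
# In a raw string, every backslash is kept literally.
raw = r'C:\new\table.txt'

print(len(escaped))  # 14
print(len(raw))      # 16

# The raw string is equal to the version with doubled backslashes.
print(raw == 'C:\\new\\table.txt')  # True
```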
String splicing
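- str + str
- Adjacent string literals separated by whitespace
Note: both forms produce a new string object. The second form works only with literals, not with variables.
>>> 'a' + 'b'
'ab'
>>> 'c' 'd'
'cd'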
String copy
- str * int
>>> 'jack' * 3
'jackjackjack'
Common operations
1. Get length
- len
The built-in len() function returns the length of a string
mystr = "It's a fine day today. It's beautiful everywhere"
print(len(mystr))  # 48, the number of characters in the string
2. Search content
- find
Returns the index of the first occurrence of the searched content in the string, or -1 if it does not occur
S.find(sub[, start[, end]]) -> int
- rfind
Similar to find(), but searches from the right and returns the index of the last occurrence
str1 = 'hello'
print(str1.rfind('l'))  # 3
- index
Works like find(), except that find() returns -1 when the content is not found, whereas index() raises a ValueError
- rindex
Similar to index(), but searches from the right
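The difference between find() and index() on a missing substring can be sketched as follows. The helper name locate is hypothetical and exists only for this illustration.

```python
def locate(text, sub):
    """Hypothetical helper: index of sub in text, or None if it is absent."""
    pos = text.find(sub)
    return None if pos == -1 else pos

text = 'hello world'
print(text.find('o'))     # 4
print(text.rfind('o'))    # 7
print(text.find('o', 5))  # 7, the search starts at index 5
print(text.find('xyz'))   # -1

# index() signals a miss with an exception instead of -1.
try:
    text.index('xyz')
except ValueError as exc:
    print('index() raised ValueError:', exc)

print(locate(text, 'xyz'))  # None
```

Checking for -1 explicitly matters because -1 is also a valid index (the last character), so using an unchecked find() result for indexing can silently pick the wrong character.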
3. Judgment
- startswith
Determines whether the string starts with the specified content
S.startswith(prefix[, start[, end]]) -> bool
print('hello'.startswith('he'))  # True
- endswith
Determines whether the string ends with the specified content
S.endswith(suffix[, start[, end]]) -> bool
print('hello'.endswith('o'))  # True
- isalpha
Determines whether the string consists only of letters
mystr = 'hello world'
print(mystr.isalpha())  # False, because there is a space in the middle
- isdigit
Determines whether the string consists only of digits. Any other character, such as a decimal point or a minus sign, makes the result False
print('good'.isdigit())  # False
print('123'.isdigit())  # True
print('3.14'.isdigit())  # False
- isalnum
Determines whether the string consists only of letters and digits. Returns False as soon as any other character appears
print('hello123'.isalnum())  # True
print('hello'.isalnum())  # True
- isspace
Returns True if the string is non-empty and contains only whitespace characters (spaces, tabs, line breaks), False otherwise
print(' '.isspace())  # True
- isascii
Returns True if all characters in the string are ASCII; otherwise returns False. Available since Python 3.7.
It is a string method, so calling it on a non-string object such as an int raises an AttributeError
a = 'a'
print(a.isascii())  # True
- isupper
Returns True if all the letters in the string are uppercase; otherwise returns False
b = 'Hello'
print(b.isupper())  # False
c = 'LUCY'
print(c.isupper())  # True
- islower
Returns True if all the letters in the string are lowercase; otherwise returns False
b = 'Hello'
print(b.islower())  # False
c = 'lucy'
print(c.islower())  # True
- isnumeric
Checks whether a string consists only of numeric characters
The numeric characters here include Arabic digits, Roman numerals, and Chinese numerals (simplified and traditional)
In Python 3 every str is Unicode, so the u prefix is optional and is accepted only for compatibility with Python 2
str1 = u"一壹①②③⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ❶❷❸❺❺❻❼❽❾❿2009"
print(str1.isnumeric())  # True
- isprintable
Determines whether the string is printable. Returns True if all characters are printable; otherwise returns False
Non-printable characters include carriage return, line feed, and tab
str1 = 'abc'
print(str1.isprintable())  # True
str2 = 'abc\tdef'
print(str2.isprintable())  # False
- istitle
Determines whether the string is title-cased: each word starts with an uppercase letter and its remaining letters are lowercase
str1 = 'LuCy'
print(str1.istitle())  # False
str2 = 'Lucy Ha'
print(str2.istitle())  # True
- isidentifier
Determines whether a string is a valid Python identifier
str1 = 'True'
print(str1.isidentifier())  # True: isidentifier() does not detect reserved keywords
str2 = '3abc'
print(str2.isidentifier())  # False
str3 = 'username'
print(str3.isidentifier())  # True
A string is considered a valid identifier if it contains only letters, digits, and underscores, and does not begin with a digit or contain any spaces. Reserved words can be checked separately with keyword.iskeyword()
- isdecimal
Returns True if the string contains only decimal characters, False otherwise
str1 = u"this2009"
print(str1.isdecimal())  # False
str2 = u"23443434"
print(str2.isdecimal())  # True
In Python 3 every str is Unicode, so this method is available on all strings and the u prefix is optional
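The three numeric checks in this list (isdecimal, isdigit, isnumeric) are easy to confuse, and isidentifier() does not reject keywords. The sketch below compares them on a few sample strings chosen for illustration; is_valid_name is a hypothetical helper, not part of the standard library.

```python
import keyword

# '\u00b2' is SUPERSCRIPT TWO, '\u00bd' is VULGAR FRACTION ONE HALF.
samples = ['123', '\u00b2', '\u00bd', '3.14', '-5']

# For each sample: (isdecimal, isdigit, isnumeric)
checks = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}
for s, result in checks.items():
    print(repr(s), result)
# '123'  (True, True, True)
# '²'    (False, True, True)
# '½'    (False, False, True)
# '3.14' (False, False, False)
# '-5'   (False, False, False)

def is_valid_name(name):
    """Hypothetical helper: an identifier that is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_name('True'))      # False, True is a keyword
print(is_valid_name('username'))  # True
print(is_valid_name('3abc'))      # False
```

None of the three checks accepts a sign or a decimal point, so they are not a substitute for trying int() or float() when validating numeric input.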
4. Calculate the number of occurrences
- count
Returns the number of times the queried string appears in the original string between start and end
S.count(sub[, start[, end]]) -> int
str1 = 'hello'
print(str1.count('l'))  # 2
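A short sketch of how the optional start and end arguments narrow the search, and of the fact that count() counts non-overlapping occurrences. The sample strings are arbitrary.

```python
text = 'hello world'

print(text.count('o'))        # 2
print(text.count('o', 5))     # 1, only ' world' is searched
print(text.count('l', 0, 4))  # 2, only 'hell' is searched

# Occurrences are counted without overlap.
print('aaaa'.count('aa'))     # 2
```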
5. Replacement
- replace
Replaces the specified content in the string. If count is given, at most count replacements are made
S.replace(old, new[, count]) -> str
msg = "he's awesome, he's showy, he's handsome"
msg1 = msg.replace('he', 'lucy')  # replaces all occurrences by default
msg2 = msg.replace('he', 'lucy', 2)  # from left to right, replaces the first two
print(msg1)  # lucy's awesome, lucy's showy, lucy's handsome
print(msg2)  # lucy's awesome, lucy's showy, he's handsome
If the content to be replaced is not in the string, no error is reported
s1 = 'abcd'
print(s1.replace('lucy', 'cc'))  # abcd
Throughout this process a new string is created; the original string does not change.
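The point that replace() returns a new string and leaves the original untouched can be checked directly. This is a minimal sketch with made-up sample text.

```python
original = 'a-b-c-d'

# Replace only the first two separators.
changed = original.replace('-', '+', 2)

print(changed)              # a+b+c-d
print(original)             # a-b-c-d, unchanged
print(changed is original)  # False, a different object

# To "modify" a string variable, rebind the name to the result.
original = original.replace('-', '')
print(original)             # abcd
```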
6. Cut string
- split
Cuts a string into a list
The default maximum number of splits is -1, which means unlimited and can be omitted; you can also specify the maximum number of splits yourself
x = 'zhangsan-hahaha-tom-tony-lucy'
y = x.split('-', -1)
z = x.rsplit('-')
print(y)  # ['zhangsan', 'hahaha', 'tom', 'tony', 'lucy']
print(z)  # ['zhangsan', 'hahaha', 'tom', 'tony', 'lucy']
x = 'zhangsan-hahaha-tom-tony-lucy'
print(x.split('-', 2))  # ['zhangsan', 'hahaha', 'tom-tony-lucy']
x = '-hahaha-tom-tony-lucy'
y = x.split('-')
print(y)  # ['', 'hahaha', 'tom', 'tony', 'lucy']
When no separator is given, the string is split on runs of whitespace (spaces, line breaks, and tabs)
s = 'my name is lucy'
s1 = s.split()
print(s1)  # ['my', 'name', 'is', 'lucy']
- rsplit
Used in basically the same way as split, but splits from right to left
x = 'zhangsan-hahaha-tom-tony-lucy'
print(x.rsplit('-', 2))  # ['zhangsan-hahaha-tom', 'tony', 'lucy']
- splitlines
Splits at line boundaries and returns a list with the lines as elements
str1 = 'hello\nworld'
print(str1.splitlines())  # ['hello', 'world']
- partition
Uses the given string as a separator and divides the original string into three parts: the part before the separator, the separator itself, and the part after the separator. These three parts make up a tuple
print('agdaXhhXhjjs'.partition('X'))  # ('agda', 'X', 'hhXhjjs')
- rpartition
Similar to partition(), but searches for the separator from the right
print('agdaXhhXhjjs'.rpartition('X'))  # ('agdaXhh', 'X', 'hjjs')
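Two details of the methods in this section are worth seeing side by side: split() with no argument behaves differently from split(' '), and partition() is convenient for splitting on the first separator only. The sketch below uses a hypothetical parse_setting helper and made-up input lines.

```python
# split() collapses runs of whitespace; split(' ') does not.
spaced = 'a  b'
print(spaced.split())     # ['a', 'b']
print(spaced.split(' '))  # ['a', '', 'b']

def parse_setting(line):
    """Hypothetical helper: split 'key=value' on the first '=' only."""
    key, sep, value = line.partition('=')
    if not sep:           # separator missing: partition returns (line, '', '')
        return None
    return key.strip(), value.strip()

print(parse_setting('name = lucy'))   # ('name', 'lucy')
print(parse_setting('query=a=b'))     # ('query', 'a=b')
print(parse_setting('no separator'))  # None

# rpartition() splits on the last separator, e.g. to take a file extension.
stem, _, ext = 'archive.tar.gz'.rpartition('.')
print(stem, ext)  # archive.tar gz
```

Because partition() always returns exactly three parts, it can be unpacked without first checking how many pieces a split produced.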
7. Modify case
- capitalize
Uppercases the first character of the string and lowercases the rest
mystr = 'hello world'
print(mystr.capitalize())  # Hello world
- title
Capitalizes the first letter of each word
mystr = 'hello world'
print(mystr.title())  # Hello World
- lower
Converts all letters to lowercase
mystr = 'hElLo WorLD'
print(mystr.lower())  # hello world
- upper
Converts all letters to uppercase
mystr = 'hello world'
print(mystr.upper())  # HELLO WORLD
- casefold()
Converts the string to lowercase, more aggressively than lower(); intended for caseless comparison
s1 = 'I Love Python'
print(s1.casefold())  # i love python
- swapcase()
Converts the lowercase letters of the string to uppercase and the uppercase letters to lowercase
s1 = 'I Love Python'
print(s1.swapcase())  # i lOVE pYTHON
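For plain English text lower() and casefold() give the same result, so the difference is easy to miss. The sketch below uses the German letter ß, a standard example, to show where they diverge, and also shows that capitalize() lowercases the rest of the string.

```python
# capitalize() and title() also lowercase the remaining letters.
print('hELLO wORLD'.capitalize())  # Hello world
print('hELLO wORLD'.title())       # Hello World

a = 'Straße'
b = 'STRASSE'

# lower() keeps 'ß', so the two spellings do not compare equal.
print(a.lower() == b.lower())        # False
# casefold() maps 'ß' to 'ss', so a caseless comparison succeeds.
print(a.casefold() == b.casefold())  # True
```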
8. Space handling
- ljust
Returns a string of the specified length, left-justified and padded with whitespace on the right
s = 'hello'
print(s.ljust(10))  # hello, followed by five spaces on the right
If the string is already longer than the specified length, it is returned unchanged
A fill character can be specified; the default is a space
print('lucy'.ljust(10, '+'))  # lucy++++++
- rjust
Returns a string of the specified length, right-justified and padded with whitespace on the left
s = 'hello'
print(s.rjust(10))  # hello, preceded by five spaces on the left
A fill character can be specified; the default is a space
- center
Returns a string of the specified length, centered and padded with whitespace at both ends
s = 'hello'
print(s.center(10))  # spaces are added at both ends of hello to center the content
A fill character can be specified; the default is a space
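A typical use of the three padding methods is lining up columns of text. The following is a minimal sketch; the row data and column widths are invented for the example.

```python
rows = [('apple', 3), ('kiwi', 12), ('watermelon', 7)]

# Heading centered over a 16-character line, filled with '='.
print('stock'.center(16, '='))  # =====stock======

lines = []
for name, qty in rows:
    # Name padded on the right with dots, quantity right-aligned in 4 columns.
    lines.append(name.ljust(12, '.') + str(qty).rjust(4))

for line in lines:
    print(line)
# apple.......   3
# kiwi........  12
# watermelon..   7
```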
Removing leading and trailing whitespace, including spaces, tabs, and line breaks
- lstrip
Removes whitespace from the left side of the string
mystr = ' he llo '
print(mystr.lstrip())  # 'he llo ': only the left space is removed; the middle and right spaces are kept
- rstrip
Removes whitespace from the right side of the string
mystr = ' he llo '
print(mystr.rstrip())  # ' he llo': only the right space is removed
- strip
Removes whitespace from both sides of the string
mystr = ' he llo '
print(mystr.strip())  # 'he llo'
A set of characters to remove can be specified
s = 'fgk too k white ser'
s1 = s.strip('fkgres')  # strips any of these characters from both ends
print(repr(s1))  # ' too k white ': stripping stops at the spaces, which are not in the set
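The argument to strip(), lstrip(), and rstrip() is a set of characters, not a prefix or suffix. The sketch below, with made-up sample strings, shows the expected case and a common surprise.

```python
# Characters from the set are removed from both ends until a
# character outside the set is reached.
stripped = 'www.example.com'.strip('w.moc')
print(stripped)  # example

# Surprise: 'test_' is treated as the set {t, e, s, _}, not as a prefix,
# so every leading character from that set is removed.
surprise = 'tests_test.py'.lstrip('test_')
print(surprise)  # .py

# Whitespace inside the string is never touched.
print(repr('  a b  '.strip()))  # 'a b'
```

On Python 3.9 and later, str.removeprefix() and str.removesuffix() remove an exact prefix or suffix instead.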
- expandtabs()
Replaces each tab character ('\t') in the string with spaces up to the next tab stop. The default tab size is 8
s1 = 'a\tbcd'
print(s1)  # a, a tab, then bcd
print(repr(s1.expandtabs()))  # 'a       bcd' (a, then 7 spaces)
print(repr(s1.expandtabs(tabsize=4)))  # 'a   bcd' (a, then 3 spaces)
print(repr(s1.expandtabs(tabsize=0)))  # 'abcd'
9. String splicing
- join
S.join(iterable)
s = 'lucy'
s1 = '+'.join(s)
print(s1)  # l+u+c+y
print('+'.join({'name': 'lucy', 'age': 18}))  # name+age (iterating a dict yields its keys)
Function: quickly converts a list or tuple into a string, with the elements separated by the specified string
Premise: the elements of the list or tuple must all be of type str
l1 = ['my', 'name', 'is', 'lucy']
s1 = ' '.join(l1)
print(s1)  # my name is lucy
This method is recommended for joining many strings. It is more efficient than repeated str + str, because join builds the result string only once, whereas each + creates a new intermediate string object.
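The requirement that every element be a str is a frequent source of errors. The sketch below shows the failure and one common workaround; the sample values are arbitrary.

```python
values = ['lucy', 18, 3.5]

# join() refuses non-string elements.
try:
    ','.join(values)
except TypeError as exc:
    print('join failed:', exc)

# Convert each element to str first.
joined = ','.join(str(v) for v in values)
print(joined)  # lucy,18,3.5

# Collecting pieces in a list and joining once avoids creating
# an intermediate string on every loop iteration.
pieces = []
for i in range(5):
    pieces.append(str(i * i))
squares = ' '.join(pieces)
print(squares)  # 0 1 4 9 16
```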
10. Encryption and decryption (mapping replacement)
- maketrans
Creates a translation table for character mapping.
str.maketrans(intab, outtab, delchars)
intab -- a string of the characters to be replaced.
outtab -- a string of the corresponding replacement characters.
delchars -- optional; each character in this string is mapped to None, that is, deleted.
intab and outtab are strings and must be the same length
- translate
Converts the characters in the string according to the translation table produced by maketrans().
Note: deletion takes precedence. A character listed in delchars is removed even if it also appears in intab
in_str = 'afcxyo'
out_str = '123456'
# maketrans() builds the translation table; it is called on str
# map_table is a dict
map_table = str.maketrans(in_str, out_str)
# Use translate() for the conversion
my_love = 'I love fairy'
new_my_love = my_love.translate(map_table)
print(new_my_love)  # I l6ve 21ir5
in_str = 'afcxyo'
out_str = '123456'
# The third argument lists characters to delete
map_table = str.maketrans(in_str, out_str, 'yo')
# Use translate() for the conversion
my_love = 'I love fairy'
new_my_love = my_love.translate(map_table)
print(new_my_love)  # I lve 21ir
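To connect the mapping tables above with the "encryption and decryption" in this section's title, here is a minimal sketch of a Caesar shift built on maketrans() and translate(). The Caesar shift is chosen here purely as an illustration and is not secure; make_tables is a hypothetical helper name.

```python
import string

def make_tables(shift):
    """Hypothetical helper: (encrypt, decrypt) tables for a Caesar shift."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    plain = lower + upper
    shifted = lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift]
    # Decryption is simply the inverse mapping.
    return str.maketrans(plain, shifted), str.maketrans(shifted, plain)

encrypt_table, decrypt_table = make_tables(3)

secret = 'I love Python'.translate(encrypt_table)
print(secret)                           # L oryh Sbwkrq
print(secret.translate(decrypt_table))  # I love Python
```

Characters that are not in the table, such as the spaces, pass through translate() unchanged.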
11. Fill 0 before string
- zfill()
Returns a string of the specified length. The original string is right-justified and padded with 0 on the left
a = 3
b = str(a).zfill(4)
print(b)  # 0003
- Use scenario
Numbers stored as strings do not sort the way we expect. For example, '11' sorts before '2', so when merging files named with numbers the merge order may be wrong. Padding the numbers with leading zeros so that they all have the same length solves the problem.
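The sorting problem described above can be reproduced in a few lines. The file names and the helper pad_name are made up for this sketch.

```python
names = ['2.txt', '11.txt', '1.txt']

# Plain string sorting compares character by character.
print(sorted(names))  # ['1.txt', '11.txt', '2.txt']

def pad_name(name, width=3):
    """Hypothetical helper: zero-pad the numeric part before the first dot."""
    stem, dot, ext = name.partition('.')
    return stem.zfill(width) + dot + ext

padded = sorted(pad_name(n) for n in names)
print(padded)  # ['001.txt', '002.txt', '011.txt']

# zfill() keeps a leading sign in front of the zeros.
print('-5'.zfill(4))  # -005
```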
slice
Slicing copies a specified section of a string and produces a new string.
m[start:end:step] includes the start index and excludes the end index
- step: the step size. The default is 1
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[2:9])  # cdefghi
- The step cannot be 0, otherwise an error is raised
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[2:9:0])  # ValueError: slice step cannot be zero
- When the step is negative, characters are taken from right to left
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[3:15:-1])  # empty string
print(m[15:3:-1])  # ponmlkgihgfe
- If start and end are negative numbers, the index counts from the right
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[-9:-5])  # rstu
- Reverse order
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[::-1])  # zyxwvutsrqponmlkgihgfedcba
- If only start is set, the slice runs to the end
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[2:])  # cdefghigklmnopqrstuvwxyz
- If only end is set, the slice runs from the beginning
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[:9])  # abcdefghi
- If start and end are outside the range [0, string length - 1], no error is raised; they are clamped to the ends of the string
m = 'abcdefghigklmnopqrstuvwxyz'
print(m[-100:-1])  # abcdefghigklmnopqrstuvwxy
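The slicing rules above can be collected into one short sketch. The ten-letter sample string and the helper is_palindrome are illustrative choices.

```python
s = 'abcdefghij'

print(s[2:5])     # cde, index 5 is excluded
print(s[::2])     # acegi, every second character
print(s[-3:])     # hij, the last three characters
print(s[5:100])   # fghij, an out-of-range end is clamped
print(s[8:2:-2])  # ige, right to left in steps of 2

def is_palindrome(text):
    """Hypothetical helper: True if text reads the same reversed."""
    return text == text[::-1]

print(is_palindrome('level'))   # True
print(is_palindrome('python'))  # False
```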
String resident mechanism
String interning (the "resident" mechanism): only one copy of each identical, immutable string is kept, and these single copies are stored in the interned-string pool.
Python supports string interning. In CPython, string literals that follow identifier rules (containing only underscores (_), letters, and digits) are interned automatically. The result for other strings is an implementation detail and may differ between Python versions and between the interactive prompt and a script.
>>> a = 'abc_123'
>>> b = 'abc_123'
>>> a is b
True
>>> c = 'abc#'
>>> d = 'abc#'
>>> c is d
False
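Since automatic interning depends on the interpreter, code that relies on identity should request it explicitly. The sketch below uses sys.intern() from the standard library; the strings are built at run time so that the compiler cannot merge them.

```python
import sys

# Build two equal strings at run time.
x = ''.join(['abc', '#', '123'])
y = ''.join(['abc', '#', '123'])

print(x == y)  # True, same value
# Whether "x is y" holds here is an implementation detail,
# so no expected output is shown for it.
print(x is y)

# sys.intern() returns the single pooled copy for a given value.
a = sys.intern(x)
b = sys.intern(y)
print(a is b)  # True
```

In everyday code, strings should be compared with ==; is only tests whether two names refer to the same object.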
Variable string
In Python, strings are immutable objects and do not support in-place modification. To change the value, you must create a new string object. When frequent in-place modification is really needed, you can use an io.StringIO object or the array module, which modify a buffer without creating a new string for each change.
>>> import io
>>> s = 'hello, Lucy'
>>> sio = io.StringIO(s)
>>> sio
<_io.StringIO object at 0x7f8bbfdd8948>
>>> sio.seek(4)
4
>>> sio.write('k')
1
>>> sio.getvalue()
'hellk, Lucy'
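The seek/write pattern from the session above can be wrapped in a small function. This is a minimal sketch; overwrite is a hypothetical helper name, and the list-based variant at the end is a common alternative not mentioned in the original text.

```python
import io

def overwrite(text, index, replacement):
    """Hypothetical helper: write replacement into text starting at index."""
    buf = io.StringIO(text)
    buf.seek(index)         # move the write position
    buf.write(replacement)  # overwrites existing characters, may extend the end
    return buf.getvalue()

print(overwrite('hello, Lucy', 4, 'k'))      # hellk, Lucy
print(overwrite('hello, Lucy', 7, 'Tony!'))  # hello, Tony!

# Alternative: edit a list of characters, then join once.
chars = list('hello')
chars[0] = 'j'
result = ''.join(chars)
print(result)  # jello
```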