- re.compile(pattern, flags=0)
Compile a regular expression into a reusable pattern object.
The compiled object can be used repeatedly in different places, so the same expression does not have to be compiled again each time it is passed.
The flags parameter is described below. Multiple flags can be combined with |.
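    import re

    # Compile once, then reuse the pattern object for several strings.
    regex = re.compile(r"\w+")
    ret = regex.findall("afsd fasd11as 22d")
    print(ret)  # ['afsd', 'fasd11as', '22d']
    ret = regex.findall("afsd fasd112as 223as")
    print(ret)  # ['afsd', 'fasd112as', '223as']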
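As a small sketch of the two points above (reuse and combining flags with |), the example below compiles one pattern with two flags and calls two of its methods. The name word_re and the sample strings are illustrative only.

```python
import re

# Flags are combined with the bitwise OR operator.
word_re = re.compile(r"[a-z]+", re.IGNORECASE | re.ASCII)

# The compiled object offers the same operations as the module-level functions.
first = word_re.search("12 Abc def")
all_words = word_re.findall("12 Abc def")

print(first.group())                        # Abc
print(all_words)                            # ['Abc', 'def']
print(bool(word_re.flags & re.IGNORECASE))  # True
```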
- flags
- re.A re.ASCII
Make \w, \W, \b, \B, \d, \D, \s and \S match ASCII characters only instead of the full Unicode range.
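A minimal sketch of the difference, using a made-up sample string that contains two non-ASCII letters (written as escapes so the example does not depend on file encoding):

```python
import re

# "caf\u00e9" is "café" and "na\u00efve" is "naïve".
text = "caf\u00e9 123 na\u00efve"

unicode_words = re.findall(r"\w+", text)          # default: \w is Unicode-aware
ascii_words = re.findall(r"\w+", text, re.ASCII)  # \w limited to [a-zA-Z0-9_]

print(unicode_words)  # ['café', '123', 'naïve']
print(ascii_words)    # ['caf', '123', 'na', 've']
```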
- re.I re.IGNORECASE
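Perform case-insensitive matching.
    import re

    # [A-Z] also matches lowercase letters when re.I is set.
    regex = re.compile("[A-Z]+", re.I)
    ret = regex.findall("afsd fasd11as 22d")
    print(ret)  # ['afsd', 'fasd', 'as', 'd']
    ret = regex.findall("afsd fasd112as 223as")
    print(ret)  # ['afsd', 'fasd', 'as', 'as']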
- re.M re.MULTILINE
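Changes how ^ and $ match: ^ also matches at the start of each line and $ at the end of each line, instead of only at the start and end of the whole string.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # With re.MULTILINE, ^ and $ anchor to every line.
    regex = re.compile("^[A-Z]+$", re.I | re.MULTILINE)
    ret = regex.findall(data)
    print(ret)  # ['asdfd', 'asdfas', 'basdfdsaf', 'aa', 'asada']
    # Without it, they anchor only to the whole string, so nothing matches.
    ret = re.findall("^[A-Z]+$", data, re.I)
    print(ret)  # []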
- re.S re.DOTALL
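Make . match any character, including a newline. Without this flag, . matches anything except a newline.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # With re.S, . crosses line breaks, so the whole string is one match.
    ret = re.findall("^.+$", data, re.I | re.S)
    print(ret)  # ['\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n']
    # Without re.S, . cannot match the leading newline, so nothing matches.
    ret = re.findall("^.+$", data, re.I)
    print(ret)  # []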
- re.search(pattern, string, flags=0)
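Scan the string for the first location where the pattern matches and return a match object, or None if nothing matches.
By contrast, re.findall returns every matching string, and re.match only matches at the beginning of the string.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # With re.S the first (and only) match covers the whole string.
    ret = re.search("^.+$", data, re.I | re.S)
    print(ret)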
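The following sketch shows what the returned match object offers and what happens when nothing matches. The sample text is invented for illustration.

```python
import re

text = "order 42 shipped, order 77 pending"

# search stops at the first match, here the number 42.
m = re.search(r"\d+", text)
if m is not None:
    print(m.group())  # 42
    print(m.span())   # (6, 8)

# No uppercase letter exists in the text, so the result is None.
missing = re.search(r"[A-Z]", text)
print(missing)  # None
```

Because a failed search returns None, it is usually worth checking the result before calling group() on it.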
- re.match(pattern, string, flags=0)
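Match only at the beginning of the string and return a match object, or None if the pattern does not match there. With quantifiers such as * or ?, the match can have length 0, and that still counts as a match.
A failed match and a match of length 0 are different things: a match of length 0 means that your pattern allows matching zero characters.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # With re.S, .+ matches the whole string from position 0.
    ret = re.match("^.+$", data, re.I | re.S)
    print(ret)
    # Without re.S, .* stops at the leading newline: a match of length 0.
    ret = re.match(".*", data)
    print(ret)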
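To make the difference between "no match" and "match of length 0" concrete, here is a short sketch with an invented input string:

```python
import re

text = "abc123"

# \d+ needs at least one digit at position 0, so there is no match.
no_match = re.match(r"\d+", text)
# \d* allows zero digits, so it matches the empty string at position 0.
empty_match = re.match(r"\d*", text)
# search is not tied to position 0 and finds the digits later on.
found = re.search(r"\d+", text)

print(no_match)             # None
print(empty_match.span())   # (0, 0)
print(found.group())        # 123
```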
- re.fullmatch(pattern, string, flags=0)
Match strictly from beginning to end: the whole string must match the pattern, otherwise None is returned.
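A minimal sketch comparing re.match and re.fullmatch on the same pattern. The year-month pattern and the inputs are made up for the example.

```python
import re

pattern = r"\d{4}-\d{2}"

# match only requires the pattern to fit at the start of the string.
prefix = re.match(pattern, "2024-05 extra")
# fullmatch requires the entire string to fit the pattern.
whole_fail = re.fullmatch(pattern, "2024-05 extra")
whole_ok = re.fullmatch(pattern, "2024-05")

print(prefix.group())    # 2024-05
print(whole_fail)        # None
print(whole_ok.group())  # 2024-05
```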
- re.split(pattern, string, maxsplit=0, flags=0)
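Split string at each occurrence of pattern, making at most maxsplit splits. maxsplit=0 means split at every occurrence.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # Pass flags by keyword: the third positional argument is maxsplit.
    ret = re.split("\n", data, flags=re.I | re.S)
    print(ret)  # ['', 'asdfd', 'asdfas', 'basdfdsaf', 'aa', 'asada', '']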
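The sketch below illustrates maxsplit and, as an additional documented behavior of re.split, what happens when the separator pattern contains a group. The input string is invented.

```python
import re

text = "a, b;c ,d"

# Split on commas or semicolons, swallowing surrounding whitespace.
parts = re.split(r"\s*[,;]\s*", text)
# maxsplit=1 performs only the first split; the rest stays in one piece.
limited = re.split(r"\s*[,;]\s*", text, maxsplit=1)
# A capturing group in the pattern keeps the separators in the result.
with_seps = re.split(r"\s*([,;])\s*", text)

print(parts)      # ['a', 'b', 'c', 'd']
print(limited)    # ['a', 'b;c ,d']
print(with_seps)  # ['a', ',', 'b', ';', 'c', ',', 'd']
```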
- re.findall(pattern, string, flags=0)
Returns a list of all non-overlapping matches, including repeated ones. The string is scanned from left to right and the matches are stored in the order found. If the pattern has groups, the groups are returned instead of the whole match. If there is no match, an empty list is returned.
With one group, each item is a string ''; with several groups, each item is a tuple ('', ''). Groups are numbered by the position of their left bracket, so the first group is the one whose left bracket appears first, and brackets are paired.
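- One group
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # One group: each item is the text captured by that group.
    ret = re.findall(r"a(\w+)d", data, re.I | re.S)
    print(ret)  # ['sdf', 's', 'sdf', 'sa']
- Two groups
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # Two groups: each item is a tuple (outer group, inner group).
    ret = re.findall(r"(a(\w+)d)", data, re.I | re.S)
    print(ret)  # [('asdfd', 'sdf'), ('asd', 's'), ('asdfd', 'sdf'), ('asad', 'sa')]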
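As a quick check of the return type described for re.findall, this sketch uses an invented string to show that repeated matches are kept, that no match gives an empty list, and that several groups give tuples:

```python
import re

text = "cat bat cat"

repeated = re.findall(r"cat", text)     # duplicates are kept
nothing = re.findall(r"dog", text)      # no match gives an empty list
pairs = re.findall(r"(\w)(at)", text)   # two groups give tuples

print(repeated)  # ['cat', 'cat']
print(nothing)   # []
print(pairs)     # [('c', 'at'), ('b', 'at'), ('c', 'at')]
```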
- re.finditer(pattern, string, flags=0)
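Like re.findall, but returns an iterator that yields a match object for each match.
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    ret = re.finditer(r"(a(\w+)d)", data, re.I | re.S)
    print(ret)  # an iterator object, not a list
    # Each item is a match object; groups() returns its captured groups.
    for i in ret:
        print(i.groups())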
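Because re.finditer yields match objects, positions are available as well as text. A small sketch with an invented key=value string:

```python
import re

text = "x=1, y=22"

# Collect the two groups and the span of every match.
found = [
    (m.group(1), m.group(2), m.span())
    for m in re.finditer(r"(\w)=(\d+)", text)
]

print(found)  # [('x', '1', (0, 3)), ('y', '22', (5, 9))]
```

An iterator is also consumed lazily, which can help on long inputs where only the first few matches are needed.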
- re.sub(pattern, repl, string, count=0, flags=0)
- pattern: string or compiled regex
- repl: string, or a function that accepts one parameter. The parameter passed in is a match object
- string: the string to scan
- count: the maximum number of substitutions. 0 means all
- flags: additional parsing flags, see the flags section above.
Returns the string with the replacements applied. repl can reference groups, for example \1.
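- String substitution
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # Pass flags by keyword: the fourth positional argument is count.
    ret = re.sub(r"(a(\w+)d)", "regex", data, flags=re.I | re.S)
    print(ret)
- Method substitution
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # The function receives the match object; strip the first and last character.
    ret = re.sub(r"a(\w+)d", lambda a: a.group()[1:-1], data, flags=re.I | re.S)
    print(ret)
- Group replacement
    import re

    data = "\nasdfd\nasdfas\nbasdfdsaf\naa\nasada\n"

    # \1 in repl refers to the text captured by the first group.
    ret = re.sub(r"a(\w+)d", r"\1", data, flags=re.I | re.S)
    print(ret)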
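The sketch below pulls the three forms of repl together and shows the effect of count. The sample strings and the helper name double are illustrative only.

```python
import re

text = "2024-05-17 and 2023-01-02"

# Group references in repl reorder the captured parts.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
# count=1 replaces only the first occurrence.
first_only = re.sub(r"\d{4}", "YYYY", text, count=1)


def double(m):
    """Return the matched number multiplied by two, as a string."""
    return str(int(m.group()) * 2)


# A function as repl is called once per match with the match object.
doubled = re.sub(r"\d+", double, "3 apples, 10 pears")

print(swapped)     # 17/05/2024 and 02/01/2023
print(first_only)  # YYYY-05-17 and 2023-01-02
print(doubled)     # 6 apples, 20 pears
```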
- re.escape(pattern)
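Pass in a string and return a copy of it in which the characters that have a special meaning in regular expressions are escaped. The result is a string, not a regex object, and can be used to match the original text literally.
    import re

    data = ".."
    # Each . is escaped so that it matches a literal dot.
    ret = re.escape(data)
    print(ret)  # \.\.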
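A typical use is matching text that comes from outside the program. In this sketch, user_input is a hypothetical value that happens to contain regex metacharacters:

```python
import re

user_input = "1+1=2?"
haystack = "is 1+1=2? yes"

# Used directly, + and ? act as quantifiers, so the literal text is not found.
unescaped_hit = re.search(user_input, haystack)
# After escaping, the pattern matches the input character for character.
literal = re.escape(user_input)
escaped_hit = re.search(literal, haystack)

print(unescaped_hit)        # None
print(escaped_hit.group())  # 1+1=2?
```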