Regular expressions and the grep command in shell programming

#In essence, a regular expression is a data filter: lines that do not match the expression are rejected, and the lines that match are kept

#Metacharacters: characters that are given a meaning beyond their literal one

#Mastering the basic elements of regular expressions mainly means mastering the meaning of the metacharacters

*: matches the preceding ordinary character 0 or more times
hel*o: matches heo, helo, hello, and helllo
.: matches any single character
...73.: note that the "." symbol can also match a space
^: matches the start of a line
^cloud: matches lines starting with cloud

#^...X86*: matches lines that begin with any three characters followed by X8 and then zero or more 6s

$: matches the end of a line
micky$: matches lines ending with micky

#Match all blank lines: ^$
#Match lines containing exactly one character: ^.$

[]: matches a character set. You can list every element of the set explicitly, or use the '-' symbol to give a range that starts at the character to the left of '-' and ends at the character to its right
#Match any number
[0123456789]
[0-9]
#Match letters
[a-z]
[A-Z]
[b-p]
#A ^ at the start of [] negates the set
#Any character other than b-d
[^b-d]
#Match strings of English letters:
#start with any letter, followed by any letter repeated 0 or more times
[A-Za-z][A-Za-z]*

\< \>: word anchors. \< matches the start of a word and \> matches the end of a word; the angle brackets are escaped with '\'
#Match the word "the" exactly
\<the\>

\{\}: like the * symbol, it repeats the preceding character, but whereas * means 0 or more repetitions, \{\} specifies the number of repetitions
#\{n\}: matches the preceding character exactly n times
#\{n,\}: matches the preceding character at least n times
#\{n,m\}: matches the preceding character between n and m times
JO\{3\}B #repeat character O 3 times
JO\{3,\}B #repeat character O at least 3 times
JO\{3,6\}B #repeat character O 3 to 6 times

#Match exactly 5 lowercase letters
[a-z]\{5\}
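
The basic metacharacters above can be tried out with grep. The following is a minimal sketch, not taken from the original article: the word list is made up for illustration, and LC_ALL=C is set on the assumption that [a-z] should mean ASCII lowercase letters only.

```shell
# Use the C locale so that [a-z] means ASCII lowercase letters only.
export LC_ALL=C

# Sample words, one per line (hypothetical test data).
words=$(printf '%s\n' heo helo hello helllo cloud9 JOOOB JOOB abcde)

# * : zero or more of the preceding character; count the matching lines.
star=$(printf '%s\n' "$words" | grep -c '^hel*o$')

# ^ : anchor the match to the start of the line.
caret=$(printf '%s\n' "$words" | grep '^cloud')

# \{3\} : exactly three repetitions of the preceding character.
exact=$(printf '%s\n' "$words" | grep 'JO\{3\}B')

# ^[a-z]\{5\}$ : the whole line is exactly five lowercase letters.
five=$(printf '%s\n' "$words" | grep '^[a-z]\{5\}$')

printf 'star=%s caret=%s exact=%s\n' "$star" "$caret" "$exact"
printf 'five-letter words:\n%s\n' "$five"
```

Quoting every pattern keeps the shell from expanding characters such as * before grep sees them.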

awk also supports the following extended regular expression metacharacters
?: matches the preceding character 0 or 1 time
JO?B #matches JB and JOB
#The preceding character is matched at most once

+: matches the preceding character one or more times (unlike *, which can also match 0 times)
S+EU #matches SEU, SSEU, and SSSSEU; EU alone is not matched

() and |: used together to express a set of alternatives
re(a|e|o)d == re[aeo]d
#Choose any one of the characters a, e, o
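
The extended metacharacters can be checked in the same way. This sketch uses grep -E (extended regular expressions) as a stand-in for awk, which is an assumption of the example rather than something the text prescribes; the word list is hypothetical.

```shell
export LC_ALL=C

# Hypothetical test words, one per line.
words=$(printf '%s\n' JB JOB JOOB SEU SSEU EU read reed rid)

# ? : the preceding O may appear zero times or once.
opt=$(printf '%s\n' "$words" | grep -E '^JO?B$')

# + : the preceding S must appear at least once.
plus=$(printf '%s\n' "$words" | grep -E '^S+EU$')

# (a|e|o) : exactly one of the listed alternatives.
alt=$(printf '%s\n' "$words" | grep -E '^re(a|e|o)d$')

printf '%s\n' "$opt" "$plus" "$alt"
```

JOOB, EU, and rid are filtered out, since each violates one of the three rules.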


"""
Regular expressions are interpreted by the commands and tools that the shell runs, such as grep, sed, and awk, rather than by bash's own command-line parsing (filename globbing in bash uses a different pattern syntax). These tools support the metacharacters above for common matching tasks
"""

grep command
#Linux Shell Programming: From Beginner to Proficient, page 52

1.-c outputs the number of matching lines. By default, the grep command prints every line that contains the pattern. With the -c option, only the number of lines that contain the pattern is displayed
[root@server1 ~]# grep -c root /etc/passwd
2

2.-n list all matching lines and display line numbers

[root@server1 ~]# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin

3.-v displays all lines that do not contain the pattern

[root@server1 ~]# grep -v root /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
# Combine with the -c option to display the number of lines that do not contain the keyword
[root@server1 ~]# grep -vc root /etc/passwd
19

4.-i case insensitive

# grep is case sensitive by default; -i makes the match ignore case
[root@server1 ~]# grep -i root /tmp/passwd
root:x:0:0:root:/root:/bin/bash
ROOT:x:0:0:ROOT:/ROOT:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

5.-s suppresses error messages about nonexistent or unreadable files

[root@server1 ~]# grep -s root dd
[root@server1 ~]# grep  root dd
grep: dd: No such file or directory
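
The options covered so far can be compared side by side on a small file. This is a sketch under stated assumptions: the file contents are made up, and the temporary file created with mktemp is a hypothetical stand-in for /etc/passwd.

```shell
# Build a small sample file (hypothetical data).
tmp=$(mktemp)
printf '%s\n' 'root:x:0:0' 'ROOT:x:0:0' 'bin:x:1:1' 'operator:x:11:0:/root' > "$tmp"

count=$(grep -c root "$tmp")       # number of lines containing "root"
numbered=$(grep -n root "$tmp")    # matching lines prefixed with line numbers
inverted=$(grep -vc root "$tmp")   # number of lines NOT containing "root"
nocase=$(grep -ic root "$tmp")     # number of matches when case is ignored

# -s hides the "No such file" message; the exit status still reports the error.
status=0
grep -s root "$tmp.missing" || status=$?

printf 'count=%s inverted=%s nocase=%s status=%s\n' "$count" "$inverted" "$nocase" "$status"
printf '%s\n' "$numbered"
rm -f "$tmp"
```

Even with -s, a script can still detect the missing file through the exit status, which is 2 for errors.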

6.-r Without this option, the grep command searches only the files it is given and does not descend into subdirectories. The -r option means recursive search

7.-w The pattern of the grep command is a regular expression, and its metacharacters are interpreted with their special meanings. The -w option matches whole words: a line is output only when the pattern matches a complete word. Metacharacters are still interpreted as special with -w; to treat the pattern as a literal string, use the -F option instead

[root@server1 ~]# grep roo* /tmp/passwd 
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
[root@server1 ~]# grep -w roo* /tmp/passwd
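
A minimal sketch contrasting ordinary matching, whole-word matching (-w), and fixed-string matching (-F, a standard grep option). The three sample lines are made up for illustration, and the patterns are quoted so that the shell does not expand the * itself.

```shell
tmp=$(mktemp)
printf '%s\n' 'root user' 'trousers' 'roo* literal' > "$tmp"

# Regular expression: "ro" followed by zero or more "o"; every line contains "ro".
plain=$(grep -c 'roo*' "$tmp")

# -w: the match must be a complete word, so "trousers" no longer qualifies.
word=$(grep -w 'root' "$tmp")

# -F: the pattern is a literal string, so * matches only an actual asterisk.
fixed=$(grep -F 'roo*' "$tmp")

printf 'plain=%s\nword=%s\nfixed=%s\n' "$plain" "$word" "$fixed"
rm -f "$tmp"
```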

8.-x matches whole lines: the grep command outputs a line only when the entire line matches the pattern

[root@server1 ~]# grep -w 'World' world.txt 
Hello World
World
World Cup
[root@server1 ~]# grep -x 'World' world.txt 
World
[root@server1 ~]# cat world.txt 
Hello World
World
World Cup
Westos
One One world

9.-q grep outputs no results and instead reports through its exit status whether the search succeeded (0 if a match was found, 1 if not)

[root@server1 ~]# grep -q -x 'World' world.txt 
[root@server1 ~]# echo $?
0
[root@server1 ~]# grep -q -x 'World dd' world.txt 
[root@server1 ~]# echo $?
1
[root@server1 ~]# grep -q -x 'World' world
grep: world: No such file or directory
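
Because -q reports its result only through the exit status, it fits naturally into an if statement. The sketch below assumes a hypothetical helper named has_line and a made-up sample file; neither comes from the original article.

```shell
tmp=$(mktemp)
printf '%s\n' 'Hello World' 'World' 'World Cup' > "$tmp"

# has_line PATTERN FILE: succeed if some whole line of FILE matches PATTERN.
# (has_line is a hypothetical helper name chosen for this example.)
has_line() {
    grep -q -x -- "$1" "$2"
}

if has_line 'World' "$tmp"; then found=yes; else found=no; fi
if has_line 'World dd' "$tmp"; then other=yes; else other=no; fi

printf 'World: %s, World dd: %s\n' "$found" "$other"
rm -f "$tmp"
```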

A set of examples of grep used with regular expressions
# In files whose names end with d, find lines beginning with -
[root@server1 ~]# grep ^- *d
----------------BEGIN
----------------BEGIN
----------------BEGIN
----------------BEGIN
----------------BEGIN
----------------BEGIN
# Count blank lines (print only the count)
[root@server1 ~]# grep -c ^$ /tmp/passwd 
3

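# Count non-blank lines (print only the count). [^$] matches any first character other than a literal $, so lines starting with $ are not counted either; grep -vc '^$' counts every non-blank line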
[root@server1 ~]# grep -c ^[^$] /tmp/passwd 
22

# Use [] to make grep ignore the case of a single letter
[root@server1 ~]# grep -n [Rr]oot /tmp/passwd 
1:root:x:0:0:root:/root:/bin/bash
2:Root:x:0:0:Root:/Root:/bin/bash
15:operator:x:11:0:operator:/root:/sbin/nologin

# Lines that start with /, followed by any 4 characters, with / as the sixth character
[root@server1 ~]# grep ^/..../ /tmp/passwd 
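
The blank-line counts above can be reproduced on a small file. This sketch uses made-up data that includes a line starting with $, to show how '^[^$]' differs from inverting '^$'. The patterns are quoted, which is a choice of this example, not of the transcripts above.

```shell
tmp=$(mktemp)
# Five lines: two are empty, and one starts with a literal dollar sign.
printf '%s\n' 'root:x' '' '$HOME' '' 'bin:x' > "$tmp"

blank=$(grep -c '^$' "$tmp")        # empty lines
nonblank=$(grep -vc '^$' "$tmp")    # every line that is not empty
bracket=$(grep -c '^[^$]' "$tmp")   # lines whose first character is not "$"

printf 'blank=%s nonblank=%s bracket=%s\n' "$blank" "$nonblank" "$bracket"
rm -f "$tmp"
```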

sed


As we know, Vim edits text interactively: you use keyboard commands to insert, delete, or replace text. The sed command in this section is different. It is a stream editor: before sed processes any data, you supply a set of editing rules, and sed then edits the data according to those rules

sed processes the data in a text file according to script commands, which are either given on the command line or stored in a text file. The command works in the following order:
Read one line at a time;
Match and modify the data according to the supplied rules. Note that by default sed does not modify the source file; it copies the data into a buffer and modifies only the buffer;
Output the result.

After one line has been processed, sed reads the next line and repeats the process until all data in the file has been processed
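
The read, modify, output cycle can be imitated with a plain shell loop. This is only a rough emulation for the single rule s/Tue/Tua/, written to illustrate the order of the steps; it is not how sed is implemented, and the sample file is made up.

```shell
tmp=$(mktemp)
printf '%s\n' 'Tue one' 'Wed two' > "$tmp"

# Emulated cycle: read one line into a buffer, edit the buffer, print it.
emulated=$(while IFS= read -r line; do
    case $line in
        # Replace the first "Tue" in the buffer, as s/Tue/Tua/ would.
        *Tue*) line="${line%%Tue*}Tua${line#*Tue}" ;;
    esac
    printf '%s\n' "$line"
done < "$tmp")

# The real thing, followed by a check that the source file is untouched.
real=$(sed 's/Tue/Tua/' "$tmp")
after=$(cat "$tmp")

printf 'emulated:\n%s\nreal:\n%s\nfile afterwards:\n%s\n' "$emulated" "$real" "$after"
rm -f "$tmp"
```

Both versions print the edited text, while the file itself still holds the original lines.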

The basic format of the sed command is as follows:
[root@localhost ~]# sed [options] [script command] filename

sed s substitution script command
The basic format of this command is:
[address]s/pattern/replacement/flags
Where address is the line to operate on, pattern is the text to be replaced, and replacement is the new text that replaces it
#flags controls how the substitution is applied; a number n tells sed which occurrence of the pattern in each line to replace
#In the example below, the number 2 as a flag makes the sed editor replace only the second occurrence of the pattern in each line

[root@server1 ~]# sed 's/Tue/Tua/2' date.txt 
Tue Tua Dec 17 15:40:54 CST 2019
Tue Tua Dec 17 15:40:57 CST 2019
Tue Tua Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:59 CST 2019
Tue Dec 17 15:41:00 CST 2019
Tue Dec 17 15:41:01 CST 2019
[root@server1 ~]# cat date.txt 
Tue Tue Dec 17 15:40:54 CST 2019
Tue Tue Dec 17 15:40:57 CST 2019
Tue Tue Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:59 CST 2019
Tue Dec 17 15:41:00 CST 2019
Tue Dec 17 15:41:01 CST 2019

# To replace every occurrence of the pattern with the new text, use the g flag
[root@server1 ~]# cat date.txt 
Tue Dec 17 15:40:54 CST 2019
Tue Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:59 CST 2019
Tue Dec 17 15:41:00 CST 2019
Tue Dec 17 15:41:01 CST 2019
[root@server1 ~]# sed 's/Tue/Tua/g' date.txt 
Tua Dec 17 15:40:54 CST 2019
Tua Dec 17 15:40:57 CST 2019
Tua Dec 17 15:40:57 CST 2019
Tua Dec 17 15:40:59 CST 2019
Tua Dec 17 15:41:00 CST 2019
Tua Dec 17 15:41:01 CST 2019
[root@server1 ~]# cat date.txt 
Tue Dec 17 15:40:54 CST 2019
Tue Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:57 CST 2019
Tue Dec 17 15:40:59 CST 2019
Tue Dec 17 15:41:00 CST 2019
Tue Dec 17 15:41:01 CST 2019

# The w flag saves the lines on which a substitution was made to the specified file, for example:
[root@server1 ~]# sed 's/Tue/Tua/w date2.txt' date.txt 
Tua Tue Dec 17 15:40:54 CST 2019
Tua Tue Dec 17 15:40:57 CST 2019
Tua Tue Dec 17 15:40:57 CST 2019
Tua Dec 17 15:40:59 CST 2019
Tua Dec 17 15:41:00 CST 2019
Tua Dec 17 15:41:01 CST 2019
[root@server1 ~]# cat date2.txt 
Tua Tue Dec 17 15:40:54 CST 2019
Tua Tue Dec 17 15:40:57 CST 2019
Tua Tue Dec 17 15:40:57 CST 2019
Tua Dec 17 15:40:59 CST 2019
Tua Dec 17 15:41:00 CST 2019
Tua Dec 17 15:41:01 CST 2019

#The -n option suppresses sed's normal output, while the p flag prints the lines that were modified. Combined, they output only the lines changed by the substitution command, for example:

[root@server1 ~]#  sed -n 's/test/trial/p' data3.txt
This is a trial line.
[root@server1 ~]# cat data3.txt 
This is a test line.
This is a different line.
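
The substitution flags can be compared on a single input line. A minimal sketch with made-up input:

```shell
line='Tue Tue Tue'

first=$(printf '%s\n' "$line" | sed 's/Tue/Tua/')    # no flag: first occurrence only
second=$(printf '%s\n' "$line" | sed 's/Tue/Tua/2')  # 2: second occurrence only
all=$(printf '%s\n' "$line" | sed 's/Tue/Tua/g')     # g: every occurrence

# -n with the p flag: print only the lines on which a substitution happened.
only=$(printf '%s\n' 'Tue' 'Wed' | sed -n 's/Tue/Tua/p')

printf '%s\n' "$first" "$second" "$all" "$only"
```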

#With the s script command, replacing strings such as file paths is cumbersome, because every forward slash in the path must be escaped, for example:

[root@server1 ~]# sed 's/\/bin\/bash/\/bin\/csh/' /tmp/passwd 
root:x:0:0:root:/root:/bin/csh
Root:x:0:0:Root:/Root:/bin/csh
ROOT:x:0:0:ROOT:/ROOT:/bin/csh
bin:x:1:1:bin:/bin:/sbin/nologin

daemon:x:2:2:daemon:/sbin:/sbin/nologin

adm:x:3:4:adm:/var/adm:/sbin/nologin

lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
varnishlog:x:998:996:varnishlog user:/dev/null:/sbin/nologin
varnish:x:997:996:Varnish Cache:/var/lib/varnish:/sbin/nologin
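
One way to avoid the escaping is to pick a different delimiter for the s command: sed accepts other characters in place of the slash. The sketch below uses # as the delimiter on a single made-up passwd entry and compares it with the escaped form.

```shell
entry='root:x:0:0:root:/root:/bin/bash'

# Slash as delimiter: every slash inside the path must be escaped.
escaped=$(printf '%s\n' "$entry" | sed 's/\/bin\/bash/\/bin\/csh/')

# "#" as delimiter: the path can be written as-is.
readable=$(printf '%s\n' "$entry" | sed 's#/bin/bash#/bin/csh#')

printf '%s\n%s\n' "$escaped" "$readable"
```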

# http://c.biancheng.net/view/4028.html


sed d delete script command
The basic format of this command is:
[address]d

#To delete specific lines from the text, use the d script command, which deletes every line that the address selects. Be careful with this command: if you forget to specify an address, every line is deleted from the output. For example:
 

[root@server1 ~]# cat data4.txt 
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
# The following command outputs nothing, because every line is deleted
[root@server1 ~]# sed 'd' data4.txt
[root@server1 ~]# cat data4.txt 
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog

# Specify a line by number, for example, delete line 3 of the data6.txt file:
[root@server1 ~]# cat data6.txt 
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
[root@server1 ~]# sed '3d' data6.txt
This is line number 1.
This is line number 2.
This is line number 4.
[root@server1 ~]# cat data6.txt 
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.


#Or specify a line range, such as deleting lines 2 and 3 of the data6.txt file:
[root@server1 ~]# sed '2,3d' data6.txt
This is line number 1.
This is line number 4.

#Note that, by default, sed does not modify the original file. The deleted lines only disappear from the output of sed, and the original file is left unchanged
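
Besides line numbers, the d command also accepts a text pattern or the special address $ (the last line). A minimal sketch, with the four sample lines inlined instead of read from data6.txt:

```shell
data=$(printf '%s\n' 'This is line number 1.' 'This is line number 2.' \
                     'This is line number 3.' 'This is line number 4.')

# Delete every line that matches the pattern "number 3".
by_pattern=$(printf '%s\n' "$data" | sed '/number 3/d')

# Delete only the last line of the stream.
without_last=$(printf '%s\n' "$data" | sed '$d')

printf '%s\n---\n%s\n' "$by_pattern" "$without_last"
```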


sed a and i script commands
The a command appends a line after the specified line, and the i command inserts a line before the specified line. The two script commands are introduced together because their basic formats are the same:

[address]a\new text (or [address]i\new text)

#To insert a new line before the third line of the data stream, run the following command:

[root@server1 ~]# sed '3i\
> This is an inserted line.' data6.txt
This is line number 1.
This is line number 2.
This is an inserted line.
This is line number 3.
This is line number 4.

#To append a new line after the third line of the data stream, run the following command:

[root@server1 ~]# sed '3a\
> This is an appended line.' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an appended line.
This is line number 4.

#To add multiple lines of text to the data stream, end every line of the inserted or appended text except the last with a backslash, for example:

[root@server1 ~]# sed '1i\
> This is one line of new text.\
>  This is another line of new text.' data6.txt
This is one line of new text.
 This is another line of new text.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.


#As you can see, both of the specified lines are added to the data stream

sed c change script command
The c command replaces the entire content of the specified line with the text that follows the command. The basic format of this command is:

[address]c\new text
[root@server1 ~]# sed '3c\
> This is a changed line of text.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.

#In this example, the sed editor changes the text of the third line. The same result can be achieved with a text pattern address:

[root@server1 ~]# sed '/number 3/c\
> This is a changed line of text.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.

sed y transform script command
The y transform command is the only sed script command that operates on single characters. Its basic format is as follows:

[address]y/inchars/outchars/

The transform command maps the characters of inchars to those of outchars one-to-one: the first character in inchars is converted to the first character in outchars, the second to the second, and so on until all the specified characters have been processed. If inchars and outchars differ in length, sed produces an error message

[root@server1 ~]# cat data6.txt 
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
[root@server1 ~]#  sed 'y/123/789/' data6.txt
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 4.


#As you can see, each instance of a character in inchars is replaced with the character at the same position in outchars

#The transform command is global: it converts every occurrence of the specified characters in a line, regardless of position. For another example:
[root@server1 ~]# echo "This 1 is a test of 1 try." | sed 'y/123/456/'
This 4 is a test of 4 try.
#sed converts both instances of the character 1 in the line. The conversion cannot be restricted to an occurrence at a particular position in the line
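
Although y cannot target a position inside a line, it can still be limited to certain lines through the [address] part of its format. A minimal sketch with made-up input:

```shell
# Map a->A, b->B, c->C everywhere in the line.
upper=$(printf '%s\n' 'abc cab' | sed 'y/abc/ABC/')

# Address 2 limits the transform to the second line of the stream.
second_only=$(printf '%s\n' 'line 1' 'line 1' | sed '2y/1/7/')

printf '%s\n%s\n' "$upper" "$second_only"
```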

sed p print script command
The p command prints the lines that satisfy the address condition. The basic format of this command is:
[address]p

A common use of the p command is to print the lines that contain a matching text pattern, such as:

[root@server1 ~]# cat data6.txt 
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
[root@server1 ~]# sed -n '/number 3/p' data6.txt
This is line number 3.
# As you can see, combining the -n option with the p command suppresses the output of the other lines and prints only the lines that contain the matching text pattern

sed w script command
The w command writes the specified lines of the text to a file. The basic format of this command is as follows:
[address]w filename

#filename is the name of the output file. It can be a relative or an absolute path, but in either case the user running the sed command must have write permission for the file

#Write the first two lines of the data stream to a text file:

[root@server1 ~]# sed '1,2w test.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
[root@server1 ~]# cat test.txt 
This is line number 1.
This is line number 2.

Of course, if you do not want the lines printed to the screen as well, add the -n option. Another example:

[root@server1 ~]# sed -n '/Browncoat/w Browncoats.txt' data11.txt
[root@server1 ~]# cat Browncoats.txt 
Blum, R       Browncoat
Bresnahan, C  Browncoat
[root@server1 ~]# cat data11.txt 
Blum, R       Browncoat
McGuiness, A  Alliance
Bresnahan, C  Browncoat
Harken, C     Alliance


#As you can see, with the w script command, sed can write the lines that match a text pattern to the target file


sed r script command
The r command inserts the contents of a separate file at the specified position in the current data stream. The basic format of the command is:
[address]r filename

#The sed command inserts the contents of the file filename after the line specified by address

[root@server1 ~]# cat data12.txt 
This is an added line.
This is the second added line.
[root@server1 ~]#  sed '3r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an added line.
This is the second added line.
This is line number 4.
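
The w and r commands can be chained: one sed call writes selected lines to a file, and another reads that file back into a different stream. A sketch under stated assumptions: the file names in.txt and out.txt, the temporary directory, and the sample data are all made up for the example.

```shell
dir=$(mktemp -d)
printf '%s\n' 'Blum Browncoat' 'Harken Alliance' > "$dir/in.txt"

# w: write the lines matching "Browncoat" to out.txt; -n keeps the screen quiet.
sed -n "/Browncoat/w $dir/out.txt" "$dir/in.txt"
written=$(cat "$dir/out.txt")

# r: insert the contents of out.txt after line 1 of another stream.
merged=$(printf '%s\n' 'first' 'second' | sed "1r $dir/out.txt")

printf 'written:\n%s\nmerged:\n%s\n' "$written" "$merged"
rm -rf "$dir"
```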

sed q quit script command
The q command makes sed exit as soon as the address is matched for the first time, without processing any further data. In the example below, sed stops after printing the first line, which is the effect of the q command

[root@server1 ~]# sed '1q' test.txt
This is line number 1.
[root@server1 ~]# cat test.txt 
This is line number 1.
This is line number 2.
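
The address in front of q can be a line number or a text pattern. A minimal sketch with made-up input:

```shell
# Quit after line 2: only the first two lines are printed.
first_two=$(printf '%s\n' a b c d | sed '2q')

# Quit at the first line matching "c": that line is still printed.
until_match=$(printf '%s\n' a b c d | sed '/c/q')

printf '%s\n---\n%s\n' "$first_two" "$until_match"
```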


Specifying line ranges in numeric form
With numeric line addressing, lines are referenced by their position in the text stream. sed numbers the first line of the stream as 1 and assigns consecutive line numbers to the following lines.
In a script command, the address can be a single line number, or a range of lines given as a start line number, a comma, and an end line number. Here is an example in which the sed command applies to one specified line number:

[root@server1 ~]# cat data4.txt 
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
[root@server1 ~]# sed '2s/dog/cat/' data4.txt
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog

#As you can see, sed modifies only the text of the second line, as specified by the address. The following example uses a line address range

[root@server1 ~]# sed '2,3s/dog/cat/' data4.txt 
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy dog

Building on this, to apply the command to every line from a given line to the end of the text, use the special address $ (dollar sign), which stands for the last line:

[root@server1 ~]#  sed '2,$s/dog/cat/' data4.txt
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy cat


 


Tags: shell ftp network REST

Posted on Fri, 14 Feb 2020 02:17:37 -0500 by fatal