The Three Swordsmen of Shell Programming (grep, sed, awk) and Regular Expressions


sed editor

1. sed concept

sed is a stream editor: it edits a data stream according to a set of rules supplied in advance, before it starts processing the data.
The sed editor processes the data in the stream according to commands that are either entered on the command line or stored in a command text file.

2. sed's Workflow

sed's workflow consists of three main steps: read, execute, and display.
Read: sed reads one line from the input stream (file, pipe, or standard input) and stores it in a temporary buffer, known as the pattern space.
Execute: By default, sed commands are executed in order in the pattern space. Unless a line address is specified, the commands are applied to every line in turn.
Display: The modified content is sent to the output stream. After the data is sent, the pattern space is emptied.
This process repeats until all the content of the file has been processed.
Note: By default, all sed commands work in the pattern space, so the input file does not change unless you redirect the output to save it.
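
The note above can be checked with a small sketch. The temporary file and its contents are made up for illustration; the point is only that the edit shows up on standard output while the file on disk stays the same.

```shell
# Hypothetical demo file created only for this example.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# The substitution happens in the pattern space and goes to standard output.
edited=$(sed 's/two/TWO/' "$tmp")

# The file itself still holds the original text.
original=$(cat "$tmp")

# Redirecting the output is one way to keep the edited version.
sed 's/two/TWO/' "$tmp" > "$tmp.new"
saved=$(cat "$tmp.new")

printf '%s\n' "$edited"
rm -f "$tmp" "$tmp.new"
```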

3. sed command format

Command Format

Format 1:
sed [options] 'operation' file1 [file2] ...
Format 2:
sed [options] '[address]{
operation 1
operation 2
...
}' file1 [file2] ...

Common options:

-e or --expression=: Process the input text with the specified command. It can be omitted when only one operation is given and is generally used when running several operations.
-f or --file=: Process the input text with the specified script file.
-h or --help: Show help.
-n, --quiet or --silent: Suppress sed's automatic output. It is usually combined with the p command so that only the selected lines are printed.
-i: Modify the target text file directly.
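
As a rough illustration of the options above, the sketch below runs the same two substitutions once with -e and once from a script file loaded with -f, and then uses -n with p. The file names and contents are invented for the example.

```shell
# Hypothetical input and script files created only for this example.
tmp=$(mktemp)
script=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# Two operations given with separate -e options; they run in order on each line.
with_e=$(sed -e 's/one/1/' -e 's/two/2/' "$tmp")

# The same two operations stored in a script file and loaded with -f.
printf 's/one/1/\ns/two/2/\n' > "$script"
with_f=$(sed -f "$script" "$tmp")

# -n suppresses the automatic output, so only the line selected by p appears.
only_second=$(sed -n '2p' "$tmp")

printf '%s\n' "$with_e"
rm -f "$tmp" "$script"
```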

Common operations:

s: Substitute. Replaces the specified characters.
d: Delete. Deletes the selected lines.
a: Append. Adds a line of the specified content below the current line.
i: Insert. Inserts a line of the specified content above the selected line.
c: Change. Replaces the selected line with the specified content.
y: Character translation. The source and target character sets must be the same length.
p: Print. If lines are specified, prints those lines; if none are specified, prints everything. Usually used together with the -n option.
=: Print line numbers.
l (lowercase letter L): Print the text in the data stream with non-printable ASCII characters shown visibly (for example, the line end as $ and the tab character as \t).
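
The a, i, c and y operations are not demonstrated later in this article, so here is a minimal sketch. It assumes GNU sed, which accepts the one-line form of a, i and c (other sed implementations may require a backslash and a newline after the command letter). The sample text is made up.

```shell
# Hypothetical three-line input used only for this example.
input=$(printf 'one\ntwo\nthree')

# a: add a line below line 2 (GNU sed one-line form, an assumption of this sketch).
appended=$(printf '%s\n' "$input" | sed '2a appended line')

# i: insert a line above line 2.
inserted=$(printf '%s\n' "$input" | sed '2i inserted line')

# c: replace line 2 with new content.
changed=$(printf '%s\n' "$input" | sed '2c changed line')

# y: translate characters one for one; both sets have the same length.
translated=$(printf '%s\n' "$input" | sed 'y/ot/OT/')

printf '%s\n' "$appended"
```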

4. Use sed command

1. Print Content

sed -n 'p' sed.txt
#-n is used together with the p command so that each line is printed once; without -n, each line is printed twice

(2) Print line number

[root@localhost ~]#sed -n -e '=' sed.txt       #-n -e '=' prints only the line numbers
1
2
3
4
5
6
7
8
9
10
11
12

[root@localhost ~]#sed -e '=' sed.txt       #Without -n, both the line numbers and the contents are printed
[root@localhost ~]#sed -n '=;p' sed.txt    #Same as above: both the line numbers and the contents are printed
[root@localhost ~]#sed -n -e '=' -e 'p' sed.txt     #Same result: each line number is printed before its content
[root@localhost ~]#sed -n '               #This form is used less often, but it also prints each line number before its content
=
p
' sed.txt
#To print the content before the line number, put the 'p' operation before the '=' operation
-----------------------------------------------------------------
1
one
2
two
3
three
4
four
5
five
6
six

sed '=' sed.txt
sed -n '=' sed.txt

(3) Print ASCII characters

sed -n 'l' sed.txt

[root@localhost ~]#sed -n -e 'l' sed.txt      #The l command prints the text with non-printable characters made visible; each line end is shown as $
one$
two$
three$
four$
five$
six$

2. Use Address
The sed editor has two addressing methods:

Specify a range of lines by number
Filter lines by a text pattern
------------------------------------------------------------------------------------------------------------------------------------------------------------

[root@localhost ~]#sed -n '1p' sed.txt      #Print the contents of the first line
one
[root@localhost ~]#sed -n '$p' sed.txt        #Print the contents of the last line
six
[root@localhost ~]#sed -n '1,3p' sed.txt      #Print lines 1-3
one
two
three
[root@localhost ~]#sed -n '3,$p' sed.txt       #Print from line 3 to the last line
three
four
five
six

[root@localhost ~]#sed -n '1,+3p' sed.txt      #Print the first line and the three lines after it, that is, lines 1 to 4
one
two
three
four
[root@localhost ~]#sed '5q' sed.txt           #Print the first 5 lines and then quit; note that -n is not used
one
two
three
four
five

[root@localhost ~]#sed -n 'p;n' sed.txt        #Print odd lines: p prints the current line, then n moves on to the next line without printing it, and so on to the end
one
three
five

[root@localhost ~]#sed -n 'n;p' sed.txt        #Print even lines: starting from the first line, n skips to the next line, which p then prints
two
four
six

[root@localhost ~]#sed -n '2,${n;p}' sed.txt     #Starting from line 2, skip a line and print the next one, so the odd lines from line 3 onward are printed
three
five
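
The examples above all use numeric addresses. The second addressing method, filtering lines by a text pattern, could look like the sketch below. It feeds the same six words used in the sample output through a pipe instead of reading sed.txt, which is a choice made only to keep the example self-contained.

```shell
# Six-line sample input (one..six), generated inline for this sketch.
input=$(printf 'one\ntwo\nthree\nfour\nfive\nsix')

# /pattern/ address: print only the lines that start with the letter t.
starts_with_t=$(printf '%s\n' "$input" | sed -n '/^t/p')

# /start/,/end/ address: print from the first match of "two" through the next match of "four".
two_to_four=$(printf '%s\n' "$input" | sed -n '/two/,/four/p')

# A pattern and a number-style address can be mixed: from "five" to the last line.
five_to_end=$(printf '%s\n' "$input" | sed -n '/five/,$p')

printf '%s\n' "$two_to_four"
```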

3. Delete lines

[root@localhost ~]#sed 'd' testfile1         #Delete all lines
[root@localhost ~]# 
[root@localhost ~]#sed '3d' testfile1        #Delete the third line
one
two
four
five
six

[root@localhost ~]#sed '2,9d' testfile1      #Delete lines 2-9
one
ten
eleven
twelve


[root@localhost ~]#sed '$d' testfile1        #Delete last line

[root@localhost ~]#sed '/^$/d' testfile1     #Delete blank lines
[root@localhost ~]#sed -i '/^$/d' testfile1    #None of the previous commands modify the file itself; to change the file contents, use the -i option to edit it in place

[root@localhost ~]#sed '/nologin$/d'  /etc/passwd     #Delete lines ending with nologin

[root@localhost ~]#sed '/nologin$/!d'  /etc/passwd    #! means negation, i.e. delete all lines except those ending with nologin

[root@localhost ~]#sed '/2/,/3/d' testfile2   #Deletion is switched on at the first line matching the first pattern and switched off at the next line matching the second pattern, i.e. delete from the first line containing the character 2 through the next line containing the character 3

4. Replacement
Format: line range s/old string/new string/substitution flag
There are four substitution flags:
Number: replace only the Nth match on each line
g: replace all matches
p: print the lines on which a substitution was made; used with -n
w file: write the lines on which a substitution was made to the file
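
A short sketch of the four flags follows. The sample text and the output file are made up for illustration, and the w example assumes the temporary file path contains no spaces.

```shell
# Hypothetical sample data created only for this example.
line='aaa bbb aaa'
tmp=$(mktemp)
out=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# No flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/a/X/')

# Number flag: only the 2nd match on the line is replaced.
second_only=$(printf '%s\n' "$line" | sed 's/a/X/2')

# g flag: every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/a/X/g')

# p flag with -n: print only the lines where a substitution happened.
printed=$(sed -n 's/two/2/p' "$tmp")

# w flag: write the lines where a substitution happened to a file.
sed -n "s/two/2/w $out" "$tmp"
written=$(cat "$out")

printf '%s\n' "$first_only" "$second_only" "$all_matches"
rm -f "$tmp" "$out"
```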

2. awk editor

In Linux/UNIX systems, awk is a powerful text-processing tool. It reads the input text line by line, searches it against a specified pattern, and formats and outputs, or filters, the lines that qualify. It can carry out quite complex text operations without interaction and is widely used in shell scripts to complete all kinds of automated configuration tasks.

(1) Working principle:

awk reads the text line by line, splits each line into fields (separated by spaces or tabs by default), saves the fields in built-in variables, and executes the editing commands according to a pattern or condition.

sed is usually used to process whole lines, whereas awk tends to split a line into several "fields" before processing it. awk reads its input line by line, and the results can be displayed with the print function. Within an awk command, the logical operators "&&", "||", and "!" mean "and", "or", and "not". Simple arithmetic is also available: +, -, *, /, %, and ^ perform addition, subtraction, multiplication, division, remainder, and exponentiation, respectively.
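
A minimal sketch of these operators, run entirely in a BEGIN block so that no input file is needed. The numbers are arbitrary.

```shell
# Arithmetic operators: addition, subtraction, multiplication, division, remainder, exponentiation.
arith=$(awk 'BEGIN { print 7+2, 7-2, 7*2, 7/2, 7%2, 2^10 }')

# Logical operators evaluate to 1 (true) or 0 (false).
logic=$(awk 'BEGIN { a = (1 && 0); o = (1 || 0); n = !1; print a, o, n }')

printf '%s\n' "$arith" "$logic"
# prints:
# 9 5 14 3.5 1 1024
# 0 1 0
```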

(2) Format of command:

awk option 'pattern or condition {action}' file1 file2 ...
awk -f script_file file1 file2 ...

(3) The common built-in variables (available directly) for awk are as follows:

FS: Field separator. Specifies the field delimiter for each line of text; the default is spaces or tabs. It has the same effect as the -F option.
NF: The number of fields in the line being processed.
NR: The line number (ordinal) of the line being processed.
$0: The entire content of the line being processed.
$n: The nth field (column n) of the line being processed.
FILENAME: The name of the file being processed.
RS: Record separator. When awk reads data, it splits the data into records as defined by RS and processes one record at a time. The default value is '\n'.
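
The variables above can be seen together in a small sketch. The two colon-separated records are invented sample data, loosely shaped like /etc/passwd entries.

```shell
# Hypothetical colon-separated sample file created only for this example.
tmp=$(mktemp)
printf 'alice:1000:/bin/bash\nbob:1001:/sbin/nologin\n' > "$tmp"

# NR is the record number, NF the field count, $1 the first field, $NF the last field.
fields=$(awk -F ':' '{ print NR, NF, $1, $NF }' "$tmp")

# Setting FS in BEGIN has the same effect as -F; $0 is the whole line.
whole=$(awk 'BEGIN { FS = ":" } $2 == 1001 { print $0 }' "$tmp")

# FILENAME holds the name of the file currently being read.
name=$(awk 'NR == 1 { print FILENAME }' "$tmp")

# RS changes what counts as a record: here every colon ends a record.
records=$(printf 'a:b:c' | awk 'BEGIN { RS = ":" } END { print NR }')

printf '%s\n' "$fields"
rm -f "$tmp"
```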

(4) Examples:

1. Output text by line
Output everything

awk  '{print}'  testfile1         #Output everything
awk '{print $0}' testfile1        #Output everything

Output Specified Line Content

awk 'NR==1, NR==3{print}'  testfile1       #Output lines 1-3
awk  '(NR>=1) &&  (NR<=3)  {print}'  testfile1       #Output lines 1-3

awk  'NR==1 || NR==3 {print}'    testfile1       #Output lines 1 and 3

Output odd, even lines

awk '(NR%2)==1{print}'  testfile1      #Output odd lines
awk '(NR%2)==0{print}'  testfile1       #Output even lines

Output lines that begin or end with a given string

awk '/^root/{print}'  /etc/passwd        #Output lines beginning with root
awk '/nologin$/{print}'  /etc/passwd     #Output lines ending with nologin

Count the lines ending with a given string

awk 'BEGIN {x=0};/\/bin\/bash$/{x++};END {print x}' /etc/passwd
#Count the lines ending with /bin/bash; equivalent to
grep  -c "/bin/bash$"  /etc/passwd

In BEGIN mode, the action specified in the BEGIN block is executed before awk processes the specified text; awk then processes the text, and finally executes the action specified in the END block. The END {} block usually holds statements such as printing the result.
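
The order of BEGIN, the main block, and END can be sketched as follows. The three input lines and the variable name total are made up for illustration.

```shell
# Hypothetical input: a label and a number on each line.
result=$(printf 'a 1\nb 2\nc 3\n' | awk '
BEGIN { print "start"; total = 0 }          # runs once, before any input is read
{ total += $2 }                             # runs once for every input line
END { print "total=" total, "lines=" NR }   # runs once, after the last line
')

printf '%s\n' "$result"
# prints:
# start
# total=6 lines=3
```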

2. Output text by field

awk -F ":" '{print $3}' /etc/passwd          #Output the third field of each line, using ":" as the separator

awk -F ":" '{print $1,$3}' /etc/passwd       #Output the first and third fields of each line (using ":" as the separator)

awk -F ":" '$3<5{print $1,$3}' /etc/passwd   #Output the first and third fields of the lines whose third field is less than 5

awk -F ":" '!($3<200){print}' /etc/passwd    #Output the lines whose third field is not less than 200

awk 'BEGIN {FS=":"};{if ($3>=200){print}}'  /etc/passwd   #The BEGIN block runs first and sets the field separator to ":"; then each line is printed if its third field is greater than or equal to 200

awk -F ":" '{max=($3>$4)?$3:$4;{print max}}' /etc/passwd
#($3>$4)?$3:$4 is a ternary expression. If the third field is greater than the fourth field, the third field is assigned to max; otherwise the fourth field is assigned to max

awk -F ":" '{print NR,$0}' /etc/passwd        #Output the line number and content of each line; each time a record is processed, NR (the number of the current line) increases by 1

awk -F ":" '$7~"/bash"{print $1}' /etc/passwd    #Output the first field of the lines whose seventh field contains /bash, using ":" as the separator

awk -F ":"  '($1~"root")&&(NF==7){print $1,$2}' /etc/passwd    #Output the first and second fields of the lines whose first field contains root and that have seven fields (NF: number of fields in the current line)

awk -F ":" '($7!="/bin/bash")&&($7!="/sbin/nologin"){print}' /etc/passwd  #Output all lines whose seventh field is neither /bin/bash nor /sbin/nologin

3. Call shell commands with pipe symbols and double quotes

echo $PATH | awk 'BEGIN{RS=":"};END{print NR}'  #Count the colon-separated segments of text; the END {} block usually holds statements such as printing the result

awk -F ":" '/bash$/{print | "wc -l"}' /etc/passwd   
#Call the wc -l command to count the users who use bash (that is, the lines ending with bash); equivalent to
grep -c "bash$"  /etc/passwd

free -m | awk '/Mem:/ {print int($3/($3+$4)*100)"%"}'     #View the current memory usage percentage (int() truncates the result to an integer, i.e. drops the decimal part)

top -b -n 1 | grep Cpu | awk -F ',' '{print $4}' | awk '{print $1}'   
#View the current CPU idle rate (-b -n 1 means batch mode with a single iteration of output)
#The whole command means: output one snapshot of the process list (top -b -n 1); filter out the Cpu line (grep Cpu); split it on commas and print the fourth column (awk -F ',' '{print $4}'); then print the first value of that fourth column (awk '{print $1}')

date -d "$(awk -F "." '{print $1}' /proc/uptime) second ago" +"%F %H:%M:%S"
#Show the time of the last system boot, comparable to uptime: "second ago" means that many seconds ago, and +"%F %H:%M:%S" is equivalent to the +"%Y-%m-%d %H:%M:%S" time format

awk 'BEGIN {n=0 ; while ("w" | getline) n++ ; {print n-2}}'    
#Call the w command and use it to count the online users. The w command shows detailed information about the users currently logged in; getline reads its output one line at a time; n-2 is printed because the first two lines of w's output are headers, so they are subtracted.
[root@localhost ~]#w
 15:41:01 up 19:50,  1 user,  load average: 0.00, 0.01, 0.05
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    192.168.200.1    11:19    5.00s  0.12s  0.01s w

awk 'BEGIN {"hostname" | getline ; {print $0}}'     #Call the hostname command and output the current hostname
----------------------------------------------------------------------------------------------------------------------------------------------------------------
When getline has no redirection operator "<" or "|" on either side, it acts on the current input. awk has already read the first line, 1, so getline fetches the line after it, 2. Because getline updates NF, NR, FNR, $0, and so on, $0 is now 2 rather than 1, and 2 is what gets printed.
When getline has a redirection operator "<" or "|" on either side, it acts on the redirected input. That input has just been opened and awk has not yet read any line from it, so getline returns its first line rather than every other line.

seq 10 | awk '{getline; print $0}'         #Outputs the even lines
seq 10 | awk '{print $0; getline}'         #Outputs the odd lines
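
The two getline behaviors described above can be compared side by side in a rough sketch. The temporary file, its contents, and the variable names line, f and cmd are all made up for illustration, and the file path is assumed to contain no spaces.

```shell
# Hypothetical three-line file created only for this example.
tmp=$(mktemp)
printf 'first\nsecond\nthird\n' > "$tmp"

# Plain getline in a main rule: awk has already read a line, so getline moves to the next one.
evens=$(printf '1\n2\n3\n4\n5\n6\n' | awk '{ getline; print $0 }')

# getline with "<": reads from a freshly opened file, so the first call returns its first line.
first_line=$(awk -v f="$tmp" 'BEGIN { getline line < f; print line }')

# getline with "|": reads the output of a command one line at a time until it runs out.
count=$(awk -v f="$tmp" 'BEGIN { cmd = "cat " f; while ((cmd | getline line) > 0) n++; close(cmd); print n }')

printf '%s\n' "$evens" "$first_line" "$count"
rm -f "$tmp"
```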

4. CPU utilization

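cpu_us=$(top -b -n 1 | grep Cpu | awk '{print $2}')     #User CPU percentage
cpu_sy=$(top -b -n 1 | grep Cpu | awk -F ',' '{print $2}' | awk '{print $1}')     #System CPU percentage
cpu_sum=$(awk -v us="$cpu_us" -v sy="$cpu_sy" 'BEGIN {print us+sy}')     #The values are usually decimals, which $(( )) cannot add, so awk does the sum
echo $cpu_sum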

Tags: Linux regex perl

Posted on Tue, 14 Sep 2021 12:09:42 -0400 by varun8211