Chapter V common text processing tools

2. Common text processing tools

2.1 file content viewing command

2.1.1 viewing text file content

2.1.1.1 cat

cat displays the contents of text files

Format:

cat [OPTION]... [FILE]...

Common options

-E: Show a $ at the end of each line
-A: Show all control (non-printing) characters
-n: Number every line of output
-b: Number only non-blank lines
-s: Squeeze consecutive empty lines into one line

example:

[root@rocky8 ~]# cat win.txt 
a
b
c[root@rocky8 ~]# cat -A win.txt #-A view invisible characters 
a^M$
b^M$
c[root@rocky8 ~]#

[root@rocky8 ~]# cat >a.txt <<EOF
> a
> 
> 
> b
> 
> 
> c
> EOF
[root@rocky8 ~]# cat a.txt 
a


b


c
[root@rocky8 ~]# cat -n a.txt  #-n displays the line number
     1	a
     2	
     3	
     4	b
     5	
     6	
     7	c
[root@rocky8 ~]# cat -b a.txt  #-b numbers only non-blank lines; blank lines get no number
     1	a


     2	b


     3	c
[root@rocky8 ~]# cat -A a.txt 
a$
$
$
b$
$
$
c$
[root@rocky8 ~]# cat -s a.txt  #-s squeezes each run of consecutive empty lines into one
a

b

c
[root@rocky8 ~]# cat -As a.txt 
a$
$
b$
$
c$

example:

[root@rocky8 data]# vim fa.txt
a b
c 
d	b	c
[root@rocky8 ~]# cat -A fa.txt 
a b$
c $
d^Ib^Ic$
[root@rocky8 ~]# cat fa.txt 
a b
c 
d	b	c
[root@rocky8 ~]# cat fb.txt 
a
b
c
[root@rocky8 ~]# cat -A fb.txt 
a^M$
b^M$
c^M$
[root@rocky8 ~]# hexdump -C fb.txt 
00000000  61 0d 0a 62 0d 0a 63 0d  0a                       |a..b..c..|
00000009
[root@rocky8 ~]# file fb.txt 
fb.txt: ASCII text, with CRLF line terminators
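
The ^M that cat -A shows is a carriage return (0x0d), the extra byte in Windows-style CRLF line endings. A minimal sketch of stripping it with tr is shown below. The sample data is produced with printf instead of reading fb.txt, so the snippet is self-contained, and the variable names are only illustrative:

```shell
# Three lines with Windows (CRLF) line endings, made visible with cat -A.
before=$(printf 'a\r\nb\r\nc\r\n' | cat -A)

# tr -d '\r' deletes every carriage return, leaving Unix (LF) line endings.
after=$(printf 'a\r\nb\r\nc\r\n' | tr -d '\r' | cat -A)

echo "before:"
echo "$before"
echo "after:"
echo "$after"
```

After the conversion each line should end in a plain $ with no ^M in front of it. Note that cat -A is a GNU option, so this assumes a GNU/Linux system such as the one used in the examples above.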

2.1.1.2 nl

nl displays the file with line numbers. By default it numbers only non-blank lines, which is equivalent to cat -b

[root@rocky8 ~]# nl a.txt 
     1	a
       
       
     2	b
       
       
     3	c
[root@rocky8 ~]# cat -b a.txt 
     1	a


     2	b


     3	c
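
nl can also number blank lines. Its -b option selects which lines are numbered, and -b a means "all lines", which gives numbering comparable to cat -n. The following sketch, using a small printf-generated sample rather than a.txt, extracts the number assigned to the last line in both modes:

```shell
# Sample input: "a", an empty line, then "b".
# Default mode: the empty line is skipped, so "b" is numbered 2.
b_default=$(printf 'a\n\nb\n' | nl | tail -n 1 | cut -f1 | tr -d ' ')

# With -b a every line is numbered, so "b" is numbered 3.
b_all=$(printf 'a\n\nb\n' | nl -b a | tail -n 1 | cut -f1 | tr -d ' ')

echo "default: b is line $b_default"
echo "-b a:    b is line $b_all"
```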

2.1.1.3 tac

tac displays the lines of the text in reverse order, last line first

[root@rocky8 data]# seq 10 
1
2
3
4
5
6
7
8
9
10
[root@rocky8 data]# seq 10 |tac #tac prints the lines in reverse order
10
9
8
7
6
5
4
3
2
1

[root@rocky8 data]# tac
a
bb
ccc    #press Ctrl+D to end the input
ccc
bb
a

2.1.1.4 rev

rev reverses the characters within each line

[root@rocky8 data]# echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
[root@rocky8 data]# echo {a..z} | rev #rev reverses the characters of each line
z y x w v u t s r q p o n m l k j i h g f e d c b a

2.1.2 viewing the contents of non text files

2.1.2.1 hexdump

example:

[root@rocky8 data]# hexdump -C -n 512 /dev/sda
00000000  eb 63 90 10 8e d0 bc 00  b0 b8 00 00 8e d8 8e c0  |.c..............|
00000010  fb be 00 7c bf 00 06 b9  00 02 f3 a4 ea 21 06 00  |...|.........!..|
00000020  00 be be 07 38 04 75 0b  83 c6 10 81 fe fe 07 75  |....8.u........u|
00000030  f3 eb 16 b4 02 b0 01 bb  00 7c b2 80 8a 74 01 8b  |.........|...t..|
00000040  4c 02 cd 13 ea 00 7c 00  00 eb fe 00 00 00 00 00  |L.....|.........|
00000050  00 00 00 00 00 00 00 00  00 00 00 80 01 00 00 00  |................|
00000060  00 00 00 00 ff fa 90 90  f6 c2 80 74 05 f6 c2 70  |...........t...p|
00000070  74 02 b2 80 ea 79 7c 00  00 31 c0 8e d8 8e d0 bc  |t....y|..1......|
00000080  00 20 fb a0 64 7c 3c ff  74 02 88 c2 52 be 05 7c  |. ..d|<.t...R..||
00000090  b4 41 bb aa 55 cd 13 5a  52 72 3d 81 fb 55 aa 75  |.A..U..ZRr=..U.u|
000000a0  37 83 e1 01 74 32 31 c0  89 44 04 40 88 44 ff 89  |7...t21..D.@.D..|
000000b0  44 02 c7 04 10 00 66 8b  1e 5c 7c 66 89 5c 08 66  |D.....f..\|f.\.f|
000000c0  8b 1e 60 7c 66 89 5c 0c  c7 44 06 00 70 b4 42 cd  |..`|f.\..D..p.B.|
000000d0  13 72 05 bb 00 70 eb 76  b4 08 cd 13 73 0d 5a 84  |.r...p.v....s.Z.|
000000e0  d2 0f 83 de 00 be 85 7d  e9 82 00 66 0f b6 c6 88  |.......}...f....|
000000f0  64 ff 40 66 89 44 04 0f  b6 d1 c1 e2 02 88 e8 88  |d.@f.D..........|
00000100  f4 40 89 44 08 0f b6 c2  c0 e8 02 66 89 04 66 a1  |.@.D.......f..f.|
00000110  60 7c 66 09 c0 75 4e 66  a1 5c 7c 66 31 d2 66 f7  |`|f..uNf.\|f1.f.|
00000120  34 88 d1 31 d2 66 f7 74  04 3b 44 08 7d 37 fe c1  |4..1.f.t.;D.}7..|
00000130  88 c5 30 c0 c1 e8 02 08  c1 88 d0 5a 88 c6 bb 00  |..0........Z....|
00000140  70 8e c3 31 db b8 01 02  cd 13 72 1e 8c c3 60 1e  |p..1......r...`.|
00000150  b9 00 01 8e db 31 f6 bf  00 80 8e c6 fc f3 a5 1f  |.....1..........|
00000160  61 ff 26 5a 7c be 80 7d  eb 03 be 8f 7d e8 34 00  |a.&Z|..}....}.4.|
00000170  be 94 7d e8 2e 00 cd 18  eb fe 47 52 55 42 20 00  |..}.......GRUB .|
00000180  47 65 6f 6d 00 48 61 72  64 20 44 69 73 6b 00 52  |Geom.Hard Disk.R|
00000190  65 61 64 00 20 45 72 72  6f 72 0d 0a 00 bb 01 00  |ead. Error......|
000001a0  b4 0e cd 10 ac 3c 00 75  f4 c3 00 00 00 00 00 00  |.....<.u........|
000001b0  00 00 00 00 00 00 00 00  7d 50 d7 43 00 00 80 04  |........}P.C....|
000001c0  01 04 83 fe c2 ff 00 08  00 00 00 00 20 00 00 fe  |............ ...|
000001d0  c2 ff 83 fe c2 ff 00 08  20 00 00 00 80 0c 00 fe  |........ .......|
000001e0  c2 ff 83 fe c2 ff 00 08  a0 0c 00 00 40 06 00 fe  |............@...|
000001f0  c2 ff 05 fe c2 ff 00 08  e0 12 00 f8 1f 06 55 aa  |..............U.|
00000200

[root@rocky8 data]# echo abc | hexdump -C
00000000  61 62 63 0a                                       |abc.|
00000004

[root@rocky8 data]# echo {a..z} | tr -d ' '|hexdump -C
00000000  61 62 63 64 65 66 67 68  69 6a 6b 6c 6d 6e 6f 70  |abcdefghijklmnop|
00000010  71 72 73 74 75 76 77 78  79 7a 0a                 |qrstuvwxyz.|
0000001b

2.1.2.2 od

od dumps files in octal and other formats

example:

[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x
0000000 64636261 68676665 6c6b6a69 706f6e6d
0000020 74737271 78777675 000a7a79
0000033

[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x1
0000000 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70
0000020 71 72 73 74 75 76 77 78 79 7a 0a
0000033

[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x1z
0000000 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70  >abcdefghijklmnop<
0000020 71 72 73 74 75 76 77 78 79 7a 0a                 >qrstuvwxyz.<
0000033
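
In the first od example above, -t x groups the input into 4-byte units and prints each unit as one number in the host's byte order, which is why the bytes appear reversed (64636261 for "abcd") on a little-endian machine. -t x1 prints one byte at a time and keeps the file order. A small sketch, assuming the commonly available -A n option to suppress the offset column:

```shell
# -An drops the offset column; -tx1 prints each byte as a two-digit hex value.
# tr removes the padding so the result is easy to compare.
hex_bytes=$(printf 'a\tb\n' | od -An -tx1 | tr -d ' \n')
echo "$hex_bytes"

# -c shows printable characters and backslash escapes instead of hex.
printf 'a\tb\n' | od -An -c
```

The first command prints 6109620a: the letter a (61), a tab (09), the letter b (62), and a newline (0a).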

2.1.2.3 xxd

[root@rocky8 data]# echo {a..z} | tr -d ' '|xxd
00000000: 6162 6364 6566 6768 696a 6b6c 6d6e 6f70  abcdefghijklmnop
00000010: 7172 7374 7576 7778 797a 0a              qrstuvwxyz.

2.2 view file contents in pages

2.2.1 more

more displays a file one page at a time. It can also be used in a pipeline to page the output of other commands

format

more [OPTIONS...] FILE...

Options:

-d: Display a prompt explaining how to continue and how to quit

example:

[root@rocky8 data]# ls -R /etc | more
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
--More-- #press q to quit

[root@rocky8 data]# ls -R /etc | more -d
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
--More--[Press space to continue, 'q' to quit.]

2.2.2 less

less also pages through files or STDIN output. The less command is the pager used by the man command

Useful commands for viewing include:

/text   Search forward for text
?text   Search backward for text
n/N     Jump to the next or previous match

example:

[root@rocky8 data]# ls -R /etc |less
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
:

example:

#less can be used in a pipeline to page the output of other commands
[root@rocky8 data]# tree -d /etc |less
/etc
├── alternatives
├── audit
│   ├── plugins.d
│   └── rules.d
├── authselect
│   └── custom
├── bash_completion.d
├── binfmt.d
├── chkconfig.d
├── cifs-utils
├── cron.d
├── cron.daily
├── cron.hourly
├── cron.monthly
├── cron.weekly
├── crypto-policies
│   ├── back-ends
│   ├── local.d
│   ├── policies
│   │   └── modules
│   └── state
├── dbus-1
│   ├── session.d
│   └── system.d
├── default
: #press q to quit

2.3 display the content before or after the text

2.3.1 head

head displays the first lines of a file or of standard input (10 lines by default)

Format:

head [OPTION]... [FILE]...

Options:

-c # Output the first # bytes
-n # Output the first # lines; with -n -#, output everything except the last # lines
-# Same as -n #

example:

[root@rocky8 data]# seq 100 |head #head displays the first 10 lines of the file by default
1
2
3
4
5
6
7
8
9
10 
[root@rocky8 data]# seq 100 |head -n 3 #-n specify how many rows
1
2
3
[root@rocky8 data]# echo {a..z} 
a b c d e f g h i j k l m n o p q r s t u v w x y z
[root@rocky8 data]# echo {a..z} | head -c 5 #-c takes the first 5 bytes
a b c[root@rocky8 data]#

[root@rocky8 data]# cat /dev/urandom | tr -dc '[:alnum:]' | head -c10 #Take out the first 10 random characters
1HL73ArcyO

[root@rocky8 data]# cat /dev/urandom | tr -dc '[:alnum:]' | head -c10 | tee pass.txt | passwd --stdin raymond #Generate a random password, save it in pass.txt, and set it as the account's password
Changing password for user raymond.
passwd: all authentication tokens updated successfully.
[root@rocky8 data]# cat pass.txt 
KWENpwZmNs
[root@rocky8 data]# su - boss
Last login: Wed Oct  6 22:15:25 CST 2021 on pts/0
[boss@rocky8 ~]$ su - raymond
Password: 
Last login: Thu Oct  7 20:09:43 CST 2021 on pts/0
[raymond@rocky8 ~]$ exit
logout
[boss@rocky8 ~]$ exit
logout
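
The password pipeline above can be written without the extra cat, and the result can be captured in a variable for later use. The sketch below is one possible variant, not the only way to do it. Setting LC_ALL=C is an added assumption here: it makes tr treat the random input as raw bytes, which avoids "illegal byte sequence" errors that some systems raise in multibyte locales. The variable name password is only illustrative.

```shell
# Read random bytes, keep only letters and digits, stop after 10 characters.
password=$(LC_ALL=C tr -dc '[:alnum:]' < /dev/urandom | head -c 10)

# Show the length rather than the password itself.
echo "generated a password of length ${#password}"
```

head exits after 10 bytes, which closes the pipe and stops tr, so the command finishes even though /dev/urandom never ends. To set the password, the value could then be passed to passwd --stdin as in the transcript above; that option is provided by the passwd shipped with Red Hat family distributions such as Rocky Linux and may be missing elsewhere.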

example:

[root@rocky8 ~]# head -n 3 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

[root@rocky8 ~]# head -3 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

[root@rocky8 ~]# echo a I b | head -c4
a I

[root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10
G755MlZatW[root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10
ASsax6DeBz[root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10 | tee pass.txt | passwd --stdin mage
Changing password for user mage.
passwd: all authentication tokens updated successfully.
[root@centos8 ~]#cat pass.txt
AGT952Essg[root@centos8 ~]#su - wang
[wang@centos8 ~]$su - mage
Password:

[root@rocky8 ~]# seq 10 |head -n 3 #Take the first 3
1
2
3
[root@rocky8 ~]# seq 10 |head -n -3 #Do not take the last 3
1
2
3
4
5
6
7

2.3.2 tail

tail is the counterpart of head: it displays the last lines of a file or of standard input (10 lines by default)

Format:

tail [OPTION]... [FILE]...

Options:

-c # Output the last # bytes
-n # Output the last # lines; with -n +#, output from line # to the end
-# Same as -n #
-f Follow data appended to the file, tracking it by file descriptor; commonly used for log monitoring; equivalent to --follow=descriptor. If the file is deleted and a new file with the same name is created, tail does not follow the new file
-F Follow the file by name; equivalent to --follow=name --retry. If the file is deleted and a new file with the same name is created, tail continues to follow it

example:

[root@rocky8 ~]# seq 20 | tail
11
12
13
14
15
16
17
18
19
20
[root@rocky8 ~]# seq 20 | tail -n 3
18
19
20
[root@rocky8 ~]# seq 20 | tail -n +3  #+3 means from line 3 to the end
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
[root@rocky8 ~]# seq 20 | tail -n -3 #-3 means the last 3 lines
18
19
20

example:

[root@rocky8 ~]# tail -3 /var/log/messages 
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Default.
Oct  7 20:13:37 rocky8 systemd[2857]: Startup finished in 34ms.
Oct  7 20:13:37 rocky8 systemd[1]: Started User Manager for UID 0.

[root@rocky8 ~]# tail -f /var/log/messages  #tail -f trace log information
Oct  7 20:13:37 rocky8 systemd-logind[854]: New session 7 of user root.
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Timers.
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Paths.
Oct  7 20:13:37 rocky8 systemd[2857]: Starting D-Bus User Message Bus Socket.
Oct  7 20:13:37 rocky8 systemd[2857]: Listening on D-Bus User Message Bus Socket.
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Sockets.
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Basic System.
Oct  7 20:13:37 rocky8 systemd[2857]: Reached target Default.
Oct  7 20:13:37 rocky8 systemd[2857]: Startup finished in 34ms.
Oct  7 20:13:37 rocky8 systemd[1]: Started User Manager for UID 0.

#View only the latest logs
[root@rocky8 ~]# tail -fn0 /var/log/messages
[root@rocky8 ~]# tail -0f /var/log/messages

[root@rocky8 ~]# ifconfig | head -2 | tail -1
-bash: ifconfig: command not found #ifconfig is provided by the net-tools package, which is not installed yet
[root@rocky8 ~]# yum -y install net-tools
[root@rocky8 ~]# ifconfig | head -2 | tail -1
        inet 172.31.1.8  netmask 255.255.248.0  broadcast 172.31.7.255

#Select line 6
[root@rocky8 ~]# seq 20| head -n 6|tail -n1
6
[root@rocky8 ~]# seq 20| tail -n +6 |head -n1
6
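
The two "select line 6" pipelines generalize to any range of lines. The helper below is a hypothetical function (print_lines is not a standard command) sketched under the assumption that START and END are positive integers with START not greater than END:

```shell
# print_lines START END: print lines START..END (inclusive) of standard input.
print_lines() {
    start=$1
    end=$2
    # tail -n +START skips ahead to line START;
    # head then keeps END-START+1 lines from that point.
    tail -n "+$start" | head -n "$((end - start + 1))"
}

# Lines 6 through 8 of the numbers 1..20, joined with commas for display.
range=$(seq 20 | print_lines 6 8 | paste -sd, -)
echo "$range"
```

This prints 6,7,8. Placing tail first and head second means head can stop reading as soon as it has enough lines.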

2.4 extract text cut by column

The cut command extracts the specified columns from a text file or from STDIN data

format

cut [OPTION]... [FILE]...

option

-d DELIMITER: Specify the field delimiter; the default is TAB
-f FIELDS:
    #: the #th field, for example: 3
    #,#[,#]: multiple discrete fields, for example: 1,3,6
    #-#: multiple consecutive fields, for example: 1-6
    Mixed use: 1-3,7
-c Cut by character
--output-delimiter=STRING Specify the output delimiter
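
The field list syntax above can be tried on a single sample record. The sketch below uses a hard-coded line in /etc/passwd format (the variable names are illustrative) and also shows one behavior that is easy to miss: a line that does not contain the delimiter is printed unchanged unless -s is given.

```shell
record='root:x:0:0:root:/root:/bin/bash'

# Mixed list: field 1, fields 3 through 5, and field 7.
picked=$(printf '%s\n' "$record" | cut -d: -f1,3-5,7)

# -c selects by character position instead of by field.
first4=$(printf '%s\n' "$record" | cut -c1-4)

# A line without the delimiter is passed through whole...
no_delim=$(printf 'no delimiter here\n' | cut -d: -f2)
# ...unless -s (only delimited lines) suppresses it.
suppressed=$(printf 'no delimiter here\n' | cut -s -d: -f2)

echo "$picked"
echo "$first4"
echo "$no_delim"
echo "[$suppressed]"
```

The four echo commands print root:0:0:root:/bin/bash, then root, then the untouched line, then empty brackets.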

example:

[root@rocky8 ~]# cd /data
[root@rocky8 data]# cp /etc/passwd .
[root@rocky8 data]# cut -d ":" -f 1,3-5 passwd #-d specifies the delimiter, -f specifies which fields to extract
root:0:0:root
bin:1:1:bin
daemon:2:2:daemon
adm:3:4:adm
lp:4:7:lp
sync:5:0:sync
shutdown:6:0:shutdown
halt:7:0:halt
mail:8:12:mail
operator:11:0:operator
games:12:100:games
ftp:14:50:FTP User
nobody:65534:65534:Kernel Overflow User
dbus:81:81:System message bus
systemd-coredump:999:997:systemd Core Dumper
systemd-resolve:193:193:systemd Resolver
tss:59:59:Account used for TPM access
polkitd:998:996:User for polkitd
unbound:997:994:Unbound DNS resolver
sssd:996:993:User for sssd
sshd:74:74:Privilege-separated SSH
postfix:89:89:
raymond:1000:1000:
boss:1001:1001:

[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320396 102486004   3% /
/dev/sda3       52403200  398416  52004784   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
[root@rocky8 data]# df |cut -c 43-46 #-c take by character
 Use
   0
   0
   2
   0
   3
   1
  19
   0

[root@rocky8 data]# df |tr -s " "
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 382688 0 382688 0% /dev
tmpfs 400580 0 400580 0% /dev/shm
tmpfs 400580 5692 394888 2% /run
tmpfs 400580 0 400580 0% /sys/fs/cgroup
/dev/sda2 104806400 2320396 102486004 3% /
/dev/sda3 52403200 398416 52004784 1% /data
/dev/sda1 1038336 191796 846540 19% /boot
tmpfs 80116 0 80116 0% /run/user/0
[root@rocky8 data]# df |tr -s " "|cut -d " " -f5
Use%
0%
0%
2%
0%
3%
1%
19%
0%
[root@rocky8 data]# df |tr -s " "|cut -d " " -f5 |tr -d %
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2
0
0
2
0
3
1
19
0

example:

[root@rocky8 data]# ifconfig |head -n2 |tail -n1|cut -d" " -f10
172.31.1.8
[root@rocky8 data]# ifconfig |head -n2 |tail -n1|tr -s " " |cut -d " " -f3
172.31.1.8
[root@rocky8 data]# df | tr -s ' '|cut -d' ' -f5 |tr -dc "[0-9\n]"

0
0
2
0
3
1
19
0
[root@rocky8 data]# df | tr -s ' ' % |cut -d% -f5 |tr -d '[:alpha:]'

0
0
2
0
3
1
19
0
[root@rocky8 data]# df | cut -c44-46 |tr -d '[:alpha:]'

  0
  0
  2
  0
  3
  1
 19
  0
[root@rocky8 data]# cut -d: -f1,3,7 --output-delimiter="---" /etc/passwd
root---0---/bin/bash
bin---1---/sbin/nologin
daemon---2---/sbin/nologin
adm---3---/sbin/nologin
lp---4---/sbin/nologin
sync---5---/bin/sync
shutdown---6---/sbin/shutdown
halt---7---/sbin/halt
mail---8---/sbin/nologin
operator---11---/sbin/nologin
games---12---/sbin/nologin
ftp---14---/sbin/nologin
nobody---65534---/sbin/nologin
dbus---81---/sbin/nologin
systemd-coredump---999---/sbin/nologin
systemd-resolve---193---/sbin/nologin
tss---59---/sbin/nologin
polkitd---998---/sbin/nologin
unbound---997---/sbin/nologin
sssd---996---/sbin/nologin
sshd---74---/sbin/nologin
postfix---89---/sbin/nologin
raymond---1000---/bin/bash
boss---1001---/bin/bash

Example: get partition utilization

#Take partition utilization
[root@rocky8 data]# df|tr -s ' ' |cut -d' ' -f5 |tr -d %
Use
0
0
2
0
3
1
19
0

[root@rocky8 data]# df|tr -s ' ' '%'|cut -d% -f5
Use
0
0
2
0
3
1
19
0

[root@rocky8 data]# df |cut -c 44-46|tail -n +2
  0
  0
  2
  0
  3
  1
 19
  0

[root@rocky8 data]# df | tail -n +2|tr -s ' ' % |cut -d% -f5
0
0
2
0
3
1
19
0

[root@rocky8 data]# df | tail -n +2|tr -s ' ' |cut -d' ' -f5 |tr -d %
0
0
2
0
3
1
19
0

2.5 merge multiple files

paste merges the lines with the same line number from multiple files into one line

format

paste [OPTION]... [FILE]...

Common options:

-d DELIMITER: Specify the delimiter; the default is TAB
-s: Merge all lines of each file into a single line

example:

[root@rocky8 data]# seq 100 >c.txt
[root@rocky8 data]# paste -s c.txt >f.txt
[root@rocky8 data]# cat f.txt
1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32	33	34	35	36	37	38	39	40	41	42	43	44	45	46	47	48	49	50	51	52	53	54	55	56	57	58	59	60	61	62	63	64	65	66	67	68	69	70	71	72	73	74	75	76	77	78	79	80	81	82	83	84	85	86	87	88	89	90	91	92	93	94	95	96	97	98	99	100
[root@rocky8 data]# cat -A f.txt
1^I2^I3^I4^I5^I6^I7^I8^I9^I10^I11^I12^I13^I14^I15^I16^I17^I18^I19^I20^I21^I22^I23^I24^I25^I26^I27^I28^I29^I30^I31^I32^I33^I34^I35^I36^I37^I38^I39^I40^I41^I42^I43^I44^I45^I46^I47^I48^I49^I50^I51^I52^I53^I54^I55^I56^I57^I58^I59^I60^I61^I62^I63^I64^I65^I66^I67^I68^I69^I70^I71^I72^I73^I74^I75^I76^I77^I78^I79^I80^I81^I82^I83^I84^I85^I86^I87^I88^I89^I90^I91^I92^I93^I94^I95^I96^I97^I98^I99^I100$
[root@rocky8 data]# cat f.txt | tr '\t' '\n'

example:

[root@rocky8 data]# echo {a..h} | tr -s ' ' '\n' > alpha.log
[root@rocky8 data]# cat alpha.log
a
b
c
d
e
f
g
h
[root@rocky8 data]# seq 5 > seq.log
[root@rocky8 data]# cat seq.log
1
2
3
4
5

[root@rocky8 data]# cat alpha.log seq.log
a
b
c
d
e
f
g
h
1
2
3
4
5

[root@rocky8 data]# paste alpha.log seq.log
a	1
b	2
c	3
d	4
e	5
f	
g	
h	
[root@rocky8 data]# paste -d":" alpha.log seq.log
a:1
b:2
c:3
d:4
e:5
f:
g:
h:

[root@rocky8 data]# paste -s seq.log
1	2	3	4	5
[root@rocky8 data]# paste -s alpha.log
a	b	c	d	e	f	g	h
[root@rocky8 data]# paste -s alpha.log seq.log
a	b	c	d	e	f	g	h
1	2	3	4	5

[root@rocky8 data]# cat > title.txt <<EOF
> ceo
> cto
> coo
> EOF
[root@rocky8 data]# cat title.txt 
ceo
cto
coo

[root@rocky8 data]# cat > emp.txt <<EOF
> zhang
> wang
> li
> zhao
> EOF
[root@rocky8 data]# cat emp.txt 
zhang
wang
li
zhao
[root@rocky8 data]# paste title.txt emp.txt
ceo	zhang
cto	wang
coo	li
	zhao
[root@rocky8 data]# paste -s title.txt emp.txt
ceo	cto	coo
zhang	wang	li	zhao

[root@rocky8 data]# seq 100|paste -d + -s|bc
5050
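
The last command works because paste -s joins all input lines into one, and -d + puts a plus sign between them, producing the expression 1+2+...+100 for bc to evaluate. The sketch below shows the intermediate expression and evaluates it with shell arithmetic instead of bc, which is a substitution made here so the snippet does not depend on bc being installed. It also shows paste reading standard input several times to arrange a list into columns.

```shell
# Build the expression 1+2+...+100 on a single line.
expression=$(seq 100 | paste -sd+ -)

# Let the shell evaluate the expanded expression.
total=$(( $expression ))
echo "$total"

# Each "-" means "read the next line from standard input",
# so three of them arrange the input into three columns.
rows=$(seq 6 | paste -d, - - -)
echo "$rows"
```

The first echo prints 5050, and the second prints 1,2,3 followed by 4,5,6 on the next line.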

2.6 tools for analyzing text

Count text data: wc
Sort text: sort
Compare files: diff and patch

2.6.1 collect text statistics wc

The wc command can be used to count the total number of lines, words, bytes and characters in a file

It can count the data in a file or from STDIN

Common options

-l Count only lines
-w Count only words
-c Count only bytes
-m Count only characters
-L Display the length of the longest line

example:

[root@rocky8 data]# wc c.txt
100 100 292 c.txt
#lines  words  bytes
[root@rocky8 data]# ll c.txt
-rw-r--r-- 1 root root 292 Oct  7 20:34 c.txt
[root@rocky8 data]# wc -l c.txt #wc -l displays only the number of lines
100 c.txt
[root@rocky8 data]# cat /etc/passwd |wc -l
24
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2 
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2 |wc -l 
8

[root@rocky8 data]# cat > title.txt <<EOF
> ceo zhang
> coo wang
> cto li
> EOF
[root@rocky8 data]# cat > title1.txt <<EOF
> ceo zhang
> coo wang
> cto Lao Li
> EOF
[root@rocky8 data]# ll title.txt title1.txt 
-rw-r--r-- 1 root root 30 Oct  7 20:51 title1.txt
-rw-r--r-- 1 root root 26 Oct  7 20:50 title.txt
[root@rocky8 data]# wc title.txt 
 3  6 26 title.txt
[root@rocky8 data]# wc title1.txt 
 3  7 30 title1.txt

[root@rocky8 data]# wc -l title.txt 
3 title.txt
[root@rocky8 data]# cat title.txt | wc -l
3

[root@rocky8 data]# df | tail -n $(echo `df | wc -l`-1|bc)
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda3       52403200  398444  52004756   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
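
When wc reads from standard input it prints only the counts, with no file name, which makes the result easy to store in a variable. A minimal sketch using a temporary file with the same three lines as title.txt (the temporary file and variable names are assumptions for illustration):

```shell
tmp=$(mktemp)
printf 'ceo zhang\ncoo wang\ncto li\n' > "$tmp"

# Redirecting the file into wc keeps the file name out of the output.
lines=$(wc -l < "$tmp")
words=$(wc -w < "$tmp")
bytes=$(wc -c < "$tmp")

echo "lines=$lines words=$words bytes=$bytes"
rm -f "$tmp"
```

For this input the counts are 3 lines, 6 words, and 26 bytes. As for the last command above, df | tail -n +2 skips the header line as well, without counting lines first.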

2.6.2 text sort

sort writes the sorted text to STDOUT without changing the original file

Format:

sort [options] file(s)

Common options

-r Sort in reverse (descending) order
-R Sort in random order
-n Sort by numeric value
-f Ignore (fold) character case when comparing
-u Unique: merge duplicate lines, i.e. remove duplicates
-t c Use c as the field delimiter
-k # Sort by field #, as delimited by c; can be given multiple times
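
Because -k can be given more than once, a second key can break ties left by the first. One detail worth knowing: -k2 alone means "from field 2 to the end of the line", while -k2,2 restricts the key to field 2 only. The sketch below uses made-up name:number records to sort by the number in descending order and then by name:

```shell
# Primary key: field 2, numeric, reversed (largest first).
# Secondary key: field 1, default character order, used only for ties.
sorted=$(printf '%s\n' 'c:3' 'a:4' 'b:3' | sort -t: -k2,2nr -k1,1)
echo "$sorted"
```

The n and r letters attached to a key apply to that key alone, so the tie between b:3 and c:3 is resolved in ascending order and the output is a:4, b:3, c:3.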

example:

[root@rocky8 data]# sort /etc/passwd #sort sorts by character by default
adm:x:3:4:adm:/var/adm:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
boss:x:1001:1001::/home/boss:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
raymond:x:1000:1000::/home/raymond:/bin/bash
root:x:0:0:root:/root:/bin/bash
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin

[root@rocky8 data]# sort -t: -k3 /etc/passwd #-t specifies the delimiter, -k specifies the field; the comparison is by character by default
root:x:0:0:root:/root:/bin/bash
raymond:x:1000:1000::/home/raymond:/bin/bash
boss:x:1001:1001::/home/boss:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin

[root@rocky8 data]# sort -t: -k3 -n /etc/passwd #-n in numerical order
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
raymond:x:1000:1000::/home/raymond:/bin/bash
boss:x:1001:1001::/home/boss:/bin/bash
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin

[root@rocky8 data]# sort -t: -k3 -nr /etc/passwd #-r reverse order
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
boss:x:1001:1001::/home/boss:/bin/bash
raymond:x:1000:1000::/home/raymond:/bin/bash
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
sync:x:5:0:sync:/sbin:/bin/sync
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
root:x:0:0:root:/root:/bin/bash

[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda3       52403200  398444  52004756   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2 |sort -nr
19
3
2
1
0
0
0
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2 |sort -nr |head -1
19

[root@rocky8 data]# cat >aa.txt <<EOF
> a
> b
> a
> c
> a
> c
> b
> EOF
[root@rocky8 data]# sort aa.txt
a
a
a
b
b
c
c
[root@rocky8 data]# sort -u aa.txt #-u removes duplicate lines
a
b
c

example:

[root@rocky8 data]# cut -d: -f1,3 /etc/passwd|sort -t: -k2 -nr |head -n3
nobody:65534
boss:1001
raymond:1000

#Count the distinct client addresses in the access log
[root@rocky8 data]# cut -d" " -f1 access_log |sort -u|wc -l
201

Example: partition utilization statistics

[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda3       52403200  407316  51995884   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0

#View the highest partition utilization value
[root@rocky8 data]# df| tr -s ' ' '%'|cut -d% -f5|sort -nr|head -1
19

[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort

0
0
0
0
1
19
2
3
[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n

0
0
0
0
1
2
3
19
[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n |tail -n1
19

[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr
15
5
1
1
1
0
0
0

[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr |head -1
19
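
The pipelines above can be wrapped in a small function so the same logic is reusable. max_use_percent is a hypothetical name, and the sketch assumes df-style input in which the Use% value is the fifth space-separated column and no file system name contains spaces. It is tested here on a fixed sample taken from the output above rather than on live df output.

```shell
# max_use_percent: read df-style output on stdin, print the highest Use% value.
max_use_percent() {
    # Skip the header, squeeze spaces, take column 5, drop "%",
    # sort numerically in descending order, and keep the first line.
    tail -n +2 | tr -s ' ' | cut -d' ' -f5 | tr -d '%' | sort -nr | head -n 1
}

sample='Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda1        1038336  191796    846540  19% /boot'

highest=$(printf '%s\n' "$sample" | max_use_percent)
echo "$highest"
```

For the sample this prints 19. On a real system the function would be used as df | max_use_percent.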

Interview question: there are two files, a.txt and b.txt. Merge the two files and ensure that each number is unique when outputting

#a.txt (each number is unique within this file)
200
100
34556
23
...

#b.txt (each number is unique within this file)
123
43
200
3321
...

#Expected result: after merging the two files, lines that appear in both are removed entirely, without keeping a copy
100
34556
23
123
43
3321
...
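
One possible answer, sketched under the assumption stated in the question that each number is unique within its own file: after the merged data is sorted, a number present in both files forms a pair of adjacent identical lines, and uniq -u (covered in the next section) prints only the lines that are not repeated. The sample files below contain just the numbers shown above, without the elided part.

```shell
workdir=$(mktemp -d)
printf '%s\n' 200 100 34556 23 > "$workdir/a.txt"
printf '%s\n' 123 43 200 3321 > "$workdir/b.txt"

# sort -n merges and orders both files numerically;
# uniq -u then drops every number that occurs more than once.
merged=$(sort -n "$workdir/a.txt" "$workdir/b.txt" | uniq -u)
echo "$merged"

rm -rf "$workdir"
```

With the sample data, 200 disappears and the remaining six numbers are printed in ascending order. If the intent were instead to keep one copy of each duplicated number, sort -nu on both files would do that.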

2.6.3 remove duplicate lines uniq

The uniq command removes contiguous duplicate lines from the input

Format:

uniq [OPTION]... [FILE]...

Common options:

-c: Prefix each line with the number of times it occurs
-d: Show only duplicated lines
-u: Show only lines that are not repeated

uniq is often used with the sort command:

example:

[root@rocky8 data]# cat > aa.txt <<EOF
> a
> b
> a
> a
> c
> c
> a
> c
> b
> b
> EOF
[root@rocky8 data]# cat aa.txt 
a
b
a
a
c
c
a
c
b
b
[root@rocky8 data]# uniq aa.txt  #uniq removes adjacent duplicate lines
a
b
a
c
a
c
b
[root@rocky8 data]# uniq -c aa.txt #-c shows how many times each line is repeated consecutively
      1 a
      1 b
      2 a
      2 c
      1 a
      1 c
      2 b

[root@rocky8 data]# cat aa.txt 
a
b
a
a
c
c
a
c
b
b
[root@rocky8 data]# sort aa.txt 
a
a
a
a
b
b
b
c
c
c
[root@rocky8 data]# sort aa.txt | uniq  -c #Count how many times each line occurs
      4 a
      3 b
      3 c
[root@rocky8 data]#  sort aa.txt |uniq -c|sort -nr
      4 a
      3 c
      3 b
[root@rocky8 data]#  sort aa.txt |uniq -c|sort -nr | head -1
      4 a

example:

sort userlist.txt | uniq -c

Example: find the client addresses with the most requests in the access log

[root@rocky8 data]# cut -d" " -f1 access_log |sort |uniq -c|sort -nr |head -3
   4870 172.20.116.228
   3429 172.20.116.208
   2834 172.20.0.222

[root@10-9-24-182 ~]# lastb |tr -s ' ' |cut -d ' ' -f3 |sort |uniq -c |sort -nr |head -3
  34096 113.141.66.163
  24460 222.186.10.188
  16449 119.118.20.161

Example: remote host IP with the most concurrent connections

[root@rocky8 data]# ss -nt |tail -n +2 |tr -s ' ' : |cut -d: -f6 |sort |uniq -c |sort -nr |head -2
	7 10.0.0.1
	2 10.0.0.7

Example: get the lines that two files have in common and the lines that differ

[root@rocky8 data]# cat > test1.txt <<EOF
> a
> b
> 1
> c
> EOF
[root@rocky8 data]# cat test1.txt
a
b
1
c

[root@rocky8 data]# cat > test2.txt <<EOF
> b
> e
> f
> c
> 1
> 2
> EOF
[root@rocky8 data]# cat test2.txt 
b
e
f
c
1
2

#Get the lines common to both files
[root@rocky8 data]# cat test1.txt test2.txt | sort |uniq -d
1
b
c

#Get the lines that appear in only one of the files
[root@rocky8 data]# cat test1.txt test2.txt | sort |uniq -u
2
a
e
f
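
The uniq -u result above mixes lines from both files. A common trick for telling them apart, sketched here under the assumption that neither file contains duplicate lines of its own, is to list one file twice: every line of that file is then guaranteed to be repeated, so only the lines unique to the other file survive.

```shell
workdir=$(mktemp -d)
printf '%s\n' a b 1 c > "$workdir/test1.txt"
printf '%s\n' b e f c 1 2 > "$workdir/test2.txt"

# test2.txt is listed twice, so all of its lines are duplicated
# and only the lines found in test1.txt alone remain.
only_in_test1=$(cat "$workdir/test1.txt" "$workdir/test2.txt" "$workdir/test2.txt" | sort | uniq -u)

# The same idea in the other direction.
only_in_test2=$(cat "$workdir/test1.txt" "$workdir/test1.txt" "$workdir/test2.txt" | sort | uniq -u)

echo "only in test1.txt: $(printf '%s\n' "$only_in_test1" | paste -sd, -)"
echo "only in test2.txt: $(printf '%s\n' "$only_in_test2" | paste -sd, -)"

rm -rf "$workdir"
```

For these two files the first result is a and the second is 2,e,f.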

2.6.4 comparing files

2.6.4.1 diff

The diff command compares two files and reports the differences between them. The output of the diff command can be saved to a file, which is called a "patch"

Common options

-u Output in "unified" diff format, which is best suited for patch files

example:

[root@rocky8 data]# cat > f1.txt <<EOF
> zhang
> wang
> li
> zhao
> EOF
[root@rocky8 data]# cat f1.txt 
zhang
wang
li
zhao
[root@rocky8 data]# cat > f2.txt <<EOF
> zhangliang
> wangsir
> li
> zhao
> gao
> EOF
[root@rocky8 data]# cat f2.txt 
zhangliang
wangsir
li
zhao
gao

[root@rocky8 data]# diff f1.txt f2.txt 
1,2c1,2
< zhang
< wang
---
> zhangliang
> wangsir
4a5
> gao

[root@rocky8 data]# diff -u f1.txt f2.txt 
--- f1.txt	2021-10-07 21:30:15.134764613 +0800
+++ f2.txt	2021-10-07 21:31:20.962768219 +0800
@@ -1,4 +1,5 @@
-zhang
-wang
+zhangliang
+wangsir
 li
 zhao
+gao

[root@rocky8 data]# diff -u f1.txt f2.txt > f.patch
[root@rocky8 data]# rm -f f2.txt
[root@rocky8 data]# patch -b f1.txt f.patch #Apply the patch to f1.txt so that its content becomes that of the deleted f2.txt; -b first backs up the original f1.txt as f1.txt.orig
patching file f1.txt
[root@rocky8 data]# cat f1.txt
zhangliang
wangsir
li
zhao
gao
[root@rocky8 data]# cat f1.txt.orig 
zhang
wang
li
zhao
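
In scripts, the exit status of diff is often more useful than its output: it is 0 when the files are identical, 1 when they differ, and 2 when an error occurred. The sketch below recreates f1.txt and f2.txt in a temporary directory (an assumption made so that it does not touch the files used above), tests the exit status with -q, and counts the lines that the unified format marks as added.

```shell
workdir=$(mktemp -d)
printf '%s\n' zhang wang li zhao > "$workdir/f1.txt"
printf '%s\n' zhangliang wangsir li zhao gao > "$workdir/f2.txt"

# -q only reports whether the files differ; the if statement reads the exit status.
if diff -q "$workdir/f1.txt" "$workdir/f2.txt" > /dev/null; then
    result=same
else
    result=different
fi

# Added lines start with a single "+"; the "+++" header line is excluded.
added=$(diff -u "$workdir/f1.txt" "$workdir/f2.txt" | grep -c '^+[^+]')

echo "files are $result, $added lines added"
rm -rf "$workdir"
```

For these two files the result is "different" with 3 added lines (zhangliang, wangsir, and gao), matching the diff -u output shown earlier.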

2.6.4.2 patch

patch applies the changes recorded in a patch file to another file (use with caution)

Common options:

-b Automatically back up the files that are changed

example:

diff -u foo.conf foo2.conf > foo.patch
patch -b foo.conf foo.patch

2.7 practice

1. Find the IPv4 address of the local machine in the output of the ifconfig "network card name" command
2. Find the maximum percentage of partition space utilization
3. Find the user name, UID, and shell of the user with the largest UID
4. Find the permissions of /tmp and display them in numeric form
5. Count the connections from each remote host IP currently connected to the local machine, and sort them from largest to smallest
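
As a starting point, here is one possible approach to exercises 3 and 4. It is a sketch rather than a reference answer. It assumes a Linux system with a standard /etc/passwd and GNU stat, whose -c %a format prints the permissions in octal; stat is not covered in this chapter, so treat that part as an outside suggestion.

```shell
# Exercise 3: sort by UID (field 3) numerically in descending order,
# keep the first line, then extract user name, UID, and shell.
max_uid_user=$(sort -t: -k3,3nr /etc/passwd | head -n 1 | cut -d: -f1,3,7)

# Exercise 4: print the permissions of /tmp in numeric (octal) form.
tmp_mode=$(stat -c %a /tmp)

echo "$max_uid_user"
echo "$tmp_mode"
```

The other exercises can be solved by combining the ifconfig, df, and ss pipelines shown earlier in this chapter.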

Tags: Linux Operation & Maintenance architecture DevOps Cloud Native

Posted on Wed, 20 Oct 2021 14:52:01 -0400 by johany