2. Common text processing tools
2.1 file content viewing commands
2.1.1 viewing text file content
2.1.1.1 cat
cat displays the contents of text files
Format:
cat [OPTION]... [FILE]...
Common options
-E: show a $ at the end of each line
-A: show all control characters
-n: number every output line
-b: number only non-blank lines
-s: squeeze consecutive blank lines into one
example:
[root@rocky8 ~]# cat win.txt
a
b
c[root@rocky8 ~]# cat -A win.txt   #-A shows invisible characters
a^M$
b^M$
c[root@rocky8 ~]#
[root@rocky8 ~]# cat >a.txt <<EOF
> a
>
>
> b
>
>
> c
> EOF
[root@rocky8 ~]# cat a.txt
a


b


c
[root@rocky8 ~]# cat -n a.txt   #-n numbers every line
     1  a
     2
     3
     4  b
     5
     6
     7  c
[root@rocky8 ~]# cat -b a.txt   #-b numbers only non-blank lines; blank lines get no number
     1  a


     2  b


     3  c
[root@rocky8 ~]# cat -A a.txt
a$
$
$
b$
$
$
c$
[root@rocky8 ~]# cat -s a.txt   #-s squeezes consecutive blank lines into one
a

b

c
[root@rocky8 ~]# cat -As a.txt
a$
$
b$
$
c$
example:
[root@rocky8 data]# vim fa.txt
a b
c
d       b       c
[root@rocky8 ~]# cat -A fa.txt
a b$
c $
d^Ib^Ic$
[root@rocky8 ~]# cat fa.txt
a b
c
d       b       c
[root@rocky8 ~]# cat fb.txt
a
b
c
[root@rocky8 ~]# cat -A fb.txt
a^M$
b^M$
c^M$
[root@rocky8 ~]# hexdump -C fb.txt
00000000 61 0d 0a 62 0d 0a 63 0d 0a |a..b..c..|
00000009
[root@rocky8 ~]# file fb.txt
fb.txt: ASCII text, with CRLF line terminators
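The ^M$ endings and the 0d 0a byte pairs above are Windows-style CRLF line terminators. A minimal sketch of converting such text to Unix line endings with tr, assuming the data should contain no carriage returns at all. The function name strip_cr is hypothetical, not a standard command:

```shell
# strip_cr: hypothetical helper that deletes every carriage return (\r)
# from standard input, turning CRLF line endings into LF.
strip_cr() {
    tr -d '\r'
}

# Build a small CRLF sample (same content as fb.txt) and compare.
printf 'a\r\nb\r\nc\r\n' | cat -A             # each line ends in ^M$
printf 'a\r\nb\r\nc\r\n' | strip_cr | cat -A  # each line ends in $ only
```

To convert a file, redirect it through the helper into a new file, for example strip_cr < fb.txt > fb.unix.txt; the output file name here is only an illustration.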
2.1.1.2 nl
Displays line numbers; by default only non-blank lines are numbered, like cat -b
[root@rocky8 ~]# nl a.txt
     1  a


     2  b


     3  c
[root@rocky8 ~]# cat -b a.txt
     1  a


     2  b


     3  c
2.1.1.3 tac
Displays the lines of a text in reverse order (last line first)
[root@rocky8 data]# seq 10
1
2
3
4
5
6
7
8
9
10
[root@rocky8 data]# seq 10 |tac   #tac prints the lines in reverse order
10
9
8
7
6
5
4
3
2
1
[root@rocky8 data]# tac
a
bb
ccc
Press ctrl+d
ccc
bb
a
2.1.1.4 rev
Reverses the characters within each line
[root@rocky8 data]# echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
[root@rocky8 data]# echo {a..z} | rev   #rev prints each line back to front
z y x w v u t s r q p o n m l k j i h g f e d c b a
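tac and rev are easy to confuse. A short sketch on a three-line sample (the sample text is made up for illustration) shows that tac changes the order of lines while rev changes the order of characters inside each line:

```shell
# tac reverses the order of the lines; each line itself is unchanged.
printf 'ab\ncd\nef\n' | tac         # ef, cd, ab (one per line)

# rev reverses the characters in each line; the line order is unchanged.
printf 'ab\ncd\nef\n' | rev         # ba, dc, fe (one per line)

# Combining both reads the whole text backwards.
printf 'ab\ncd\nef\n' | tac | rev   # fe, dc, ba (one per line)
```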
2.1.2 viewing the contents of non text files
2.1.2.1 hexdump
example:
[root@rocky8 data]# hexdump -C -n 512 /dev/sda
00000000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0 |.c..............|
00000010 fb be 00 7c bf 00 06 b9 00 02 f3 a4 ea 21 06 00 |...|.........!..|
00000020 00 be be 07 38 04 75 0b 83 c6 10 81 fe fe 07 75 |....8.u........u|
00000030 f3 eb 16 b4 02 b0 01 bb 00 7c b2 80 8a 74 01 8b |.........|...t..|
00000040 4c 02 cd 13 ea 00 7c 00 00 eb fe 00 00 00 00 00 |L.....|.........|
00000050 00 00 00 00 00 00 00 00 00 00 00 80 01 00 00 00 |................|
00000060 00 00 00 00 ff fa 90 90 f6 c2 80 74 05 f6 c2 70 |...........t...p|
00000070 74 02 b2 80 ea 79 7c 00 00 31 c0 8e d8 8e d0 bc |t....y|..1......|
00000080 00 20 fb a0 64 7c 3c ff 74 02 88 c2 52 be 05 7c |. ..d|<.t...R..||
00000090 b4 41 bb aa 55 cd 13 5a 52 72 3d 81 fb 55 aa 75 |.A..U..ZRr=..U.u|
000000a0 37 83 e1 01 74 32 31 c0 89 44 04 40 88 44 ff 89 |7...t21..D.@.D..|
000000b0 44 02 c7 04 10 00 66 8b 1e 5c 7c 66 89 5c 08 66 |D.....f..\|f.\.f|
000000c0 8b 1e 60 7c 66 89 5c 0c c7 44 06 00 70 b4 42 cd |..`|f.\..D..p.B.|
000000d0 13 72 05 bb 00 70 eb 76 b4 08 cd 13 73 0d 5a 84 |.r...p.v....s.Z.|
000000e0 d2 0f 83 de 00 be 85 7d e9 82 00 66 0f b6 c6 88 |.......}...f....|
000000f0 64 ff 40 66 89 44 04 0f b6 d1 c1 e2 02 88 e8 88 |d.@f.D..........|
00000100 f4 40 89 44 08 0f b6 c2 c0 e8 02 66 89 04 66 a1 |.@.D.......f..f.|
00000110 60 7c 66 09 c0 75 4e 66 a1 5c 7c 66 31 d2 66 f7 |`|f..uNf.\|f1.f.|
00000120 34 88 d1 31 d2 66 f7 74 04 3b 44 08 7d 37 fe c1 |4..1.f.t.;D.}7..|
00000130 88 c5 30 c0 c1 e8 02 08 c1 88 d0 5a 88 c6 bb 00 |..0........Z....|
00000140 70 8e c3 31 db b8 01 02 cd 13 72 1e 8c c3 60 1e |p..1......r...`.|
00000150 b9 00 01 8e db 31 f6 bf 00 80 8e c6 fc f3 a5 1f |.....1..........|
00000160 61 ff 26 5a 7c be 80 7d eb 03 be 8f 7d e8 34 00 |a.&Z|..}....}.4.|
00000170 be 94 7d e8 2e 00 cd 18 eb fe 47 52 55 42 20 00 |..}.......GRUB .|
00000180 47 65 6f 6d 00 48 61 72 64 20 44 69 73 6b 00 52 |Geom.Hard Disk.R|
00000190 65 61 64 00 20 45 72 72 6f 72 0d 0a 00 bb 01 00 |ead. Error......|
000001a0 b4 0e cd 10 ac 3c 00 75 f4 c3 00 00 00 00 00 00 |.....<.u........|
000001b0 00 00 00 00 00 00 00 00 7d 50 d7 43 00 00 80 04 |........}P.C....|
000001c0 01 04 83 fe c2 ff 00 08 00 00 00 00 20 00 00 fe |............ ...|
000001d0 c2 ff 83 fe c2 ff 00 08 20 00 00 00 80 0c 00 fe |........ .......|
000001e0 c2 ff 83 fe c2 ff 00 08 a0 0c 00 00 40 06 00 fe |............@...|
000001f0 c2 ff 05 fe c2 ff 00 08 e0 12 00 f8 1f 06 55 aa |..............U.|
00000200
[root@rocky8 data]# echo abc | hexdump -C
00000000 61 62 63 0a |abc.|
00000004
[root@rocky8 data]# echo {a..z} | tr -d ' '|hexdump -C
00000000 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 |abcdefghijklmnop|
00000010 71 72 73 74 75 76 77 78 79 7a 0a |qrstuvwxyz.|
0000001b
2.1.2.2 od
od dumps files in octal and other formats
example:
[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x
0000000 64636261 68676665 6c6b6a69 706f6e6d
0000020 74737271 78777675 000a7a79
0000033
[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x1
0000000 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70
0000020 71 72 73 74 75 76 77 78 79 7a 0a
0000033
[root@rocky8 data]# echo {a..z} | tr -d ' '|od -t x1z
0000000 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 >abcdefghijklmnop<
0000020 71 72 73 74 75 76 77 78 79 7a 0a >qrstuvwxyz.<
0000033
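In the first od example the bytes look scrambled (64636261 for "abcd") because -t x groups the input into multi-byte integers and prints each one in the host's byte order; on a little-endian machine the bytes of every group appear reversed. -t x1 prints one byte at a time, in file order. A small sketch, where the helper name hex_bytes is hypothetical:

```shell
# hex_bytes: hypothetical helper that prints stdin as one plain hex string,
# byte by byte, in file order. -An suppresses the offset column.
hex_bytes() {
    od -An -t x1 | tr -d ' \n'
    echo
}

printf 'abcd' | hex_bytes       # 61626364
printf 'abcd' | od -An -t x4    # shows 64636261 on a little-endian host
```

When the exact byte sequence matters, for example when checking line terminators, x1 is the safer choice because its output does not depend on the machine.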
2.1.2.3 xxd
[root@rocky8 data]# echo {a..z} | tr -d ' '|xxd
00000000: 6162 6364 6566 6768 696a 6b6c 6d6e 6f70 abcdefghijklmnop
00000010: 7172 7374 7576 7778 797a 0a qrstuvwxyz.
2.2 view file contents in pages
2.2.1 more
more displays a file one page at a time; combined with a pipe, it can also page the output of other commands
format
more [OPTIONS...] FILE...
Options:
-d: display a prompt explaining how to continue and how to quit
example:
[root@rocky8 data]# ls -R /etc | more
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
--More--   #press q to exit
[root@rocky8 data]# ls -R /etc | more -d
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
--More--[Press space to continue, 'q' to quit.]
2.2.2 less
less also pages through files or STDIN output. It is the pager used by the man command
Useful commands while viewing include:
/text: search forward for text
?text: search backward for text
n/N: jump to the next/previous match
example:
[root@rocky8 data]# ls -R /etc |less
/etc:
adjtime
aliases
alternatives
anacrontab
at.deny
audit
authselect
bash_completion.d
bashrc
bindresvport.blacklist
binfmt.d
chkconfig.d
cifs-utils
cron.d
cron.daily
cron.deny
cron.hourly
cron.monthly
crontab
cron.weekly
crypto-policies
crypttab
csh.cshrc
csh.login
dbus-1
:
example:
#less combined with a pipe pages the output of other commands
[root@rocky8 data]# tree -d /etc |less
/etc
├── alternatives
├── audit
│   ├── plugins.d
│   └── rules.d
├── authselect
│   └── custom
├── bash_completion.d
├── binfmt.d
├── chkconfig.d
├── cifs-utils
├── cron.d
├── cron.daily
├── cron.hourly
├── cron.monthly
├── cron.weekly
├── crypto-policies
│   ├── back-ends
│   ├── local.d
│   ├── policies
│   │   └── modules
│   └── state
├── dbus-1
│   ├── session.d
│   └── system.d
├── default
:   #press q to exit
2.3 displaying the beginning or end of text
2.3.1 head
head displays the first lines (10 by default) of a file or of standard input
Format:
head [OPTION]... [FILE]...
Options:
-c #: output the first # bytes
-n #: output the first # lines; with a negative number, for example -n -3, output all but the last # lines
-#: same as -n #
example:
[root@rocky8 data]# seq 100 |head   #head shows the first 10 lines by default
1
2
3
4
5
6
7
8
9
10
[root@rocky8 data]# seq 100 |head -n 3   #-n sets the number of lines
1
2
3
[root@rocky8 data]# echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
[root@rocky8 data]# echo {a..z} | head -c 5   #-c takes the first 5 bytes
a b c[root@rocky8 data]#
[root@rocky8 data]# cat /dev/urandom | tr -dc '[:alnum:]' | head -c10   #take the first 10 random characters
1HL73ArcyO
[root@rocky8 data]# cat /dev/urandom | tr -dc '[:alnum:]' | head -c10 | tee pass.txt | passwd --stdin raymond   #generate a random password, save it to pass.txt and set it for the account
Changing password for user raymond.
passwd: all authentication tokens updated successfully.
[root@rocky8 data]# cat pass.txt
KWENpwZmNs
[root@rocky8 data]# su - boss
Last login: Wed Oct 6 22:15:25 CST 2021 on pts/0
[boss@rocky8 ~]$ su - raymond
Password:
Last login: Thu Oct 7 20:09:43 CST 2021 on pts/0
[raymond@rocky8 ~]$ exit
logout
[boss@rocky8 ~]$ exit
logout
example:
[root@rocky8 ~]# head -n 3 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@rocky8 ~]# head -3 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@rocky8 ~]# echo a I b | head -c4
a I [root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10
G755MlZatW[root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10
ASsax6DeBz[root@centos8 ~]#cat /dev/urandom | tr -dc '[:alnum:]'| head -c10 | tee pass.txt | passwd --stdin mage
Changing password for user mage.
passwd: all authentication tokens updated successfully.
[root@centos8 ~]#cat pass.txt
AGT952Essg[root@centos8 ~]#su - wang
[wang@centos8 ~]$su - mage
Password:
[root@rocky8 ~]# seq 10 |head -n 3   #take the first 3 lines
1
2
3
[root@rocky8 ~]# seq 10 |head -n -3   #take all but the last 3 lines
1
2
3
4
5
6
7
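The random-password pipeline above can be wrapped in a small function. This is only a sketch: the name gen_password and the default length of 10 are assumptions for illustration, and it relies on /dev/urandom being available, as it is on Linux:

```shell
# gen_password: hypothetical helper; prints a random alphanumeric string.
# $1 is the desired length (defaults to 10, as in the examples above).
gen_password() {
    length=${1:-10}
    # tr deletes every byte that is not A-Z, a-z or 0-9;
    # head stops the pipeline after $length bytes.
    # LC_ALL=C makes tr treat the random input as raw bytes.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo    # head -c does not add a newline, so finish the line here
}

gen_password 16
```

The output could then be saved with tee and passed to passwd --stdin as in the session above; note that --stdin is the option used on the Rocky and CentOS systems shown here and may not exist in other distributions' passwd.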
2.3.2 tail
tail, in contrast to head, displays the last lines (10 by default) of a file or of standard input
Format:
tail [OPTION]... [FILE]...
Options:
-c #: output the last # bytes
-n #: output the last # lines; with +#, output from line # to the end
-#: same as -n #
-f: follow the file descriptor and display newly appended content, commonly used for log monitoring; equivalent to --follow=descriptor. If the file is deleted and a new file with the same name is created, tail does not follow the new file
-F: follow the file name; equivalent to --follow=name --retry. If the file is deleted and a new file with the same name is created, tail continues following the new file
example:
[root@rocky8 ~]# seq 20 | tail
11
12
13
14
15
16
17
18
19
20
[root@rocky8 ~]# seq 20 | tail -n 3
18
19
20
[root@rocky8 ~]# seq 20 | tail -n +3   #+3 means from line 3 to the end
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
[root@rocky8 ~]# seq 20 | tail -n -3   #-3 means the last 3 lines
18
19
20
example:
[root@rocky8 ~]# tail -3 /var/log/messages
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Default.
Oct 7 20:13:37 rocky8 systemd[2857]: Startup finished in 34ms.
Oct 7 20:13:37 rocky8 systemd[1]: Started User Manager for UID 0.
[root@rocky8 ~]# tail -f /var/log/messages   #tail -f follows the log as it grows
Oct 7 20:13:37 rocky8 systemd-logind[854]: New session 7 of user root.
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Timers.
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Paths.
Oct 7 20:13:37 rocky8 systemd[2857]: Starting D-Bus User Message Bus Socket.
Oct 7 20:13:37 rocky8 systemd[2857]: Listening on D-Bus User Message Bus Socket.
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Sockets.
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Basic System.
Oct 7 20:13:37 rocky8 systemd[2857]: Reached target Default.
Oct 7 20:13:37 rocky8 systemd[2857]: Startup finished in 34ms.
Oct 7 20:13:37 rocky8 systemd[1]: Started User Manager for UID 0.
#View only new log entries
[root@rocky8 ~]# tail -fn0 /var/log/messages
[root@rocky8 ~]# tail -0f /var/log/messages
[root@rocky8 ~]# ifconfig | head -2 | tail -1
-bash: ifconfig: command not found
#The ifconfig command is missing; install the net-tools package
[root@rocky8 ~]# yum -y install net-tools
[root@rocky8 ~]# ifconfig | head -2 | tail -1
inet 172.31.1.8 netmask 255.255.248.0 broadcast 172.31.7.255
#Select line 6
[root@rocky8 ~]# seq 20| head -n 6|tail -n1
6
[root@rocky8 ~]# seq 20| tail -n +6 |head -n1
6
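The two ways of selecting line 6 shown above give the same result only when the input is long enough. A sketch with two hypothetical helper functions (the names are made up for illustration) makes the difference visible:

```shell
# Two hypothetical helpers that print line N of standard input.
nth_line_head_tail() {    # keep the first N lines, then the last of those
    head -n "$1" | tail -n 1
}
nth_line_tail_head() {    # start at line N, then keep the first line
    tail -n "+$1" | head -n 1
}

seq 20 | nth_line_head_tail 6    # 6
seq 20 | nth_line_tail_head 6    # 6

# With fewer than N lines of input the two behave differently:
seq 3 | nth_line_head_tail 6     # 3 (the last line that exists)
seq 3 | nth_line_tail_head 6     # prints nothing
```

If a missing line should produce no output, the tail -n +N | head -n 1 form is the safer one.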
2.4 extracting columns of text: cut
The cut command extracts specified columns from a text file or from STDIN
format
cut [OPTION]... [FILE]...
Options:
-d DELIMITER: field delimiter; the default is TAB
-f FIELDS: fields to select
    #: field number #, for example: 3
    #,#[,#]: several discrete fields, for example: 1,3,6
    #-#: a range of consecutive fields, for example: 1-6
    mixed use: 1-3,7
-c: cut by character position
--output-delimiter=STRING: set the output delimiter
example:
[root@rocky8 ~]# cd /data
[root@rocky8 data]# cp /etc/passwd .
[root@rocky8 data]# cut -d ":" -f 1,3-5 passwd   #-d sets the delimiter, -f selects the fields to output
root:0:0:root
bin:1:1:bin
daemon:2:2:daemon
adm:3:4:adm
lp:4:7:lp
sync:5:0:sync
shutdown:6:0:shutdown
halt:7:0:halt
mail:8:12:mail
operator:11:0:operator
games:12:100:games
ftp:14:50:FTP User
nobody:65534:65534:Kernel Overflow User
dbus:81:81:System message bus
systemd-coredump:999:997:systemd Core Dumper
systemd-resolve:193:193:systemd Resolver
tss:59:59:Account used for TPM access
polkitd:998:996:User for polkitd
unbound:997:994:Unbound DNS resolver
sssd:996:993:User for sssd
sshd:74:74:Privilege-separated SSH
postfix:89:89:
raymond:1000:1000:
boss:1001:1001:
[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320396 102486004   3% /
/dev/sda3       52403200  398416  52004784   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
[root@rocky8 data]# df |cut -c 43-46   #-c selects by character position
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " "
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 382688 0 382688 0% /dev
tmpfs 400580 0 400580 0% /dev/shm
tmpfs 400580 5692 394888 2% /run
tmpfs 400580 0 400580 0% /sys/fs/cgroup
/dev/sda2 104806400 2320396 102486004 3% /
/dev/sda3 52403200 398416 52004784 1% /data
/dev/sda1 1038336 191796 846540 19% /boot
tmpfs 80116 0 80116 0% /run/user/0
[root@rocky8 data]# df |tr -s " "|cut -d " " -f5
Use%
0%
0%
2%
0%
3%
1%
19%
0%
[root@rocky8 data]# df |tr -s " "|cut -d " " -f5 |tr -d %
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2
0
0
2
0
3
1
19
0
example:
[root@rocky8 data]# ifconfig |head -n2 |tail -n1|cut -d" " -f10
172.31.1.8
[root@rocky8 data]# ifconfig |head -n2 |tail -n1|tr -s " " |cut -d " " -f3
172.31.1.8
[root@rocky8 data]# df | tr -s ' '|cut -d' ' -f5 |tr -dc "[0-9\n]"
0
0
2
0
3
1
19
0
[root@rocky8 data]# df | tr -s ' ' % |cut -d% -f5 |tr -d '[:alpha:]'
0
0
2
0
3
1
19
0
[root@rocky8 data]# df | cut -c44-46 |tr -d '[:alpha:]'
0
0
2
0
3
1
19
0
[root@rocky8 data]# cut -d: -f1,3,7 --output-delimiter="---" /etc/passwd
root---0---/bin/bash
bin---1---/sbin/nologin
daemon---2---/sbin/nologin
adm---3---/sbin/nologin
lp---4---/sbin/nologin
sync---5---/bin/sync
shutdown---6---/sbin/shutdown
halt---7---/sbin/halt
mail---8---/sbin/nologin
operator---11---/sbin/nologin
games---12---/sbin/nologin
ftp---14---/sbin/nologin
nobody---65534---/sbin/nologin
dbus---81---/sbin/nologin
systemd-coredump---999---/sbin/nologin
systemd-resolve---193---/sbin/nologin
tss---59---/sbin/nologin
polkitd---998---/sbin/nologin
unbound---997---/sbin/nologin
sssd---996---/sbin/nologin
sshd---74---/sbin/nologin
postfix---89---/sbin/nologin
raymond---1000---/bin/bash
boss---1001---/bin/bash
Example: get partition utilization
#Take partition utilization
[root@rocky8 data]# df|tr -s ' ' |cut -d' ' -f5 |tr -d %
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df|tr -s ' ' '%'|cut -d% -f5
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |cut -c 44-46|tail -n +2
0
0
2
0
3
1
19
0
[root@rocky8 data]# df | tail -n +2|tr -s ' ' % |cut -d% -f5
0
0
2
0
3
1
19
0
[root@rocky8 data]# df | tail -n +2|tr -s ' ' |cut -d' ' -f5 |tr -d %
0
0
2
0
3
1
19
0
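The character-position variants (cut -c 44-46) break as soon as a device name or a number is wider than expected, so splitting on fields is usually more robust. A sketch of the field-based approach on a fixed sample, so that the result does not depend on the local disks. The names sample_df and use_percent are hypothetical, and the sample rows are copied from the df output above:

```shell
# sample_df: prints a fixed excerpt of df output for demonstration.
sample_df() {
cat <<'EOF'
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
/dev/sda2      104806400 2320396 102486004   3% /
/dev/sda1        1038336  191796    846540  19% /boot
EOF
}

# use_percent: hypothetical helper; reads df-style output on stdin and
# prints the Use% column as bare numbers.
use_percent() {
    tail -n +2 |          # drop the header line
    tr -s ' ' |           # squeeze runs of spaces into one
    cut -d' ' -f5 |       # field 5 is the Use% column
    tr -d '%'             # remove the percent sign
}

sample_df | use_percent   # 0, 3, 19 (one per line)
```

On a real system the input would come from df itself; df -P (the POSIX output format) may be preferable there because it keeps each file system on a single line.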
2.5 merging multiple files: paste
paste merges the corresponding lines of multiple files into single lines
format
paste [OPTION]... [FILE]...
Common options:
-d DELIMITER: set the delimiter; the default is TAB
-s: merge all lines of each file into a single line
example:
[root@rocky8 data]# seq 100 >c.txt
[root@rocky8 data]# paste -s c.txt >f.txt
[root@rocky8 data]# cat f.txt
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
[root@rocky8 data]# cat -A f.txt
1^I2^I3^I4^I5^I6^I7^I8^I9^I10^I11^I12^I13^I14^I15^I16^I17^I18^I19^I20^I21^I22^I23^I24^I25^I26^I27^I28^I29^I30^I31^I32^I33^I34^I35^I36^I37^I38^I39^I40^I41^I42^I43^I44^I45^I46^I47^I48^I49^I50^I51^I52^I53^I54^I55^I56^I57^I58^I59^I60^I61^I62^I63^I64^I65^I66^I67^I68^I69^I70^I71^I72^I73^I74^I75^I76^I77^I78^I79^I80^I81^I82^I83^I84^I85^I86^I87^I88^I89^I90^I91^I92^I93^I94^I95^I96^I97^I98^I99^I100$
[root@rocky8 data]# cat f.txt | tr '\t' '\n'
example:
[root@rocky8 data]# echo {a..h} | tr -s ' ' '\n' > alpha.log
[root@rocky8 data]# cat alpha.log
a
b
c
d
e
f
g
h
[root@rocky8 data]# seq 5 > seq.log
[root@rocky8 data]# cat seq.log
1
2
3
4
5
[root@rocky8 data]# cat alpha.log seq.log
a
b
c
d
e
f
g
h
1
2
3
4
5
[root@rocky8 data]# paste alpha.log seq.log
a       1
b       2
c       3
d       4
e       5
f
g
h
[root@rocky8 data]# paste -d":" alpha.log seq.log
a:1
b:2
c:3
d:4
e:5
f:
g:
h:
[root@rocky8 data]# paste -s seq.log
1       2       3       4       5
[root@rocky8 data]# paste -s alpha.log
a       b       c       d       e       f       g       h
[root@rocky8 data]# paste -s alpha.log seq.log
a       b       c       d       e       f       g       h
1       2       3       4       5
[root@rocky8 data]# cat > title.txt <<EOF
> ceo
> cto
> coo
> EOF
[root@rocky8 data]# cat title.txt
ceo
cto
coo
[root@rocky8 data]# cat > emp.txt <<EOF
> zhang
> wang
> li
> zhao
> EOF
[root@rocky8 data]# cat emp.txt
zhang
wang
li
zhao
[root@rocky8 data]# paste title.txt emp.txt
ceo     zhang
cto     wang
coo     li
        zhao
[root@rocky8 data]# paste -s title.txt emp.txt
ceo     cto     coo
zhang   wang    li      zhao
[root@rocky8 data]# seq 100|paste -d + -s|bc
5050
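The last command works because paste -s -d+ turns a column of numbers into a single arithmetic expression, which bc then evaluates. A sketch of the same idea using the shell's own arithmetic instead of bc, for systems where bc is not installed. The function name sum_lines is hypothetical, and the input is assumed to contain one integer per line:

```shell
# sum_lines: hypothetical helper that adds up the integers on stdin.
# paste -s joins all lines into one, -d+ puts "+" between them,
# and "-" tells paste to read standard input.
sum_lines() {
    expression=$(paste -s -d+ -)
    echo "$(( $expression ))"
}

seq 10 | paste -s -d+ -    # 1+2+3+4+5+6+7+8+9+10
seq 100 | sum_lines        # 5050
```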
2.6 tools for analyzing text
Text statistics: wc
Sorting text: sort
Comparing files: diff and patch
2.6.1 collecting text statistics: wc
The wc command counts the lines, words, bytes and characters in a file
It can count data from a file or from STDIN
Common options
-l: count lines only
-w: count words only
-c: count bytes only
-m: count characters only
-L: display the length of the longest line
example:
[root@rocky8 data]# wc c.txt
100 100 292 c.txt   #number of lines, words and bytes
[root@rocky8 data]# ll c.txt
-rw-r--r-- 1 root root 292 Oct 7 20:34 c.txt
[root@rocky8 data]# wc -l c.txt   #wc -l shows the number of lines
100 c.txt
[root@rocky8 data]# cat /etc/passwd |wc -l
24
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s " " % |cut -d% -f5 |tail -n +2 |wc -l
8
[root@rocky8 data]# cat > title.txt <<EOF
> ceo zhang
> coo wang
> cto li
> EOF
[root@rocky8 data]# cat > title1.txt <<EOF
> ceo zhang
> coo wang
> cto Lao Li
> EOF
[root@rocky8 data]# ll title.txt title1.txt
-rw-r--r-- 1 root root 30 Oct 7 20:51 title1.txt
-rw-r--r-- 1 root root 26 Oct 7 20:50 title.txt
[root@rocky8 data]# wc title.txt
3 6 26 title.txt
[root@rocky8 data]# wc title1.txt
3 6 30 title1.txt
[root@rocky8 data]# wc -l title.txt
3 title.txt
[root@rocky8 data]# cat title.txt | wc -l
3
[root@rocky8 data]# df | tail -n $(echo `df | wc -l`-1|bc)
devtmpfs 382688 0 382688 0% /dev
tmpfs 400580 0 400580 0% /dev/shm
tmpfs 400580 5692 394888 2% /run
tmpfs 400580 0 400580 0% /sys/fs/cgroup
/dev/sda2 104806400 2320316 102486084 3% /
/dev/sda3 52403200 398444 52004756 1% /data
/dev/sda1 1038336 191796 846540 19% /boot
tmpfs 80116 0 80116 0% /run/user/0
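The last command above computes "number of lines minus one" with wc and bc in order to skip the df header. A sketch showing how to obtain a bare line count and that tail -n +2 reaches the same result directly; seq 5 stands in for real command output, and count_lines is a hypothetical helper name:

```shell
# count_lines: hypothetical helper that prints only the number of lines.
# Reading from stdin avoids the file name that "wc -l FILE" appends;
# tr removes the padding that some wc implementations add.
count_lines() {
    wc -l | tr -d ' '
}

total=$(seq 5 | count_lines)          # 5
seq 5 | tail -n "$(( total - 1 ))"    # 2 3 4 5: everything but the first line
seq 5 | tail -n +2                    # 2 3 4 5: the same, without counting
```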
2.6.2 sorting text: sort
sort writes the sorted text to STDOUT; the original file is not changed
Format:
sort [options] file(s)
Common options
-r: sort in reverse (descending) order
-R: sort in random order
-n: sort by numeric value
-f: fold case, i.e. ignore character case when comparing
-u: unique; output only one line of each group of equal lines (deduplicate)
-t c: use c as the field delimiter
-k #: sort by field #, as delimited by the -t character; can be given multiple times
example:
[root@rocky8 data]# sort /etc/passwd   #sort compares character by character by default
adm:x:3:4:adm:/var/adm:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
boss:x:1001:1001::/home/boss:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
raymond:x:1000:1000::/home/raymond:/bin/bash
root:x:0:0:root:/root:/bin/bash
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
[root@rocky8 data]# sort -t: -k3 /etc/passwd   #-t sets the delimiter, -k selects the field; the comparison is still by character
root:x:0:0:root:/root:/bin/bash
raymond:x:1000:1000::/home/raymond:/bin/bash
boss:x:1001:1001::/home/boss:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
[root@rocky8 data]# sort -t: -k3 -n /etc/passwd   #-n sorts in numerical order
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
raymond:x:1000:1000::/home/raymond:/bin/bash
boss:x:1001:1001::/home/boss:/bin/bash
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
[root@rocky8 data]# sort -t: -k3 -nr /etc/passwd   #-r reverses the order
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
boss:x:1001:1001::/home/boss:/bin/bash
raymond:x:1000:1000::/home/raymond:/bin/bash
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
unbound:x:997:994:Unbound DNS resolver:/etc/unbound:/sbin/nologin
sssd:x:996:993:User for sssd:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tss:x:59:59:Account used for TPM access:/dev/null:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
sync:x:5:0:sync:/sbin:/bin/sync
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
root:x:0:0:root:/root:/bin/bash
[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda3       52403200  398444  52004756   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5
Use
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2
0
0
2
0
3
1
19
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2 |sort -nr
19
3
2
1
0
0
0
0
[root@rocky8 data]# df |tr -s ' ' '%' |cut -d% -f5 |tail -n +2 |sort -nr |head -1
19
[root@rocky8 data]# cat >aa.txt <<EOF
> a
> b
> a
> c
> a
> c
> b
> EOF
[root@rocky8 data]# sort aa.txt
a
a
a
b
b
c
c
[root@rocky8 data]# sort -u aa.txt   #-u removes duplicates
a
b
c
example:
[root@rocky8 data]# cut -d: -f1,3 /etc/passwd|sort -t: -k2 -nr |head -n3
nobody:65534
boss:1001
raymond:1000
#Count the distinct client addresses in the access log
[root@rocky8 data]# cut -d" " -f1 access_log |sort -u|wc -l
201
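Note that -k3 without an end position makes the sort key run from field 3 to the end of the line; -k3,3 restricts it to field 3 only, and the n and r modifiers can be attached to the key directly. A sketch on a fixed sample of passwd lines (copied from the listings above), with hypothetical function names:

```shell
# sample_passwd: prints a few passwd-format lines for demonstration.
sample_passwd() {
cat <<'EOF'
root:x:0:0:root:/root:/bin/bash
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
raymond:x:1000:1000::/home/raymond:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
EOF
}

# max_uid_user: hypothetical helper; prints name, UID and shell of the
# account with the highest UID found in passwd-format input.
max_uid_user() {
    sort -t: -k3,3nr |    # field 3 only, numeric, descending
    head -n 1 |           # the first line now has the largest UID
    cut -d: -f1,3,7       # user name, UID, shell
}

sample_passwd | max_uid_user    # nobody:65534:/sbin/nologin
```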
Example: partition utilization statistics
[root@rocky8 data]# df
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs          382688       0    382688   0% /dev
tmpfs             400580       0    400580   0% /dev/shm
tmpfs             400580    5692    394888   2% /run
tmpfs             400580       0    400580   0% /sys/fs/cgroup
/dev/sda2      104806400 2320316 102486084   3% /
/dev/sda3       52403200  407316  51995884   1% /data
/dev/sda1        1038336  191796    846540  19% /boot
tmpfs              80116       0     80116   0% /run/user/0
#View the highest partition utilization value
[root@rocky8 data]# df| tr -s ' ' '%'|cut -d% -f5|sort -nr|head -1
19
[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort

0
0
0
0
1
19
2
3
[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n

0
0
0
0
1
2
3
19
[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n |tail -n1
19
[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr
15
5
1
1
1
0
0
0

[root@rocky8 data]# df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr |head -1
19
Interview question: there are two files, a.txt and b.txt. Merge the two files and make sure that every number in the output is unique
#a.txt (each number is unique within this file)
200
100
34556
23
...
#b.txt (each number is unique within this file)
123
43
200
3321
...
#Expected result: after merging the two files, numbers that appear in both are removed entirely rather than kept once
100
34556
23
123
43
3321
...
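The question can be read in two ways, and the commands differ. A sketch of both readings on the sample numbers above, written to a temporary directory; the variable name workdir is an assumption, and uniq is covered in the next section:

```shell
workdir=$(mktemp -d)
printf '%s\n' 200 100 34556 23 > "$workdir/a.txt"
printf '%s\n' 123 43 200 3321  > "$workdir/b.txt"

# Reading 1: keep exactly one copy of every number.
# sort accepts several files and merges them; -u drops repeated values.
sort -n -u "$workdir/a.txt" "$workdir/b.txt"
# prints 23 43 100 123 200 3321 34556, one per line

# Reading 2: drop numbers that occur in both files entirely.
# After sorting, duplicates are adjacent, so uniq -u can filter them out.
sort -n "$workdir/a.txt" "$workdir/b.txt" | uniq -u
# prints 23 43 100 123 3321 34556, one per line (200 is gone)
```

Reading 2 relies on each file being free of internal duplicates, as the question states.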
2.6.3 removing duplicates: uniq
The uniq command removes contiguous duplicate lines from the input
Format:
uniq [OPTION]... [FILE]...
Common options:
-c: prefix each line with the number of times it occurred
-d: show only duplicated lines
-u: show only lines that are not repeated
uniq is often used with the sort command:
example:
[root@rocky8 data]# cat > aa.txt <<EOF
> a
> b
> a
> a
> c
> c
> a
> c
> b
> b
> EOF
[root@rocky8 data]# cat aa.txt
a
b
a
a
c
c
a
c
b
b
[root@rocky8 data]# uniq aa.txt   #uniq removes adjacent duplicate lines
a
b
a
c
a
c
b
[root@rocky8 data]# uniq -c aa.txt   #-c shows how many times each line was repeated
1 a
1 b
2 a
2 c
1 a
1 c
2 b
[root@rocky8 data]# cat aa.txt
a
b
a
a
c
c
a
c
b
b
[root@rocky8 data]# sort aa.txt
a
a
a
a
b
b
b
c
c
c
[root@rocky8 data]# sort aa.txt | uniq -c   #count how many times each line occurs
4 a
3 b
3 c
[root@rocky8 data]# sort aa.txt |uniq -c|sort -nr
4 a
3 c
3 b
[root@rocky8 data]# sort aa.txt |uniq -c|sort -nr | head -1
4 a
example:
sort userlist.txt | uniq -c
Example: find the client addresses with the most requests in a log
[root@rocky8 data]# cut -d" " -f1 access_log |sort |uniq -c|sort -nr |head -3
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
[root@10-9-24-182 ~]# lastb |tr -s ' ' |cut -d ' ' -f3 |sort |uniq -c |sort -nr |head -3
34096 113.141.66.163
24460 222.186.10.188
16449 119.118.20.161
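Both commands above end in the same idiom: sort | uniq -c | sort -nr | head. The first sort is required because uniq only collapses adjacent lines. A sketch that packages the idiom as a function; the name top_n and the sample addresses are made up for illustration:

```shell
# top_n: hypothetical helper; prints the N most frequent lines of stdin
# together with their counts (N defaults to 3).
top_n() {
    sort |                  # bring equal lines next to each other
    uniq -c |               # collapse them and prefix the count
    sort -nr |              # highest count first
    head -n "${1:-3}"
}

printf '%s\n' 10.0.0.7 10.0.0.1 10.0.0.1 10.0.0.9 10.0.0.1 10.0.0.7 | top_n 2
# prints "3 10.0.0.1" and "2 10.0.0.7" (uniq -c pads the counts with spaces)
```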
Example: remote host IP with the most concurrent connections
[root@rocky8 data]# ss -nt |tail -n +2 |tr -s ' ' : |cut -d: -f6 |sort |uniq -c |sort -nr |head -2
7 10.0.0.1
2 10.0.0.7
Example: find the lines two files have in common and the lines that differ
[root@rocky8 data]# cat > test1.txt <<EOF
> a
> b
> 1
> c
> EOF
[root@rocky8 data]# cat test1.txt
a
b
1
c
[root@rocky8 data]# cat > test2.txt <<EOF
> b
> e
> f
> c
> 1
> 2
> EOF
[root@rocky8 data]# cat test2.txt
b
e
f
c
1
2
#Get the lines common to both files
[root@rocky8 data]# cat test1.txt test2.txt | sort |uniq -d
1
b
c
#Get the lines that appear in only one of the files
[root@rocky8 data]# cat test1.txt test2.txt | sort |uniq -u
2
a
e
f
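The uniq -d / uniq -u trick above assumes that no line is repeated inside a single file; otherwise a line repeated within one file would be reported as common to both. A sketch that removes this assumption by deduplicating each file first. The helper names, the workdir variable and the sample files are hypothetical:

```shell
workdir=$(mktemp -d)
printf '%s\n' a a b 1 c   > "$workdir/t1.txt"   # "a" appears twice in one file
printf '%s\n' b e f c 1 2 > "$workdir/t2.txt"

# sort -u removes duplicates inside each file before the two are combined,
# so a count of 2 can only mean "present in both files".
common_lines() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}
differing_lines() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -u
}

common_lines "$workdir/t1.txt" "$workdir/t2.txt"       # 1 b c
differing_lines "$workdir/t1.txt" "$workdir/t2.txt"    # 2 a e f
```

Without the per-file sort -u, "a" would be listed as a common line in this sample even though it occurs only in the first file.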
2.6.4 comparing files
2.6.4.1 diff
The diff command compares two files and reports the differences. Its output can be saved to a file, known as a "patch"
Common options
-u: output a "unified" diff, the format best suited to patch files
example:
[root@rocky8 data]# cat > f1.txt <<EOF
> zhang
> wang
> li
> zhao
> EOF
[root@rocky8 data]# cat f1.txt
zhang
wang
li
zhao
[root@rocky8 data]# cat > f2.txt <<EOF
> zhangliang
> wangsir
> li
> zhao
> gao
> EOF
[root@rocky8 data]# cat f2.txt
zhangliang
wangsir
li
zhao
gao
[root@rocky8 data]# diff f1.txt f2.txt
1,2c1,2
< zhang
< wang
---
> zhangliang
> wangsir
4a5
> gao
[root@rocky8 data]# diff -u f1.txt f2.txt
--- f1.txt 2021-10-07 21:30:15.134764613 +0800
+++ f2.txt 2021-10-07 21:31:20.962768219 +0800
@@ -1,4 +1,5 @@
-zhang
-wang
+zhangliang
+wangsir
 li
 zhao
+gao
[root@rocky8 data]# diff -u f1.txt f2.txt > f.patch
[root@rocky8 data]# rm -f f2.txt
[root@rocky8 data]# patch -b f1.txt f.patch   #patch applies the recorded changes to f1.txt, so f1.txt ends up with the content of the original f2.txt; -b first backs up f1.txt as f1.txt.orig
patching file f1.txt
[root@rocky8 data]# cat f1.txt
zhangliang
wangsir
li
zhao
gao
[root@rocky8 data]# cat f1.txt.orig
zhang
wang
li
zhao
2.6.4.2 patch
patch applies the changes recorded in a diff file to the original file (use with caution)
Common options:
-b: automatically back up the files that are changed
example:
diff -u foo.conf foo2.conf > foo.patch
patch -b foo.conf foo.patch
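The diff and patch round trip can be checked in a scratch directory. The following is a sketch only: it reuses the f1.txt and f2.txt contents from the example in 2.6.4.1, the workdir variable is an assumption, and the patch step is skipped when the patch command is not installed:

```shell
workdir=$(mktemp -d)
printf '%s\n' zhang wang li zhao          > "$workdir/f1.txt"
printf '%s\n' zhangliang wangsir li zhao gao > "$workdir/f2.txt"

# diff exits with status 1 when the files differ; that is not an error here.
diff -u "$workdir/f1.txt" "$workdir/f2.txt" > "$workdir/f.patch" || true

if command -v patch >/dev/null 2>&1; then
    # -b keeps the old content as f1.txt.orig before the patch is applied.
    patch -b "$workdir/f1.txt" "$workdir/f.patch"
    # cmp -s is silent and succeeds only if both files are identical.
    cmp -s "$workdir/f1.txt" "$workdir/f2.txt" && echo "f1.txt now matches f2.txt"
    cat "$workdir/f1.txt.orig"    # the original four lines
else
    echo "patch is not installed; skipping the patch step" >&2
fi
```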
2.7 exercises
1. Find the IPv4 address of the local machine in the output of the ifconfig "network card name" command
2. Find the maximum partition space utilization percentage
3. Find the user name, UID and shell of the user with the largest UID
4. Find the permissions of /tmp and display them in numeric form
5. Count the connections from each remote host IP currently connected to the local machine, sorted from most to fewest