Exploring the Ext4 file system structure (with hands-on Linux commands)

Contents

Introduction

Main references

Basic concepts

Block

Group

Inode

Practice

View all mounted devices

View the file system superblock

Check the stat of a small file: why is Blocks 8?

View group information

Analyze the inode structure

Read the data of a specified block number

Try the inode of a slightly larger file

Try a larger file (~900 MB)

Analyze the directory structure

Topics still to be covered

Introduction

Two days ago, a friend of mine was asked, in an interview for a development position at China Telecom's Tianyi Cloud, what kinds of file systems there are... On the one hand, it made me feel that a developer has to know a bit of everything. On the other hand, I feel that rote memorization does little to improve real ability. This article therefore tries to explore the on-disk structure of the Ext4 file system on Linux through hands-on practice.

Main references:

An English-language document that lists the on-disk structure in detail

Its only drawback is that it has no diagrams.

Brother Bird's Linux Private Kitchen (鸟哥的 Linux 私房菜)

An excellent tutorial that combines theory with practice, although at present it does not cover the Ext4 file system.

Basic concepts:

Block

A block is a group of sectors between 1KiB and 64KiB, and the number of sectors must be an integral power of 2.

Here, a sector is the smallest storage unit of the disk at the physical level.

Group

Blocks are in turn grouped into larger units called block groups.

Quoted from https://blog.51cto.com/u_15265005:

The Ext4 file system divides the disk space into several groups and manages the space group by group. Such a group is called a block group (Block Group); each one contains the metadata needed to manage the disk area it covers.

Inode

An inode is an entry in the inode table.

In the Linux operating system, files are identified by inodes, and each file has an inode on disk. In Ext2 file systems, these inodes are usually placed together in a relatively centralized area called the inode table.

Practice

View all mounted devices

# Show the mounted file systems and their types
df -T

# List the device names (ll is the usual Ubuntu alias for ls -alF)
ll /dev/*

View the file system superblock

sudo dumpe2fs -h /dev/sda5
Inode count:              1277952
Block count:              5111040
Reserved block count:     255552
Free blocks:              2173751
Free inodes:              1005093
First block:              0

Block size:               4096
 One block holds 4096 bytes. The block is the basic unit of file storage in the file system.

Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768

Inodes per group:         8192
 One inode corresponds to one file, so a group can hold at most 8192 files.
A file system contains many groups, so Inode count is much larger than Inodes per group here.

Inode blocks per group:   512
The inode table is too large for a single block, so it takes up several blocks.
My understanding of this attribute is: Inode blocks per group * (Block size / Inode size) = Inodes per group
 512 * (4096 / 256) = 8192

Flex block group size:    16

Filesystem created:       Mon Oct  5 13:24:59 2020
Last mount time:          Thu Sep 16 00:01:46 2021
Last write time:          Thu Sep 16 00:01:42 2021
Mount count:              40
Maximum mount count:      -1
Last checked:             Mon Oct  5 13:24:59 2020
Check interval:           0 (<none>)
Lifetime writes:          88 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11

Inode size:               256
 Here Inode size is the size of one inode (in bytes).

Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       655386
Default directory hash:   half_md4
Directory Hash Seed:      11778f68-41b9-4d07-9b77-714815ee7721
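
The relationships between these superblock fields can be checked with a little arithmetic. The following is only a sketch in Python: the constants are copied from the dumpe2fs output above, while the function names are my own and not part of any tool.

```python
import math

# Values copied from the dumpe2fs output above.
INODE_COUNT = 1277952
BLOCK_COUNT = 5111040
BLOCK_SIZE = 4096          # bytes
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 8192
INODE_SIZE = 256           # bytes


def group_count(block_count: int, blocks_per_group: int) -> int:
    """Number of block groups; the last group may be only partly filled."""
    return math.ceil(block_count / blocks_per_group)


def inode_table_blocks(inodes_per_group: int, inode_size: int, block_size: int) -> int:
    """Number of blocks needed to hold the inode table of one group."""
    return inodes_per_group * inode_size // block_size


groups = group_count(BLOCK_COUNT, BLOCKS_PER_GROUP)
print(groups)                                     # 156
print(groups * INODES_PER_GROUP == INODE_COUNT)   # True
print(inode_table_blocks(INODES_PER_GROUP, INODE_SIZE, BLOCK_SIZE))  # 512
```

So this file system has 156 block groups, Inode count is exactly 156 * 8192, and each group's inode table takes the 512 blocks reported as Inode blocks per group.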

Check the stat of a small file: why is Blocks 8?

linux@ubuntu:~/temp$ cat 1.txt
123456
linux@ubuntu:~/temp$ stat 1.txt
  File: 1.txt
  Size: 7         	Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d	Inode: 398584      Links: 2
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   linux)   Gid: ( 1000/   linux)
Access: 2021-09-16 04:05:42.486923173 -0700
Modify: 2021-09-11 19:36:52.889907621 -0700
Change: 2021-09-11 19:36:52.889907621 -0700
 Birth: -

Answer (see https://blog.csdn.net/daiyudong2020/article/details/53897775):
In the output of stat, "Blocks" is counted in units of 512 bytes.
The Block size of the file system, on the other hand, is 4096 bytes, as shown above. However small a file is, two files never share the same block (in other words, the block is the basic storage unit of the file system and cannot be subdivided), so even a 7-byte file occupies one whole block. Finally, 4096 / 512 = 8.
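
The same calculation as a small Python sketch. It assumes the file is stored in whole file system blocks with no extra metadata blocks, no holes, and no inline data; the names stat_blocks and allocated_bytes are hypothetical helpers, not part of any tool.

```python
import math
import os

STAT_UNIT = 512  # bytes per unit in the "Blocks" column of stat


def stat_blocks(file_size: int, fs_block_size: int = 4096) -> int:
    """Expected "Blocks" value for a file stored in whole file system blocks."""
    fs_blocks = math.ceil(file_size / fs_block_size)
    return fs_blocks * fs_block_size // STAT_UNIT


def allocated_bytes(path: str) -> int:
    """Space actually allocated to a file, derived from st_blocks (Unix only)."""
    return os.stat(path).st_blocks * STAT_UNIT


print(stat_blocks(7))      # 8
print(stat_blocks(4097))   # 16
```

Calling allocated_bytes("1.txt") on the file above would be expected to return 4096, that is, 8 units of 512 bytes.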

View group information

Group information (the dumpe2fs command used above to query the superblock also prints the information for every group when it is run without -h):
Group 4: (Blocks 131072-163839) csum 0xfa75 [INODE_UNINIT, ITABLE_ZEROED]
163839 - 131072 + 1 = 32768, matching Blocks per group above

  Block bitmap at 1032 (bg #0 + 1032), csum 0xf68bc9f4
  Inode bitmap at 1048 (bg #0 + 1048), csum 0x00000000
  Inode table at 3108-3619 (bg #0 + 3108)
  3619 - 3108 + 1 = 512, matching Inode blocks per group (512) above

  The 8192 free inodes below match Inodes per group (8192)
  0 free blocks, 8192 free inodes, 0 directories, 8192 unused inodes

  Free blocks:
  Free inodes: 32769-40960
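
The block range and inode range of a group follow directly from the superblock values. Below is a minimal Python sketch under the assumptions that First block is 0 (as in the superblock above) and that the group is not the shorter last group; the function names are made up for illustration.

```python
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 8192
FIRST_BLOCK = 0  # "First block" from the superblock


def group_block_range(group: int) -> tuple:
    """First and last block number of a (full) block group."""
    start = FIRST_BLOCK + group * BLOCKS_PER_GROUP
    return start, start + BLOCKS_PER_GROUP - 1


def group_inode_range(group: int) -> tuple:
    """First and last inode number of a block group (inode numbers start at 1)."""
    start = group * INODES_PER_GROUP + 1
    return start, start + INODES_PER_GROUP - 1


def inode_location(inode: int) -> tuple:
    """Block group of an inode and its index inside that group's inode table."""
    return divmod(inode - 1, INODES_PER_GROUP)


print(group_block_range(4))     # (131072, 163839)
print(group_inode_range(4))     # (32769, 40960)
print(inode_location(398584))   # (48, 5367)
```

The first two results reproduce the "Blocks 131072-163839" and "Free inodes: 32769-40960" lines of group 4. The last one says that inode 398584 (the file 1.txt) is entry 5367 of the inode table of group 48.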

Analyze the inode structure

  About little-endian byte order:

All fields in ext4 are written to disk in little endian order. 

HOWEVER, all fields in jbd2 (the journal) are written to disk in big-endian order.

The journal (jbd2) is concerned with crash recovery and will not be discussed in depth here. In short, we will rely on little-endian byte order when interpreting the inode structure in the following part.
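
As an illustration of the difference, here is a short Python sketch that decodes two well-known magic numbers with the struct module: the extent header magic 0xF30A, which ext4 stores little-endian, and the jbd2 magic 0xC03B3998, which the journal stores big-endian. The byte strings are typed in by hand and were not read from this disk.

```python
import struct

# The extent header magic 0xF30A appears on disk as the bytes 0a f3.
ext4_raw = bytes.fromhex("0af3")
print(hex(struct.unpack("<H", ext4_raw)[0]))   # 0xf30a  (little-endian: correct)
print(hex(struct.unpack(">H", ext4_raw)[0]))   # 0xaf3   (big-endian: wrong reading)

# The jbd2 magic 0xC03B3998 appears on disk as the bytes c0 3b 39 98.
jbd2_raw = bytes.fromhex("c03b3998")
print(hex(struct.unpack(">I", jbd2_raw)[0]))   # 0xc03b3998 (big-endian: correct)
```

In a hex dump, a little-endian field therefore has to be read "backwards", byte by byte.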

(This requires debugfs; reference: https://blog.csdn.net/xingkong_678/article/details/40687209)
debugfs:  stat ./1.txt

-- Output results --
Inode: 398584   Type: regular    Mode:  0664   Flags: 0x80000
0x80000: the inode uses extents (EXT4_EXTENTS_FL). In Ext4, an inode uses an extent tree to record the numbers of the physical blocks a file occupies. (The contents of a file are stored in blocks in order, but the positions of these blocks within a group are not necessarily contiguous; the position within the file is called the "logical block" and the position on disk the "physical block".)

Generation: 1037448172    Version: 0x00000000:00000001
User:  1000   Group:  1000   Project:     0   Size: 7
File ACL: 0
Links: 2   Blockcount: 8
 Here the "Block" in Blockcount is again a unit of 512 bytes.

Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x613d67c4:d42ba694 -- Sat Sep 11 19:36:52 2021
 atime: 0x61432506:74176e94 -- Thu Sep 16 04:05:42 2021
 mtime: 0x613d67c4:d42ba694 -- Sat Sep 11 19:36:52 2021
crtime: 0x613d620c:592e44c4 -- Sat Sep 11 19:12:28 2021
Size of extra inode fields: 32
Inode checksum: 0xc67f4100
EXTENTS:
(0):2670562
-- Output results --

Alternatively, the following command directly prints the numbers of the physical blocks the file occupies:
debugfs:  blocks ./1.txt

Read the data of a specified block number

sudo dd if=/dev/sda5 bs=4096 count=1 skip=2670562
 If you accidentally enter the wrong bs, the command outputs unexpected content. The reason is that skip is counted in units of bs: dd divides the device into blocks of the given size and starts reading at byte offset skip * bs.
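
What this dd command does can be sketched in a few lines of Python. The function read_block is a hypothetical helper, and the "device" here is an in-memory stand-in so that the example runs without root privileges.

```python
import io


def read_block(dev, block_no: int, block_size: int = 4096) -> bytes:
    """Equivalent of: dd if=<dev> bs=<block_size> count=1 skip=<block_no>"""
    dev.seek(block_no * block_size)
    return dev.read(block_size)


# Stand-in for the device: 4 blocks of 4096 bytes, each filled with its own index.
fake_dev = io.BytesIO(b"".join(bytes([i]) * 4096 for i in range(4)))

print(read_block(fake_dev, 2)[:4])        # b'\x02\x02\x02\x02'
# With the wrong block size, the same skip value lands at byte 1024, still inside block 0.
print(read_block(fake_dev, 2, 512)[:4])   # b'\x00\x00\x00\x00'
```

On a real system, the device would be opened with open("/dev/sda5", "rb") as root and read_block(dev, 2670562) would return the block that holds the contents of 1.txt.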

Try the inode of a slightly larger file

-- Output results --
Inode: 404013   Type: regular    Mode:  0600   Flags: 0x80000
Generation: 3563925921    Version: 0x00000000:00000001
User:  1000   Group:  1000   Project:     0   Size: 27781
File ACL: 0
Links: 1   Blockcount: 64
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x6142ec58:ac656cf4 -- Thu Sep 16 00:03:52 2021
 atime: 0x6143f98d:553064f8 -- Thu Sep 16 19:12:29 2021
 mtime: 0x613d9245:44243738 -- Sat Sep 11 22:38:13 2021
crtime: 0x5f7b2021:4a0c5a08 -- Mon Oct  5 06:31:13 2020
Size of extra inode fields: 32
Inode checksum: 0xcaf276a7
EXTENTS:
(ETB0):2111400, (0):1626117, (1):1618417, (2):1618715, (3):1618548, (4):2107063, (5):2105480, (6):2109895
 Here ETB0 is an extent tree block: a block used to maintain the mapping between logical blocks and physical blocks.
(0, 1, 2, ...) can be read as logical block numbers, and the numbers after the colons (1626117, 1618417, ...) as the corresponding physical block numbers.
-- Output results --
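
To make the logical-to-physical mapping concrete, the following Python sketch translates a byte offset inside this file into a byte offset on the device. The mapping table is copied from the EXTENTS line above; the function name device_offset is my own.

```python
BLOCK_SIZE = 4096

# logical block -> physical block, copied from the EXTENTS line above
EXTENT_MAP = {
    0: 1626117, 1: 1618417, 2: 1618715, 3: 1618548,
    4: 2107063, 5: 2105480, 6: 2109895,
}


def device_offset(file_offset: int) -> int:
    """Translate a byte offset in the file into a byte offset on the device."""
    logical, within = divmod(file_offset, BLOCK_SIZE)
    return EXTENT_MAP[logical] * BLOCK_SIZE + within


# Byte 5000 of the file lies in logical block 1, 904 bytes into that block.
print(divmod(5000, BLOCK_SIZE))                             # (1, 904)
print(device_offset(5000) == 1618417 * BLOCK_SIZE + 904)    # True
```

The file is 27781 bytes long, so its last byte (offset 27780) falls in logical block 6, which is why exactly seven data blocks are listed.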

Try a larger file (~900 MB)

Inode: 12   Type: regular    Mode:  0600   Flags: 0x80000
Generation: 2105891747    Version: 0x00000000:00000001
User:     0   Group:     0   Project:     0   Size: 968110080
File ACL: 0
Links: 1   Blockcount: 1890848
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5f7b109c:a60a2c64 -- Mon Oct  5 05:25:00 2020
 atime: 0x6142ebda:b44661b8 -- Thu Sep 16 00:01:46 2021
 mtime: 0x5f7b109c:a60a2c64 -- Mon Oct  5 05:25:00 2020
crtime: 0x5f7b811b:2df0e80c -- Mon Oct  5 13:24:59 2020
Size of extra inode fields: 32
Inode checksum: 0x6e263862
EXTENTS:
(ETB0):33796, (0-32767):34816-67583, (32768-63487):67584-98303, (63488-96255):100352-133119, (96256-126975):133120-163839, (126976-159743):165888-198655, (159744-190463):198656-229375, (190464-223231):231424-264191, (223232-236354):264192-277314
 Unlike the first-level and second-level index blocks described in textbooks, Ext4 uses a different method to record the blocks a file uses.
Note that the intervals in parentheses together cover logical blocks 0-236354, that is, 236355 blocks. One block is eight 512-byte units, and the extent tree block ETB0 counts as one more block, so (236355 + 1) * 8 = 1890848, which is exactly Blockcount.
 These blocks are stored in several runs of contiguous physical blocks, one run for each interval outside the parentheses.
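
This bookkeeping can be verified mechanically. The Python sketch below parses an EXTENTS line in the format printed by debugfs above and recomputes Blockcount; the regular expression and the function names are my own and are only known to cover the forms that appear in this article.

```python
import re

EXTENTS = (
    "(ETB0):33796, (0-32767):34816-67583, (32768-63487):67584-98303, "
    "(63488-96255):100352-133119, (96256-126975):133120-163839, "
    "(126976-159743):165888-198655, (159744-190463):198656-229375, "
    "(190464-223231):231424-264191, (223232-236354):264192-277314"
)


def count_blocks(extents_line: str) -> tuple:
    """Return (data_blocks, tree_blocks) for a debugfs EXTENTS line."""
    data_blocks = tree_blocks = 0
    for logical, physical in re.findall(r"\(([^)]+)\):(\d+(?:-\d+)?)", extents_line):
        first, _, last = physical.partition("-")
        count = int(last or first) - int(first) + 1
        if logical.startswith("ETB"):
            tree_blocks += count   # extent tree block (metadata)
        else:
            data_blocks += count   # file data
    return data_blocks, tree_blocks


def blockcount(extents_line: str, block_size: int = 4096) -> int:
    """Blockcount in 512-byte units, counting data and extent tree blocks."""
    data_blocks, tree_blocks = count_blocks(extents_line)
    return (data_blocks + tree_blocks) * block_size // 512


print(count_blocks(EXTENTS))   # (236355, 1)
print(blockcount(EXTENTS))     # 1890848
```

The same function applied to the EXTENTS line of the 27781-byte file (seven data blocks plus ETB0) gives 64, which matches its Blockcount as well.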

Extents are arranged as a tree. Each node of the tree begins with a struct ext4_extent_header.

A node of the extent tree occupies one block, and its ext4_extent_header indicates whether it is a leaf node or a non-leaf node.

If the node is an interior node (eh.eh_depth > 0), the header is followed by eh.eh_entries instances of struct ext4_extent_idx; each of these index entries points to a block containing more nodes in the extent tree.

A non-leaf node contains eh_entries entries of type struct ext4_extent_idx, each pointing to a node in the next level.

If the node is a leaf node (eh.eh_depth == 0), then the header is followed by eh.eh_entries instances of struct ext4_extent; these instances point to the file's data blocks.

A leaf node contains eh_entries entries of type struct ext4_extent, each pointing to a specific run of file blocks (e.g. 34816-67583, 67584-98303, ... above).

The root node of the extent tree is stored in inode.i_block, which allows for the first four extents to be recorded without the use of extra metadata blocks.

When a file occupies only a few blocks, its extent entries are stored directly in inode.i_block, which saves space.

Next, using the structure diagram of the extent tree (address) and the hex dump of ETB0, we look at the structure of an extent node:

eh_depth (purple): the depth of this node in the tree. If the value is zero, the node is a leaf node.

ee_block (red): the first file block number that the extent covers

ee_len (green): the number of blocks covered by the extent. The documentation states that a value > 32768 indicates that the extent is uninitialized (to be studied further).

Because the blocks covered by an extent are contiguous, only the number of the first block needs to be recorded:
ee_start_hi (blue): the upper 16 bits of the physical block number

ee_start_lo (black): the lower 32 bits of the physical block number (corresponding to 34816, 67584, 100352, ... in the output above)
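
The fields above can be decoded with Python's struct module. This is only a sketch: the field order and sizes follow struct ext4_extent_header and struct ext4_extent as I understand them from the reference document, and the node is a synthetic one rebuilt from the first two extents of the 900 MB file, not the real contents of ETB0.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A
HEADER = struct.Struct("<HHHHI")   # eh_magic, eh_entries, eh_max, eh_depth, eh_generation
EXTENT = struct.Struct("<IHHI")    # ee_block, ee_len, ee_start_hi, ee_start_lo


def parse_leaf(node: bytes) -> list:
    """Return (logical_start, length, physical_start) for each extent in a leaf node."""
    magic, entries, _max, depth, _generation = HEADER.unpack_from(node, 0)
    if magic != EXT4_EXT_MAGIC:
        raise ValueError("not an extent tree node")
    if depth != 0:
        raise ValueError("interior node: entries are ext4_extent_idx, not ext4_extent")
    extents = []
    for i in range(entries):
        ee_block, ee_len, start_hi, start_lo = EXTENT.unpack_from(
            node, HEADER.size + i * EXTENT.size
        )
        physical = (start_hi << 32) | start_lo   # 48-bit physical block number
        extents.append((ee_block, ee_len, physical))
    return extents


# Synthetic leaf node (illustrative values taken from the debugfs output above).
node = (
    HEADER.pack(EXT4_EXT_MAGIC, 2, 340, 0, 0)
    + EXTENT.pack(0, 32768, 0, 34816)
    + EXTENT.pack(32768, 30720, 0, 67584)
)
print(parse_leaf(node))   # [(0, 32768, 34816), (32768, 30720, 67584)]
```

Both the header and each extent entry are 12 bytes long, so a 4096-byte block has room for (4096 - 12) / 12 = 340 entries, which is the eh_max value used in the synthetic node.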

Analyze the directory structure

-- Output results --
Inode: 414868   Type: directory    Mode:  0775   Flags: 0x80000
Generation: 1424482529    Version: 0x00000000:00000030
User:  1000   Group:  1000   Project:     0   Size: 4096
File ACL: 0
Links: 2   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x6143e6b3:16f62484 -- Thu Sep 16 17:52:03 2021
 atime: 0x6143f20c:9b797f04 -- Thu Sep 16 18:40:28 2021
 mtime: 0x6143e6b3:16f62484 -- Thu Sep 16 17:52:03 2021
crtime: 0x604cb298:c1345480 -- Sat Mar 13 04:39:52 2021
Size of extra inode fields: 32
Inode checksum: 0x89eda6de
EXTENTS:
(0):1584381
-- Output results --

Use the following command to view the contents of the block:
sudo dd if=/dev/sda5 bs=4096 count=1 skip=1584381 | hexdump -C

In Linux, a directory is essentially a mapping from file names to inodes:

Directory is more or less a flat file that maps an arbitrary byte string (usually ASCII) to an inode number on the filesystem.

Secondly, note that the Ext4 file system has two kinds of directory structure: (1) linear (classic) directories and (2) hash tree directories. For the latter, the inode of the directory has the flag 0x1000 (EXT4_INDEX_FL) set.
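
Checking the flag is a simple bit test. A minimal Python sketch, in which the function name directory_kind is hypothetical:

```python
EXT4_INDEX_FL = 0x1000      # directory uses a hash tree
EXT4_EXTENTS_FL = 0x80000   # inode uses extents


def directory_kind(flags: int) -> str:
    """Classify a directory inode by its Flags value."""
    return "hash tree" if flags & EXT4_INDEX_FL else "linear"


print(directory_kind(0x80000))   # linear
print(directory_kind(0x81000))   # hash tree
```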

Next, with the help of the references, we analyze the contents of block 1584381. The Flags value above is 0x80000, in which the 0x1000 bit is not set, so this is the first kind of directory structure (linear).

inode (red): the inode number of the file

rec_len (green): the length of this directory entry (for example, the first entry occupies 12 bytes in total, stored here as 0c 00).

name_len (blue): the length of the file name

file_type (yellow): the file type

??? (pink): content that at first looks unknown. It follows the file name "1.txt". Since name_len is 5, the pink bytes are not part of the file name; and since rec_len is 16, they still belong to the directory entry for "1.txt". The fixed fields take 8 bytes and the name 5, which gives 13, and rec_len is rounded up to a multiple of 4, so these 3 bytes are padding. I speculate that they are simply leftover bytes that were never zeroed.
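
Walking a linear directory block can be sketched as follows in Python. The layout (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name) follows the field list above. The block is a synthetic one built for illustration: the inode number 2 for ".." and the padding bytes "xyz" are made up, and a real block would be 4096 bytes long with the last entry's rec_len reaching to the end of the block.

```python
import struct

DIRENT_HEAD = struct.Struct("<IHBB")   # inode, rec_len, name_len, file_type


def min_rec_len(name_len: int) -> int:
    """Smallest valid rec_len: 8 header bytes plus the name, rounded up to 4."""
    return (DIRENT_HEAD.size + name_len + 3) & ~3


def parse_linear_dir(block: bytes) -> list:
    """Return (inode, rec_len, file_type, name) for each used entry in the block."""
    entries = []
    offset = 0
    while offset + DIRENT_HEAD.size <= len(block):
        inode, rec_len, name_len, file_type = DIRENT_HEAD.unpack_from(block, offset)
        if rec_len < DIRENT_HEAD.size:
            break   # zeroed or corrupt data; stop instead of looping forever
        if inode != 0:   # inode 0 marks an unused entry
            start = offset + DIRENT_HEAD.size
            name = block[start:start + name_len].decode("utf-8", "replace")
            entries.append((inode, rec_len, file_type, name))
        offset += rec_len
    return entries


# Synthetic block: ".", ".." and "1.txt" followed by 3 bytes of stale padding.
block = (
    DIRENT_HEAD.pack(414868, 12, 1, 2) + b".\x00\x00\x00"
    + DIRENT_HEAD.pack(2, 12, 2, 2) + b"..\x00\x00"
    + DIRENT_HEAD.pack(398584, 16, 5, 1) + b"1.txt" + b"xyz"
)
for entry in parse_linear_dir(block):
    print(entry)
print(min_rec_len(1), min_rec_len(5))   # 12 16
```

The parser never looks at the bytes between the end of the name and the end of the entry, which is why stale padding there is harmless.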

Topics still to be covered

  • Changes to a file's parent directory after the file is deleted
  • Structural analysis of hash tree directories
  • Analysis of the journal structure

Tags: Linux

Posted on Wed, 22 Sep 2021 20:52:09 -0400 by ineedhelpbigtime