Deep understanding of system calls

Experiment content:

Find the system call whose number matches the last two digits of the student number (here, system call 97)
Trigger the system call with assembly instructions
Trace the kernel's handling of the system call with gdb
Analyze how the system call entry saves the context, restores the context and returns, and how the kernel stack changes during the system call

Experimental environment:

Ubuntu 18.04.4 running in a VMware virtual machine; the experiment uses the linux-5.4.34 kernel.

1 Environment preparation

1.1 kernel compilation

Revert the patch applied in Experiment 1 and generate a default configuration:

cd linux-5.4.34
patch -R -p1 < ../mykernel-2.0_for_linux-5.4.34.patch
make defconfig

Modify the kernel configuration as follows and recompile:

# Turn on debug-related options
Kernel hacking --->
    Compile-time checks and compiler options --->
        [*] Compile the kernel with debug info
        [*] Provide GDB scripts for kernel debugging
    [*] Kernel debugging
# Turn off KASLR, otherwise breakpoints fail
Processor type and features --->
    [ ] Randomize the address of the kernel image (KASLR)

make menuconfig
make -j$(nproc)

Boot the kernel. At this point it fails to start normally and reports a kernel panic:

qemu-system-x86_64 -kernel arch/x86/boot/bzImage

The error message shows that the kernel panics because there is no root file system for it to mount.

1.2 make root file system

When a computer boots, the bootloader first loads the kernel, and the kernel then mounts an in-memory root file system that contains the necessary device drivers and tools.

To keep the experimental environment simple, only BusyBox is used to build a minimal in-memory root file system that provides the basic user-space programs.

First, download the BusyBox source code from https://www.busybox.net and unpack it; then configure, compile and install it.

axel -n 20 https://busybox.net/downloads/busybox-1.31.1.tar.bz2
tar -jxvf busybox-1.31.1.tar.bz2

Configure BusyBox to be built as a statically linked binary, so that it does not depend on any shared libraries:

Settings --->
    [*] Build static binary (no shared libs)

cd busybox-1.31.1
make menuconfig

Compile and install. By default, BusyBox is installed into the _install directory inside the source tree.

make -j$(nproc) && make install

Create the directory tree for the in-memory root file system:

mkdir rootfs
cd rootfs
cp ../busybox-1.31.1/_install/* ./ -rf
mkdir dev proc sys home
sudo cp -a /dev/ dev/

Add an init script (rootfs/init) to the top of the root file system tree, with the following content:

#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
echo "Wellcome MengningOS!"
echo "--------------------"
cd home
/bin/sh

Make the init script executable:

chmod +x init

Package the tree as an in-memory root file system image:

find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../rootfs.cpio.gz

Boot with the root file system image to check that the init script runs after the kernel starts:

qemu-system-x86_64 -kernel linux-5.4.34/arch/x86/boot/bzImage -initrd rootfs.cpio.gz

Once the root file system image has been loaded into memory (here by QEMU, which plays the role of the bootloader), the kernel mounts it as the root directory and runs its init script to perform the startup tasks.
A full system would finally mount the real on-disk root file system; in this experiment the init script simply starts a shell.

2 system call

2.1 find system call

Look up the system call in linux-5.4.34/arch/x86/entry/syscalls/syscall_64.tbl. System call 97 is getrlimit, and its kernel entry point is __x64_sys_getrlimit.

2.2 trigger system call

getrlimit retrieves the resource limits that apply to the calling process.

Create getrlimit_test.c in the rootfs/home/ directory:

#include <stdio.h>
#include <sys/resource.h>

int main()
{
    struct rlimit limit;
    int ret = getrlimit(RLIMIT_NOFILE, &limit);
    printf("ret = %d,\tcur = %ld,\tmax = %ld\n", ret, limit.rlim_cur, limit.rlim_max);
    return 0;
}

The function returns 0 on success; on failure it returns -1 and sets errno.

Here, RLIMIT_NOFILE is the limit on the number of files a process can have open at the same time.

limit.rlim_cur is the current soft limit, which the kernel actually enforces; limit.rlim_max is the hard limit, which is the ceiling for the soft limit.

Compile it as a static binary:

gcc -o getrlimit_test getrlimit_test.c -static

The code test results are as follows:

Once the getrlimit test runs correctly, trigger the same system call directly with inline assembly:

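#include <stdio.h>
#include <sys/resource.h>

int main()
{
    struct rlimit limit;
    int ret = -1;

    asm volatile(
        "movl %1, %%edi\n\t"       /* first argument: resource */
        "movq %2, %%rsi\n\t"       /* second argument: pointer to struct rlimit */
        "movl $0x61, %%eax\n\t"    /* system call number 97 (getrlimit) */
        "syscall\n\t"
        "movl %%eax, %0\n\t"       /* return value: 0 or a negative errno */
        : "=m"(ret)
        : "r"(RLIMIT_NOFILE), "r"(&limit)
        /* syscall overwrites rcx and r11; the kernel writes to limit */
        : "rax", "rdi", "rsi", "rcx", "r11", "memory"
    );

    printf("ret = %d,\tcur = %ld,\tmax = %ld\n", ret, limit.rlim_cur, limit.rlim_max);
    return 0;
}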

2.3 trace system call kernel process

Rebuild the root file system image (from the rootfs directory) so that it includes the test program:

find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../rootfs.cpio.gz

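Start QEMU without a graphical window (-nographic, with the kernel console on the serial port), with the CPU frozen at startup (-S) and a gdb server listening on TCP port 1234 (-s):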

qemu-system-x86_64 -kernel linux-5.4.34/arch/x86/boot/bzImage -initrd rootfs.cpio.gz -S -s -nographic -append "console=ttyS0"

Open a new terminal for gdb debugging:

cd linux-5.4.34
gdb vmlinux
target remote:1234
c

Set a breakpoint on the system call handler, then run the test program in the guest to hit it:

b __x64_sys_getrlimit

2.4 system call process analysis

Read the system call entry code and analyze how it saves the context, restores the context and returns from the system call, paying particular attention to how the kernel stack changes during the system call.