Trap source code analysis

A switch from user space to kernel space is usually called a trap.

Three forms of trap
  1. A system call
  2. An exception (for example, an illegal instruction or a page fault)
  3. A device interrupt (timer interrupt, I/O interrupt, network interrupt, etc.)

Privileges of supervisor mode

What is the difference between user mode and kernel (supervisor) mode? In fact, the difference is small:

First, supervisor mode can read and write the control registers: SATP, which holds the pointer to the page table; STVEC, the address of the kernel instruction that handles traps; SEPC, which saves the program counter when a trap occurs; SSCRATCH; and so on. User code cannot read or write these registers.

Second, supervisor mode can use PTEs whose PTE_U flag bit is 0 (see the description of page table entries for details). When the PTE_U flag bit is 1, user code can use the page; when it is 0, only supervisor mode can use the page.

These are the only extra things supervisor mode can do.
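
The PTE_U rule can be sketched as a small C check. This is a toy model, not xv6 code: the names priv_mode and can_use_page are made up for illustration, only the valid bit and the user bit are modeled, and it assumes the sstatus.SUM bit is clear, in which case RISC-V does not let supervisor mode touch PTE_U pages directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_V (1ULL << 0)  /* entry is valid */
#define PTE_U (1ULL << 4)  /* page may be used from user mode */

/* Hypothetical names, for illustration only. */
enum priv_mode { MODE_USER, MODE_SUPERVISOR };

/* Toy permission check: models only the PTE_V and PTE_U bits. */
static bool can_use_page(uint64_t pte, enum priv_mode mode)
{
  if ((pte & PTE_V) == 0)
    return false;               /* an invalid entry is usable by nobody */

  bool user_page = (pte & PTE_U) != 0;
  if (mode == MODE_USER)
    return user_page;           /* user code needs PTE_U == 1 */

  /* Supervisor mode, assuming sstatus.SUM == 0: only PTE_U == 0 pages. */
  return !user_page;
}
```

With these definitions, can_use_page(PTE_V, MODE_USER) is false and can_use_page(PTE_V, MODE_SUPERVISOR) is true, which is the asymmetry described above.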

ecall instruction

User programs make system calls with the ecall instruction, which does three things.

First, ecall switches the CPU from user mode to supervisor mode.

Second, ecall saves the value of the program counter in the SEPC register.

Third, ecall jumps to the instruction that the STVEC register points to, which in xv6 is the uservec code in trampoline.S.
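
The three effects can be written down as a toy model in C. The struct hart and the function ecall below are hypothetical and model only the registers named in this section; real hardware also updates sstatus and other state that is not shown. The cause value 8 is the one usertrap tests for later in this article.

```c
#include <assert.h>
#include <stdint.h>

enum priv_mode { MODE_USER, MODE_SUPERVISOR };

/* Hypothetical, heavily simplified view of one hart's state. */
struct hart {
  enum priv_mode mode;
  uint64_t pc;      /* program counter */
  uint64_t sepc;    /* PC saved at the time of the trap */
  uint64_t stvec;   /* address of the kernel's trap entry code */
  uint64_t scause;  /* reason for the trap */
};

#define SCAUSE_ECALL_FROM_USER 8

/* Apply the three effects of ecall, plus recording the cause. */
static void ecall(struct hart *h)
{
  h->mode = MODE_SUPERVISOR;            /* 1. user mode -> supervisor mode */
  h->sepc = h->pc;                      /* 2. remember where the trap happened */
  h->scause = SCAUSE_ECALL_FROM_USER;   /*    record why it happened */
  h->pc = h->stvec;                     /* 3. continue at the trap vector */
}
```

Note what the model leaves untouched: there is no page table, stack pointer or general register in it, because ecall changes none of them. That work is left to the kernel's trap entry code.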

The whole trap process

Take the exec system call as an example.

First, the user program calls the exec function. The exec that user code links against is not the kernel implementation. It is a small stub (in xv6, generated into usys.S) that writes the number of the exec system call into the a7 register and executes the ecall instruction, which jumps to trampoline.S.

At this point the CPU has entered supervisor mode, but the page table is still the user page table. The code in trampoline.S saves the user registers and other state in the trapframe, a per-process area that is mapped in the user page table, so that the user process can be restored later. This is also what makes the trap transparent to the user process. Once the state is saved, the code switches to the kernel page table and jumps to the corresponding kernel code, the usertrap function.

usertrap calls the appropriate handler depending on the cause of the trap, which is stored in the scause register.
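
As a rough sketch of that decision, the function below sorts an scause value into the three forms of trap listed at the top. The names trap_kind and classify_trap are invented for this example. It relies on two facts: the top bit of scause is set for interrupts and clear for exceptions, and the value 8 means an ecall from user mode. The real xv6 code is more selective about which interrupts it accepts.

```c
#include <assert.h>
#include <stdint.h>

#define SCAUSE_INTERRUPT  (1ULL << 63)  /* set: interrupt, clear: exception */
#define SCAUSE_USER_ECALL 8ULL          /* environment call from user mode */

/* Hypothetical names, for illustration only. */
enum trap_kind { TRAP_SYSCALL, TRAP_DEVICE_INTERRUPT, TRAP_EXCEPTION };

static enum trap_kind classify_trap(uint64_t scause)
{
  if (scause & SCAUSE_INTERRUPT)
    return TRAP_DEVICE_INTERRUPT;   /* timer, external device, ... */
  if (scause == SCAUSE_USER_ECALL)
    return TRAP_SYSCALL;            /* user code executed ecall */
  return TRAP_EXCEPTION;            /* page fault, illegal instruction, ... */
}
```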

Code parsing

The code below is taken from the operating system's source and contains many details. What matters is understanding the overall flow, so don't pay too much attention to the details.

Saving state on a trap
  1. First, we need to save the 32 user registers, because we want to resume the user application later, in particular when it is interrupted at an arbitrary point by a device interrupt. (save the user registers)
  2. The program counter also needs to be saved somewhere. It matters as much as a user register: we must be able to continue the user program exactly where it was interrupted. (save system registers such as the PC)
  3. We need to switch to supervisor mode, because the kernel uses various privileged instructions. (change the privilege mode)
  4. The SATP (Supervisor Address Translation and Protection) register currently points to the user page table, which contains only the mappings the user program needs plus one or two others. It does not map the kernel's data as a whole. So before running kernel code, we must point SATP at the kernel page table. (switch SATP to the kernel page table)
  5. We need to point the stack pointer at an address in the kernel, because we need a stack to call the kernel's C functions. (switch to the kernel stack)
  6. Once all of this is set up and the hardware state is suitable for the kernel, we jump into the kernel's C code, the usertrap function.
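
The trampoline code addresses the trapframe by fixed byte offsets such as 40(a0) for ra and 112(a0) for a0. The C struct below is a sketch of the layout those offsets imply. The kernel_* field names come from the comments in the assembly. The epc field at offset 24 is inferred from the unused slot between offsets 16 and 32 and from p->trapframe->epc in usertrap, so treat this as a reconstruction, not the authoritative definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t uint64;

/* Per-process save area used by the trampoline code (reconstructed). */
struct trapframe {
  /*   0 */ uint64 kernel_satp;   /* kernel page table */
  /*   8 */ uint64 kernel_sp;     /* top of the process's kernel stack */
  /*  16 */ uint64 kernel_trap;   /* address of usertrap() */
  /*  24 */ uint64 epc;           /* saved user program counter */
  /*  32 */ uint64 kernel_hartid; /* hart id, loaded into tp */
  /*  40 */ uint64 ra;
  /*  48 */ uint64 sp;
  /*  56 */ uint64 gp;
  /*  64 */ uint64 tp;
  /*  72 */ uint64 t0;
  /*  80 */ uint64 t1;
  /*  88 */ uint64 t2;
  /*  96 */ uint64 s0;
  /* 104 */ uint64 s1;
  /* 112 */ uint64 a0;
  /* 120 */ uint64 a1;
  /* 128 */ uint64 a2;
  /* 136 */ uint64 a3;
  /* 144 */ uint64 a4;
  /* 152 */ uint64 a5;
  /* 160 */ uint64 a6;
  /* 168 */ uint64 a7;
  /* 176 */ uint64 s2;
  /* 184 */ uint64 s3;
  /* 192 */ uint64 s4;
  /* 200 */ uint64 s5;
  /* 208 */ uint64 s6;
  /* 216 */ uint64 s7;
  /* 224 */ uint64 s8;
  /* 232 */ uint64 s9;
  /* 240 */ uint64 s10;
  /* 248 */ uint64 s11;
  /* 256 */ uint64 t3;
  /* 264 */ uint64 t4;
  /* 272 */ uint64 t5;
  /* 280 */ uint64 t6;
};

/* Compile-time checks that the struct matches the offsets in the assembly. */
static_assert(offsetof(struct trapframe, kernel_sp) == 8, "kernel_sp offset");
static_assert(offsetof(struct trapframe, ra) == 40, "ra offset");
static_assert(offsetof(struct trapframe, a0) == 112, "a0 offset");
static_assert(offsetof(struct trapframe, t6) == 280, "t6 offset");
```

Keeping the offsets in comments next to each field makes it easy to compare the C view with each sd and ld instruction in the assembly.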

Trampoline code (trampoline.S)
        # code to switch between user and kernel space.
        # this code is mapped at the same virtual address
        # (TRAMPOLINE) in user and kernel space so that
        # it continues to work when it switches page tables.
        # kernel.ld causes this to be aligned
        # to a page boundary.
.section trampsec
.globl trampoline
trampoline:
.align 4
.globl uservec
uservec:
        # Save the trap state.
        # trap.c sets stvec to point here, so
        # traps from user space start here,
        # in supervisor mode, but with a
        # user page table.
        # sscratch points to where the process's p->trapframe is
        # mapped into user space, at TRAPFRAME.
        # swap a0 and sscratch
        # so that a0 is TRAPFRAME
        csrrw a0, sscratch, a0

        # save the user registers in TRAPFRAME
        sd ra, 40(a0)
        sd sp, 48(a0)
        sd gp, 56(a0)
        sd tp, 64(a0)
        sd t0, 72(a0)
        sd t1, 80(a0)
        sd t2, 88(a0)
        sd s0, 96(a0)
        sd s1, 104(a0)
        sd a1, 120(a0)
        sd a2, 128(a0)
        sd a3, 136(a0)
        sd a4, 144(a0)
        sd a5, 152(a0)
        sd a6, 160(a0)
        sd a7, 168(a0)
        sd s2, 176(a0)
        sd s3, 184(a0)
        sd s4, 192(a0)
        sd s5, 200(a0)
        sd s6, 208(a0)
        sd s7, 216(a0)
        sd s8, 224(a0)
        sd s9, 232(a0)
        sd s10, 240(a0)
        sd s11, 248(a0)
        sd t3, 256(a0)
        sd t4, 264(a0)
        sd t5, 272(a0)
        sd t6, 280(a0)

        # save the user a0 in p->trapframe->a0
        csrr t0, sscratch
        sd t0, 112(a0)

        # restore kernel stack pointer from p->trapframe->kernel_sp
        ld sp, 8(a0)

        # make tp hold the current hartid, from p->trapframe->kernel_hartid
        ld tp, 32(a0)

        # load the address of usertrap(), p->trapframe->kernel_trap
        ld t0, 16(a0)

        # restore kernel page table from p->trapframe->kernel_satp
        ld t1, 0(a0)
        csrw satp, t1
        sfence.vma zero, zero

        # a0 is no longer valid, since the kernel page
        # table does not specially map p->tf.

        # jump to usertrap(), which does not return
        jr t0

        # Restore the trap state.
.globl userret
userret:
        # userret(TRAPFRAME, pagetable)
        # switch from kernel to user.
        # usertrapret() calls here.
        # a0: TRAPFRAME, in user page table.
        # a1: user page table, for satp.

        # switch to the user page table.
        csrw satp, a1
        sfence.vma zero, zero

        # put the saved user a0 in sscratch, so we
        # can swap it with our a0 (TRAPFRAME) in the last step.
        ld t0, 112(a0)
        csrw sscratch, t0

        # restore all but a0 from TRAPFRAME
        ld ra, 40(a0)
        ld sp, 48(a0)
        ld gp, 56(a0)
        ld tp, 64(a0)
        ld t0, 72(a0)
        ld t1, 80(a0)
        ld t2, 88(a0)
        ld s0, 96(a0)
        ld s1, 104(a0)
        ld a1, 120(a0)
        ld a2, 128(a0)
        ld a3, 136(a0)
        ld a4, 144(a0)
        ld a5, 152(a0)
        ld a6, 160(a0)
        ld a7, 168(a0)
        ld s2, 176(a0)
        ld s3, 184(a0)
        ld s4, 192(a0)
        ld s5, 200(a0)
        ld s6, 208(a0)
        ld s7, 216(a0)
        ld s8, 224(a0)
        ld s9, 232(a0)
        ld s10, 240(a0)
        ld s11, 248(a0)
        ld t3, 256(a0)
        ld t4, 264(a0)
        ld t5, 272(a0)
        ld t6, 280(a0)

        # restore user a0, and save TRAPFRAME in sscratch
        csrrw a0, sscratch, a0

        # return to user mode and user pc.
        # usertrapret() set up sstatus and sepc.
        sret

User trap handler (usertrap)
void
usertrap(void)
{
  int which_dev = 0;

  // Check that the trap really came from user mode.
  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");

  // send interrupts and exceptions to kerneltrap(),
  // since we're now in the kernel.
  // Writing the address of kernelvec to stvec makes kernelvec the
  // entry point for traps taken in kernel mode, just as uservec is
  // the entry point for traps from user mode. The two paths are very
  // similar and kernelvec is simpler, so it is not explained here.
  w_stvec((uint64)kernelvec);

  struct proc *p = myproc();

  // save user program counter.
  p->trapframe->epc = r_sepc();

  // Determine the cause of the trap from the scause register.
  if(r_scause() == 8){
    // system call

    if(p->killed)
      exit(-1);

    // sepc points to the ecall instruction,
    // but we want to return to the next instruction.
    p->trapframe->epc += 4;

    // an interrupt will change sstatus &c registers,
    // so don't enable until done with those registers.
    // Those registers are saved now, so enable interrupts.
    intr_on();

    // Execute the system call.
    syscall();
  } else if((which_dev = devintr()) != 0){
    // Device interrupt, already handled by devintr().
    // ok
  } else {
    printf("usertrap(): unexpected scause %p pid=%d\n", r_scause(), p->pid);
    printf("            sepc=%p stval=%p\n", r_sepc(), r_stval());
    p->killed = 1;
  }

  if(p->killed)
    exit(-1);

  // give up the CPU if this is a timer interrupt.
  if(which_dev == 2)
    yield();

  // After the trap is handled, restore the user process state.
  usertrapret();
}


System call dispatch (syscall calls the function pointer selected by the value stored in the a7 register)

extern uint64 sys_chdir(void);
extern uint64 sys_close(void);
extern uint64 sys_dup(void);
extern uint64 sys_exec(void);
extern uint64 sys_exit(void);
extern uint64 sys_fork(void);
extern uint64 sys_fstat(void);
extern uint64 sys_getpid(void);
extern uint64 sys_kill(void);
extern uint64 sys_link(void);
extern uint64 sys_mkdir(void);
extern uint64 sys_mknod(void);
extern uint64 sys_open(void);
extern uint64 sys_pipe(void);
extern uint64 sys_read(void);
extern uint64 sys_sbrk(void);
extern uint64 sys_sleep(void);
extern uint64 sys_unlink(void);
extern uint64 sys_wait(void);
extern uint64 sys_write(void);
extern uint64 sys_uptime(void);

static uint64 (*syscalls[])(void) = {
[SYS_fork]    sys_fork,
[SYS_exit]    sys_exit,
[SYS_wait]    sys_wait,
[SYS_pipe]    sys_pipe,
[SYS_read]    sys_read,
[SYS_kill]    sys_kill,
[SYS_exec]    sys_exec,
[SYS_fstat]   sys_fstat,
[SYS_chdir]   sys_chdir,
[SYS_dup]     sys_dup,
[SYS_getpid]  sys_getpid,
[SYS_sbrk]    sys_sbrk,
[SYS_sleep]   sys_sleep,
[SYS_uptime]  sys_uptime,
[SYS_open]    sys_open,
[SYS_write]   sys_write,
[SYS_mknod]   sys_mknod,
[SYS_unlink]  sys_unlink,
[SYS_link]    sys_link,
[SYS_mkdir]   sys_mkdir,
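[SYS_close]   sys_close,
};

void
syscall(void)
{
  int num;
  struct proc *p = myproc();

  // The system call number was placed in a7 before ecall.
  num = p->trapframe->a7;
  if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
    // The return value goes back to user space in a0.
    p->trapframe->a0 = syscalls[num]();
  } else {
    printf("%d %s: unknown sys call %d\n",
            p->pid, p->name, num);
    p->trapframe->a0 = -1;
  }
}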
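
The dispatch technique itself, an array of function pointers indexed by the call number with a bounds and null check, can be tried outside the kernel. The sketch below is a standalone toy: the two handlers, their return values, the numbers 1 and 2, and the dispatch function are all made up for illustration and do not match the real xv6 numbering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical call numbers; slot 0 is deliberately left empty. */
enum { SYS_getpid = 1, SYS_uptime = 2 };

/* Hypothetical handlers that return fixed values. */
static uint64_t sys_getpid(void) { return 42; }
static uint64_t sys_uptime(void) { return 1000; }

/* Table indexed by call number; unlisted slots are null pointers. */
static uint64_t (*syscalls[])(void) = {
  [SYS_getpid] = sys_getpid,
  [SYS_uptime] = sys_uptime,
};

/* Look up and call the handler, or return -1 for an unknown number. */
static uint64_t dispatch(int num)
{
  if (num > 0 && (size_t)num < NELEM(syscalls) && syscalls[num])
    return syscalls[num]();
  return (uint64_t)-1;
}
```

Checking num > 0 first keeps negative numbers out of the index, and the final null check covers gaps in the table, so an unknown number yields -1 and never reaches a call through an invalid pointer.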

Tags: Operating System

Posted on Thu, 04 Nov 2021 13:33:31 -0400 by Lucky_PHP_MAN