An analysis of the Linux kernel's do_sea()

1. Purpose of do_sea()

This function handles synchronous external aborts (SEA), that is, hardware errors that are reported synchronously. It is the handler for the following four fault types:
"synchronous external abort": the kernel's main entry point for SEA handling.
"level x (translation table walk)": a synchronous external abort taken during a level x translation table walk.
"synchronous parity or ECC error": RAS error handling, still to be supported.
"level x synchronous parity error (translation table walk)": RAS error handling, still to be supported.

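To make the four cases concrete, here is a minimal user-space sketch that sorts an ESR value into them by its fault status code (the low six bits). The encodings used (0x10, 0x14 to 0x17, 0x18, 0x1c to 0x1f) are the values from the Arm architecture as I understand them; the enum, the macro and the function name are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ESR_FSC_MASK 0x3FU /* low six bits of the ESR: fault status code */

/* Hypothetical labels for the four do_sea() cases. */
enum sea_kind {
    SEA_NONE = 0,     /* not one of the four cases */
    SEA_EXTERNAL,     /* "synchronous external abort" */
    SEA_EXTERNAL_TTW, /* "level x (translation table walk)" */
    SEA_PARITY,       /* "synchronous parity or ECC error" */
    SEA_PARITY_TTW    /* "level x synchronous parity error (translation table walk)" */
};

/* Classify an ESR value; *level receives x for the table-walk cases, else -1. */
static enum sea_kind classify_sea(uint32_t esr, int *level)
{
    uint32_t fsc = esr & ESR_FSC_MASK;

    assert(level != NULL);
    *level = -1;

    if (fsc == 0x10)
        return SEA_EXTERNAL;
    if (fsc >= 0x14 && fsc <= 0x17) {
        *level = (int)(fsc & 0x3); /* low two bits carry the walk level */
        return SEA_EXTERNAL_TTW;
    }
    if (fsc == 0x18)
        return SEA_PARITY;
    if (fsc >= 0x1c && fsc <= 0x1f) {
        *level = (int)(fsc & 0x3);
        return SEA_PARITY_TTW;
    }
    return SEA_NONE;
}
```

In the kernel itself this mapping is done by a table lookup (esr_to_fault_info()) rather than by comparisons; the sketch only shows which status codes end up in do_sea().
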
2. How do_sea() works

The function code is as follows:

static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
{
        const struct fault_info *inf;
        void __user *siaddr;

        inf = esr_to_fault_info(esr);

        /*
         * Return value ignored as we rely on signal merging.
         * Future patches will make this more robust.
         */
        apei_claim_sea(regs);

        if (esr & ESR_ELx_FnV)
                siaddr = NULL;
        else
                siaddr  = (void __user *)addr;
        arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);

        return 0;
}

apei_claim_sea() is used to parse the APEI (ACPI Platform Error Interface) tables and retrieve the error data passed by the firmware.
arm64_notify_die() is the key processing function, and it distinguishes between user mode and kernel mode. If the abort came from user mode, the signal inf->sig is sent to the process, which kills it. If it came from kernel mode, die() is called. die() is the oops path: normally an oops only prints some diagnostic information such as the call stack, and the kernel keeps running if it can. If panic_on_oops=1 is configured, however, the oops turns into a panic.
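
The decision logic described above can be sketched as a small user-space model. It is not kernel code: the enum and both function names are hypothetical, and the bit position of ESR_ELx_FnV (bit 10) is stated here as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ESR_FNV (UINT32_C(1) << 10) /* assumed position of ESR_ELx_FnV */

/* Hypothetical names for the three possible outcomes. */
enum outcome { SEND_SIGNAL, OOPS_AND_CONTINUE, PANIC };

/* Address reported with the signal: NULL when the fault address is not valid. */
static void *pick_siaddr(uintptr_t addr, uint32_t esr)
{
    return (esr & ESR_FNV) ? NULL : (void *)addr;
}

/* Model of the user/kernel split and of the panic_on_oops escalation. */
static enum outcome notify_die_model(int user_mode, int panic_on_oops)
{
    assert(panic_on_oops == 0 || panic_on_oops == 1);

    if (user_mode)
        return SEND_SIGNAL;        /* the faulting process gets inf->sig */
    return panic_on_oops ? PANIC   /* die() escalates to panic() */
                         : OOPS_AND_CONTINUE;
}
```

For example, notify_die_model(0, 1) models a SEA taken in kernel mode on a system with panic_on_oops=1 and yields PANIC.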

As an aside, SEA is designed with the expectation that a SEA triggered in kernel mode leads to a panic rather than letting the kernel keep running.

3. The panic process

So once the system panics, what state are the CPUs in?
Judging from the code, the panic handler first disables local interrupts and then calls smp_send_stop() or crash_smp_send_stop() to stop the other cores. If a timed reboot is configured, emergency_restart() is called to reboot the machine. Otherwise local interrupts are enabled again, with care taken that the important information already printed is not lost.
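
The order of these steps can be written down as a small model that records what the panic path would do. This is a sketch under assumptions: the step names, MAX_STEPS and panic_model() are hypothetical, and using a non-zero panic_timeout to stand for "a timed reboot is configured" is my reading, not something the text above states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step labels, one per action described in the text. */
enum panic_step {
    STEP_LOCAL_IRQ_DISABLE,
    STEP_CRASH_SMP_SEND_STOP,
    STEP_SMP_SEND_STOP,
    STEP_EMERGENCY_RESTART,
    STEP_LOCAL_IRQ_ENABLE,
    STEP_SPIN_FOREVER
};

#define MAX_STEPS 4

/*
 * Record the steps taken into out[] and return how many there are.
 * use_crash_stop and panic_timeout are model inputs, not kernel arguments.
 */
static size_t panic_model(int use_crash_stop, int panic_timeout,
                          enum panic_step out[MAX_STEPS])
{
    size_t n = 0;

    out[n++] = STEP_LOCAL_IRQ_DISABLE;
    out[n++] = use_crash_stop ? STEP_CRASH_SMP_SEND_STOP
                              : STEP_SMP_SEND_STOP;

    if (panic_timeout != 0) {
        out[n++] = STEP_EMERGENCY_RESTART; /* the machine reboots here */
        return n;
    }

    out[n++] = STEP_LOCAL_IRQ_ENABLE;
    out[n++] = STEP_SPIN_FOREVER;          /* the panicking CPU never returns */
    assert(n <= MAX_STEPS);
    return n;
}
```

Either way, the other cores have already been stopped by the second step before the panicking CPU decides between rebooting and spinning.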

See panic() in kernel/panic.c.


smp_send_stop() is implemented in arch/arm64/kernel/smp.c. Its main job is to call smp_cross_call(&mask, IPI_CPU_STOP) to stop the other cores.
smp_cross_call() relies on the IPI (inter-processor interrupt) mechanism, which is described in the article "linux smp Murder Caused by a bug".

In short, an IPI sends a message to another core and lets that core handle it. The handler is handle_IPI().
An IPI_CPU_STOP event is handled by marking the core as offline and then masking DAIF so that the core sits parked in an idle loop, as the following function shows:

static void local_cpu_stop(void)
{
        set_cpu_online(smp_processor_id(), false);

        local_daif_mask();
        sdei_mask_local_cpu();
        cpu_park_loop();
}
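
The stop sequence can be modelled in user space with a bitmask of online CPUs. The names below (online_mask, the model_ functions, NR_CPUS as 8) are hypothetical stand-ins; a real IPI is delivered asynchronously by the interrupt controller, while this sketch simply calls the handler for each target in a loop.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8U /* arbitrary size for the model */

static uint32_t online_mask; /* bit n set means CPU n is online */

/* What a receiving CPU does on the stop request: mark itself offline. */
static void model_local_cpu_stop(unsigned int cpu)
{
    online_mask &= ~(UINT32_C(1) << cpu);
    /* The real handler then masks DAIF and parks the CPU; not modelled. */
}

/* Deliver the stop request to every CPU whose bit is set in mask. */
static void model_cross_call_stop(uint32_t mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (UINT32_C(1) << cpu))
            model_local_cpu_stop(cpu);
}

/* Stop every online CPU except the caller; returns the mask targeted. */
static uint32_t model_smp_send_stop(unsigned int self)
{
    assert(self < NR_CPUS);

    uint32_t mask = online_mask & ~(UINT32_C(1) << self);

    model_cross_call_stop(mask);
    return mask;
}
```

With CPUs 0 to 3 online and CPU 2 calling model_smp_send_stop(2), the targeted mask covers CPUs 0, 1 and 3, and afterwards only CPU 2 remains in online_mask.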

The implementation of local_daif_mask() is as follows:

/* mask/save/unmask/restore all exceptions, including interrupts. */
static inline void local_daif_mask(void)
{
        WARN_ON(system_has_prio_mask_debugging() &&
                (read_sysreg_s(SYS_ICC_PMR_EL1) == (GIC_PRIO_IRQOFF |
                                                    GIC_PRIO_PSR_I_SET)));

        asm volatile(
                "msr    daifset, #0xf           // local_daif_mask\n"
                :
                :
                : "memory");

        /* Don't really care for a dsb here, we don't intend to enable IRQs */
        if (system_uses_irq_prio_masking())
                gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);

        trace_hardirqs_off();
}
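
To see what the immediate #0xf in "msr daifset" means, the sketch below models the instruction on a plain integer. The four bits of the immediate select D, A, I and F, which sit in bits 9 to 6 of the DAIF value. The macro names and the daifset() helper are hypothetical and only illustrate the bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Positions of the four mask bits in the DAIF value. */
#define PSTATE_F (UINT32_C(1) << 6) /* FIQ mask */
#define PSTATE_I (UINT32_C(1) << 7) /* IRQ mask */
#define PSTATE_A (UINT32_C(1) << 8) /* SError mask */
#define PSTATE_D (UINT32_C(1) << 9) /* debug exception mask */

/* Model of "msr daifset, #imm4": set the selected mask bits, keep the rest. */
static uint32_t daifset(uint32_t daif, unsigned int imm4)
{
    assert(imm4 <= 0xF);
    return daif | ((uint32_t)imm4 << 6);
}
```

daifset(0, 0xf) sets all four bits, which is why local_daif_mask() leaves the core with debug exceptions, SError, IRQ and FIQ all masked.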

Note: even after all DAIF bits are masked, interrupts routed to EL3 can still be taken.



Posted on Thu, 06 Feb 2020 05:32:15 -0500 by sectachrome