[Embedded] The "bottom half" mechanism of Linux interrupt handling


Interrupts are divided into hardware interrupts and software interrupts. Interrupt handling follows two main principles: handlers should not nest, and they should finish as quickly as possible. Linux therefore splits interrupt handling into a "top half" and a "bottom half".

1, The interrupt "bottom half" mechanism

Interrupt service routines generally run with interrupt requests disabled, to avoid nesting, which would complicate interrupt control. However, interrupts are random events that can arrive at any time. If interrupts stay disabled for too long, the CPU cannot respond to other interrupt requests in time, and interrupts may be lost.

Therefore, the goal of the Linux kernel is to handle each interrupt request as quickly as possible and to defer as much of the processing as it can. For example, suppose a block of data has arrived over the network cable. When the interrupt controller receives the interrupt request signal, the Linux kernel simply marks the arrival of the data and then returns the processor to its previous running state. The rest of the processing is carried out later (for example, moving the data into a buffer where the receiving process can find it).

Therefore, the kernel divides interrupt processing into two parts: the top half and the bottom half. The top half (i.e. interrupt service program) is executed immediately by the kernel, while the bottom half (i.e. some kernel functions) is reserved for later processing.

First, a fast "top half" handles the request from the hardware; it must finish before a new interrupt arrives. It usually does little beyond moving data between the device and a memory buffer (somewhat more if the device uses DMA) and checking that the hardware is in a sane state.
Second, the "bottom half" runs with interrupt requests enabled, whereas the top half runs with interrupts disabled. This is the main difference between the two.

When does the kernel execute the bottom half, and how does it organize the bottom halves?

This is the bottom-half implementation mechanism we want to discuss. It has been improved continuously as the kernel evolved. In early kernels, the mechanism was called bottom half (hereinafter referred to as BH). However, this BH mechanism has two disadvantages:

  • At any time, only one CPU of the system can execute BH code to prevent two or more CPUs from interfering with each other by executing BH functions at the same time. Therefore, the execution of BH code is strictly "serialized".
  • Nesting of BH functions is not allowed.

These two disadvantages are insignificant on a single-CPU system, but they are serious on an SMP system, because the strict serialization of the BH mechanism makes no use of the multiple CPUs. Therefore, the mechanism was redeveloped and improved from version 2.4 onward. The goal of the improvement is to let bottom halves run in parallel on multiprocessors and to make driver development easier. The following mainly introduces the bottom-half mechanisms found in 2.6 and later kernels:

  • Soft interrupt request (softirq) mechanism
  • Tasklet mechanism
  • Work queue mechanism
  • threaded_irq mechanism

The comparison of the above mechanisms is shown in the figure below

2, Soft interrupt request (softirq) mechanism

The softirq mechanism of Linux is closely tied to SMP. Its whole design and implementation follow one idea: "who marks, who runs". That is, the CPU that triggers a soft interrupt is responsible for executing it, and each CPU has its own soft interrupt trigger and control mechanism. This design lets the softirq mechanism make full use of the performance and characteristics of an SMP system.

2.1 soft interrupt descriptor

Linux defines the data structure softirq_action in the include/linux/interrupt.h header file to describe a soft interrupt request, as follows:

/* PLEASE, avoid to allocate new softirqs, if you need not _really_ high
   frequency threaded job scheduling. For almost all the purposes
   tasklets are more than enough. F.e. all serial device BHs et
   al. should be converted to tasklets, not to softirqs.
 */

enum
{
	HI_SOFTIRQ=0,	//High-priority tasklets
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,	//Used to transmit network data
	NET_RX_SOFTIRQ, //Used to receive network data
	BLOCK_SOFTIRQ,
	IRQ_POLL_SOFTIRQ,
	TASKLET_SOFTIRQ, //Used to implement ordinary tasklets
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the
			    numbering. Sigh! */
	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */

	NR_SOFTIRQS		//Number of soft interrupts; equal to 10 here
};

/* map softirq index to softirq name. update 'softirq_to_name' in
 * kernel/softirq.c when adding a new softirq.
 */
extern const char * const softirq_to_name[NR_SOFTIRQS];

/* softirq mask and active fields moved to irq_cpustat_t in
 * asm/hardirq.h to get better cache usage.  KAO
 */

struct softirq_action
{
	void	(*action)(struct softirq_action *);	//The function pointer action points to the service function of the soft interrupt request
};

asmlinkage void do_softirq(void);
asmlinkage void __do_softirq(void);

The function pointer action points to the service function of the soft interrupt request. Based on the soft interrupt descriptor above, Linux defines a global softirq_vec array in the kernel/softirq.c file:

static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;

Here, the system defines a total of 10 soft interrupt request descriptors. The descriptor corresponding to soft interrupt vector i (0 ≤ i ≤ 9) is softirq_vec[i]. This array is global to the system, that is, it is shared by all CPUs. Note that although each CPU has its own trigger and control mechanism and only executes the soft interrupt requests that it triggered itself, the soft interrupt service routines executed by every CPU are the same: they are the service functions registered in the softirq_vec[] array. The relevant code in kernel/softirq.c is as follows:

/*
   - No shared variables, all the data are CPU local.
   - If a softirq needs serialization, let it serialize itself
     by its own spinlocks.
   - Even if softirq is serialized, only local cpu is marked for
     execution. Hence, we get something sort of weak cpu binding.
     Though it is still not clear, will it result in better locality
     or will not.

   Examples:
   - NET RX softirq. It is multithreaded and does not require
     any global serialization.
   - NET TX softirq. It kicks software netdevice queues, hence
     it is logically serialized per device, but this serialization
     is invisible to common code.
   - Tasklets: serialized wrt itself.
 */

#ifndef __ARCH_IRQ_STAT
DEFINE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat);	//Older kernels: irq_cpustat_t irq_stat[NR_CPUS] ____cacheline_aligned;
EXPORT_PER_CPU_SYMBOL(irq_stat);	//Older kernels: EXPORT_SYMBOL(irq_stat)
#endif

static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;

DEFINE_PER_CPU(struct task_struct *, ksoftirqd);

const char * const softirq_to_name[NR_SOFTIRQS] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
	"TASKLET", "SCHED", "HRTIMER", "RCU"
};

2.2 soft interrupt trigger mechanism

To realize the idea of "who triggers, who executes", each CPU must have its own trigger and control variables. For this purpose, Linux defines the data structure irq_cpustat_t in the Linux-5.4\arch\arm\include\asm\hardirq.h header file to describe the interrupt statistics of a CPU, including the member variables used to trigger and control soft interrupts. The data structure irq_cpustat_t is defined as follows:

IPI: inter-processor interrupts

#define NR_IPI	7		//Number of inter-processor interrupt (IPI) types

typedef struct {
	unsigned int __softirq_pending;	//Bitmask of the soft interrupts pending on this CPU
	unsigned int ipi_irqs[NR_IPI];
} ____cacheline_aligned irq_cpustat_t;

#define __inc_irq_stat(cpu, member)	__IRQ_STAT(cpu, member)++
#define __get_irq_stat(cpu, member)	__IRQ_STAT(cpu, member)

//Older kernels (array-based definition):
#define __IRQ_STAT(cpu, member) (irq_stat[cpu].member)
//Linux-5.4\include\linux\irq_cpustat.h file
DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat);	/* defined in asm/hardirq.h */
#define __IRQ_STAT(cpu, member)	(per_cpu(irq_stat.member, cpu))

//Linux-5.4\include\linux\percpu-defs.h file
/*
 * Normal declaration and definition macros.
 */
#define DECLARE_PER_CPU_SECTION(type, name, sec)			\
	extern __PCPU_ATTRS(sec) __typeof__(type) name

#define DECLARE_PER_CPU_ALIGNED(type, name)				\
	DECLARE_PER_CPU_SECTION(type, name, PER_CPU_ALIGNED_SECTION)	\
	____cacheline_aligned
Conceptually (and literally in older kernels), this amounts to the following definition:

 irq_cpustat_t irq_stat[NR_CPUS] ____cacheline_aligned;

Of which:

  1. NR_CPUS is the maximum number of CPUs in the system.
  2. In this way, each CPU operates only on its own interrupt statistics structure. A CPU with number id (0 ≤ id ≤ NR_CPUS-1) only operates on its own structure irq_stat[id], so the CPUs do not affect each other.

The open_softirq() function registers the handler for a soft interrupt, and the raise_softirq() function triggers a soft interrupt.

//To use the softirq mechanism, call open_softirq to register the soft interrupt handler so that the soft interrupt number maps to its handler. This function is defined in the kernel/softirq.c file.
//It stores the handler's function pointer in the corresponding softirq_vec entry.
//nr is the soft interrupt number and action is the handler function
void open_softirq(int nr, void (*action)(struct softirq_action *))
{
	softirq_vec[nr].action = action;
}

//After the interrupt handler has finished the urgent hardware work, it should call raise_softirq to trigger the soft interrupt, which then performs the time-consuming bottom-half work.
//If the caller is not in interrupt context, the ksoftirqd thread is woken up immediately
void raise_softirq(unsigned int nr)		//Trigger a soft interrupt; nr is the soft interrupt number
{
	unsigned long flags;

	local_irq_save(flags);		//Save the interrupt state and disable local interrupts
	raise_softirq_irqoff(nr);	//Set the pending bit of soft interrupt nr on this CPU
	local_irq_restore(flags);	//Restore the saved interrupt state
}

raise_softirq_irqoff function:

/*
 * This function must run with irqs disabled!
 */
inline void raise_softirq_irqoff(unsigned int nr)
{
	__raise_softirq_irqoff(nr);

	/*
	 * If we're in an interrupt or softirq, we're done
	 * (this also catches softirq-disabled code). We will
	 * actually run the softirq once we return from
	 * the irq or softirq.
	 *
	 * Otherwise we wake up ksoftirqd to make sure we
	 * schedule the softirq soon.
	 */
	if (!in_interrupt())
		wakeup_softirqd();
}

in_interrupt() determines whether the code is running in interrupt context or with soft interrupts disabled. If neither is the case, wakeup_softirqd() must be called to wake up the ksoftirqd kernel thread on this CPU.
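The registration and triggering described in this section can be modelled in user space. The following is only a sketch under stated assumptions, not kernel code: every `sim_` name is hypothetical, the per-CPU pending word is a plain array indexed by a CPU number passed in by the caller, and there is no real interrupt masking. It shows one shared handler table (like softirq_vec[]) combined with one pending bitmask per CPU, so that only the CPU that raised a soft interrupt runs it.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NR_CPUS     2   /* hypothetical CPU count for this model */
#define SIM_NR_SOFTIRQS 10  /* ten vectors, as in the article */

typedef void (*sim_action_t)(int cpu);

/* One handler table shared by all CPUs, like softirq_vec[]. */
static sim_action_t sim_softirq_vec[SIM_NR_SOFTIRQS];

/* One pending bitmask per CPU, like __softirq_pending in irq_cpustat_t. */
static uint32_t sim_pending[SIM_NR_CPUS];

/* Counts how often the demo handler ran on each CPU. */
static int sim_hits[SIM_NR_CPUS];

/* Register a handler for vector nr (models open_softirq). */
void sim_open_softirq(int nr, sim_action_t action)
{
    sim_softirq_vec[nr] = action;
}

/* Mark vector nr pending on one CPU only (models raise_softirq). */
void sim_raise_softirq(int cpu, int nr)
{
    sim_pending[cpu] |= (uint32_t)1 << nr;
}

/* Run every pending handler of one CPU, lowest vector number first. */
void sim_run_pending(int cpu)
{
    uint32_t pending = sim_pending[cpu];

    sim_pending[cpu] = 0;
    for (int nr = 0; nr < SIM_NR_SOFTIRQS; nr++) {
        if ((pending & ((uint32_t)1 << nr)) && sim_softirq_vec[nr])
            sim_softirq_vec[nr](cpu);
    }
}

static void sim_count_hit(int cpu)
{
    sim_hits[cpu]++;
}

/* Returns 1 if only the raising CPU (CPU 1) executed the handler. */
int sim_demo_who_raises_runs(void)
{
    sim_open_softirq(6, sim_count_hit);  /* 6 is TASKLET_SOFTIRQ's index */
    sim_raise_softirq(1, 6);
    sim_run_pending(0);                  /* nothing pending on CPU 0 */
    sim_run_pending(1);
    return sim_hits[0] == 0 && sim_hits[1] == 1;
}
```

Calling `sim_demo_who_raises_runs()` from a test program exercises the model: the handler table is global, but the pending state is per CPU.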

2.3 initialize soft interrupt

//Initialize soft interrupts (softirq_init)
//start_kernel() calls softirq_init() during system initialization to set up the two soft interrupts HI_SOFTIRQ and TASKLET_SOFTIRQ
//Linux-5.4\init\main.c -> softirq_init()
void __init softirq_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		per_cpu(tasklet_vec, cpu).tail =
			&per_cpu(tasklet_vec, cpu).head;
		per_cpu(tasklet_hi_vec, cpu).tail =
			&per_cpu(tasklet_hi_vec, cpu).head;
	}

	open_softirq(TASKLET_SOFTIRQ, tasklet_action);	//Set soft interrupt service function
	open_softirq(HI_SOFTIRQ, tasklet_hi_action);	//Set soft interrupt service function
}

2.4 execution and handling of soft interrupt service

2.4.1 invoking __do_softirq on return from an interrupt

The most common place to trigger a soft interrupt is an interrupt handler. After the hardware interrupt has been handled, the interrupt exit path calls irq_exit(), and the processing of soft interrupts is triggered there.

Interrupt handling model:

Hard interrupt processing:

fastcall unsigned int do_IRQ(struct pt_regs *regs)
{
        irq_enter();

        //handle external interrupt (ISR)

        irq_exit();

        return 1;
}

The irq_exit() function is executed after hard interrupt processing:

/*
 * Exit an interrupt context. Process softirqs if needed and possible:
 */
void irq_exit(void)
{
	/* Accounting and preempt_count updates omitted from this excerpt */
	if (!in_interrupt() && local_softirq_pending())	//If we are no longer in interrupt context and a soft interrupt is pending, call invoke_softirq(), which either runs __do_softirq() or wakes up the soft interrupt thread
		invoke_softirq();

	trace_hardirq_exit(); /* must be last! */
}

irq_exit() calls the invoke_softirq() function:

static inline void invoke_softirq(void)
{
	if (ksoftirqd_running(local_softirq_pending()))
		return;

	if (!force_irqthreads) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
		/*
		 * We can safely execute softirq on the current stack if
		 * it is the irq stack, because it should be near empty
		 * at this stage.
		 */
		__do_softirq();	/* Soft interrupt processing function */
#else
		/*
		 * Otherwise, irq_exit() is called on the task stack that can
		 * be potentially deep already. So call softirq in its own stack
		 * to prevent from any overrun.
		 */
		do_softirq_own_stack();	//Ends up in __do_softirq()
#endif
	} else {
		wakeup_softirqd();/* If soft interrupts are forced to run in threads, wake up the soft interrupt thread ksoftirqd instead */
	}
}

The do_softirq() function is responsible for executing the soft interrupt service functions registered in softirq_vec[i]. Each CPU services its soft interrupts by executing this function. Because soft interrupt service routines must not nest on the same CPU, do_softirq() first checks whether the current CPU is already in interrupt service; if so, it returns immediately.

For example, suppose CPU0 is executing do_softirq() when a high-priority hardware interrupt arrives, so CPU0 switches to the corresponding interrupt service routine. All interrupt service routines are ultimately entered through do_IRQ(), which runs the ISRs in the interrupt's service queue in turn. Assume that the ISR of this high-priority interrupt triggers a soft interrupt. Before exiting, do_IRQ() sees the pending soft interrupt request and calls do_softirq() to service it, so do_softirq() is re-entered on CPU0. This time, however, do_softirq() immediately finds that CPU0 is already servicing soft interrupts and returns at once. CPU0 then resumes the original do_softirq() invocation, which also services the soft interrupt raised by the ISR of the high-priority interrupt. As this shows, do_softirq() executes serially on any one CPU.
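The re-entry scenario in this paragraph can be replayed with a small user-space model. This is a sketch under assumptions, not the kernel's implementation: all `guard_` names are hypothetical, a plain flag stands in for the in_interrupt() check, and the "hardware interrupt" is simulated by the handler of vector 0 raising vector 1 and calling `guard_do_softirq()` again.

```c
#include <assert.h>

static int guard_in_softirq;    /* stands in for the in_interrupt() check */
static unsigned guard_pending;  /* pending bitmask of the one modelled CPU */
static int guard_depth;         /* current handler nesting depth */
static int guard_depth_max;     /* deepest nesting observed */
static int guard_handled;       /* number of handler invocations */

void guard_do_softirq(void);

/* Handler body. For vector 0 it simulates a hardware interrupt whose ISR
 * raises vector 1 and whose exit path calls guard_do_softirq() again. */
static void guard_handler(int nr)
{
    guard_depth++;
    if (guard_depth > guard_depth_max)
        guard_depth_max = guard_depth;
    guard_handled++;
    if (nr == 0) {
        guard_pending |= 1u << 1;  /* nested ISR raises another softirq */
        guard_do_softirq();        /* re-entry: returns immediately */
    }
    guard_depth--;
}

void guard_do_softirq(void)
{
    if (guard_in_softirq)          /* already servicing on this CPU */
        return;
    guard_in_softirq = 1;
    while (guard_pending) {
        unsigned pending = guard_pending;

        guard_pending = 0;
        for (int nr = 0; nr < 2; nr++)
            if (pending & (1u << nr))
                guard_handler(nr);
    }
    guard_in_softirq = 0;
}

/* Returns 1 if both vectors ran and no handler ever nested in another. */
int guard_demo(void)
{
    guard_pending = 1u << 0;
    guard_do_softirq();
    return guard_handled == 2 && guard_depth_max == 1;
}
```

In this model the nested call does nothing, and the outer invocation picks up vector 1 on its next pass, which mirrors the serial execution described in the text.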

do_softirq function:

asmlinkage __visible void do_softirq(void)
{
	__u32 pending;
	unsigned long flags;

	/* If we are already in interrupt processing, return directly */
	if (in_interrupt())
		return;

	/* Save the interrupt state and disable local interrupts */
	local_irq_save(flags);

	/* Get the bitmask of the soft interrupts pending on this CPU */
	pending = local_softirq_pending();

	/* If any are pending and ksoftirqd is not already handling them, run them on the softirq stack */
	if (pending && !ksoftirqd_running(pending))
		do_softirq_own_stack();		//Ends up in __do_softirq()

	local_irq_restore(flags);
}


__do_softirq function:

asmlinkage __visible void __softirq_entry __do_softirq(void)
{
	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;	/* Deadline that keeps soft interrupt processing from running too long */
	unsigned long old_flags = current->flags;		/* Save the flags of the current process */
	int max_restart = MAX_SOFTIRQ_RESTART;			/* Maximum number of passes over the pending soft interrupts: 10 */
	struct softirq_action *h;						/* Pointer to a soft interrupt descriptor */
	bool in_hardirq;
	__u32 pending;
	int softirq_bit;

	/*
	 * Mask out PF_MEMALLOC as the current task context is borrowed for the
	 * softirq. A softirq handled, such as network RX, might set PF_MEMALLOC
	 * again if the socket is related to swapping.
	 */
	current->flags &= ~PF_MEMALLOC;

	pending = local_softirq_pending();			/* Get the value of this CPU's __softirq_pending variable */
	account_irq_enter_time(current);			/* Account the time the current process spends in soft interrupt processing */

	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);	/* Increment the soft interrupt counter in preempt_count, which also disables preemption */
	in_hardirq = lockdep_softirq_start();

restart:										/* Entry point of each pass (at most 10); every pass runs all soft interrupts pending at that time */
	/* Clear this CPU's __softirq_pending; its old value is saved in the pending variable */
	/* This ensures that newly raised soft interrupts are executed in the next pass */
	/* Reset the pending bitmask before enabling irqs */
	set_softirq_pending(0);

	local_irq_enable();		/* Enable interrupts */

	h = softirq_vec;		/* h points to the start of the soft interrupt array */

	while ((softirq_bit = ffs(pending))) {/* Find the lowest-numbered (highest-priority) pending soft interrupt each time */
		unsigned int vec_nr;
		int prev_count;

		h += softirq_bit - 1;		/* Get the address of this soft interrupt descriptor */

		vec_nr = h - softirq_vec;	/* Subtract the base address of the descriptor array to obtain the soft interrupt number */
		prev_count = preempt_count();	/* Save the value of preempt_count */

		kstat_incr_softirqs_this_cpu(vec_nr);	/* Increase the soft interrupt count in the statistics */

		trace_softirq_entry(vec_nr);
		h->action(h);							/* Execute the handler of the soft interrupt */
		trace_softirq_exit(vec_nr);
		/* If the handler returned with a different preempt_count than it was entered with, report it and restore the saved value so that a leaked count cannot block scheduling */
		if (unlikely(prev_count != preempt_count())) {
			pr_err("huh, entered softirq %u %s %p with preempt_count %08x, exited with %08x?\n",
			       vec_nr, softirq_to_name[vec_nr], h->action,
			       prev_count, preempt_count());
			preempt_count_set(prev_count);
		}
		h++;/* Point h at the next descriptor; it need not be pending, this only keeps h in step with the shift of pending below */
		pending >>= softirq_bit;
	}

	if (__this_cpu_read(ksoftirqd) == current)
		rcu_softirq_qs();
	local_irq_disable();		/* Disable interrupts */
	/* After the loop ends, read this CPU's __softirq_pending again to check whether new soft interrupts were raised meanwhile */
	pending = local_softirq_pending();
	/* If soft interrupts are still pending, the deadline has not passed, no reschedule is needed and fewer than 10 passes were made, run another pass */
	if (pending) {
		if (time_before(jiffies, end) && !need_resched() &&
		    --max_restart)
			goto restart;
		/* Soft interrupts are still pending, but the time budget or the pass count is used up: wake the soft interrupt thread ksoftirqd to handle them. This is only a notification, because scheduling is not possible in interrupt context */
		wakeup_softirqd();
	}

	lockdep_softirq_end(in_hardirq);
	account_irq_exit_time(current);/* Account the time the current process spent in soft interrupt processing */
	/* Decrement the soft interrupt counter in preempt_count */
	__local_bh_enable(SOFTIRQ_OFFSET);
	WARN_ON_ONCE(in_interrupt());
	/* Restore the process flags */
	current_restore_flags(old_flags, PF_MEMALLOC);
}
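The core of this function is the scan of the pending bitmask with ffs() plus the bounded number of restart passes. The sketch below isolates just that control flow in user space. It is an illustration under assumptions: the `loop_` names are hypothetical, a global word replaces the per-CPU pending mask, the time and need_resched() checks are left out, and "waking ksoftirqd" is reduced to setting a flag. It relies on the POSIX ffs() function from <strings.h>.

```c
#include <assert.h>
#include <strings.h>   /* ffs(), a POSIX function */

#define LOOP_MAX_RESTART 10   /* mirrors MAX_SOFTIRQ_RESTART */

static unsigned loop_pending;  /* models this CPU's pending bitmask */
static int loop_order[64];     /* vector numbers in execution order */
static int loop_count;         /* number of handler invocations */
static int loop_deferred;      /* set when work is left for "ksoftirqd" */
static int loop_storm;         /* nonzero: vector 3 re-raises itself */

static void loop_action(int vec_nr)
{
    if (loop_count < 64)
        loop_order[loop_count] = vec_nr;
    loop_count++;
    if (loop_storm && vec_nr == 3)
        loop_pending |= 1u << 3;   /* handler raises itself again */
}

void loop_do_softirq(void)
{
    int max_restart = LOOP_MAX_RESTART;
    unsigned pending = loop_pending;
    int bit, vec;

restart:
    loop_pending = 0;              /* snapshot taken; new raises accumulate */
    vec = 0;
    while ((bit = ffs((int)pending))) {
        vec += bit - 1;            /* index of the lowest set bit */
        loop_action(vec);
        vec++;                     /* keep vec in step with the shift */
        pending >>= bit;
    }

    pending = loop_pending;        /* anything raised during this pass? */
    if (pending) {
        if (--max_restart)
            goto restart;
        loop_deferred = 1;         /* budget used up: hand over the rest */
    }
}

/* Vectors 0, 3 and 6 pending: they run in ascending order, one pass. */
int loop_demo_order(void)
{
    loop_pending = (1u << 6) | (1u << 0) | (1u << 3);
    loop_count = 0;
    loop_storm = 0;
    loop_deferred = 0;
    loop_do_softirq();
    return loop_count == 3 && loop_order[0] == 0 &&
           loop_order[1] == 3 && loop_order[2] == 6 && !loop_deferred;
}

/* A self-raising vector is cut off after LOOP_MAX_RESTART passes. */
int loop_demo_storm(void)
{
    loop_pending = 1u << 3;
    loop_count = 0;
    loop_storm = 1;
    loop_deferred = 0;
    loop_do_softirq();
    return loop_count == LOOP_MAX_RESTART && loop_deferred == 1;
}
```

The second demo shows why the pass limit exists: a handler that keeps re-raising itself is stopped after ten passes and the remaining work is handed over instead of monopolizing the CPU.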

2.4.2 execution in the daemon thread ksoftirqd

Although most softirqs are executed on interrupt exit, in a few cases they are executed in ksoftirqd:

  1. As seen above, when raise_softirq is called outside interrupt context, the ksoftirqd thread is woken up.
  2. Soft interrupts are executed from irq_exit, but soft interrupts are still pending after MAX_SOFTIRQ_RESTART passes. In this case, the ksoftirqd thread is also woken up.

The daemon mechanism exists mainly because, when a large number of soft interrupts are waiting for execution, the kernel would otherwise stay in interrupt context for too long.

static void wakeup_softirqd(void)
{
	/* Interrupts are disabled: no need to stop preemption */
	struct task_struct *tsk = __this_cpu_read(ksoftirqd);//The ksoftirqd thread of this CPU

	if (tsk && tsk->state != TASK_RUNNING)
		wake_up_process(tsk);
}

The thread function registered in softirq_threads is run_ksoftirqd:

//In the kernel\softirq.c file
static struct smp_hotplug_thread softirq_threads = {
	.store			= &ksoftirqd,
	.thread_should_run	= ksoftirqd_should_run,
	.thread_fn		= run_ksoftirqd,
	.thread_comm		= "ksoftirqd/%u",
};

run_ksoftirqd function:

static void run_ksoftirqd(unsigned int cpu)
{
	local_irq_disable();
	if (local_softirq_pending()) {
		/*
		 * We can safely run softirq on inline stack, as we are not deep
		 * in the task stack here.
		 */
		__do_softirq();		//Finally, this function is called.
		local_irq_enable();
		cond_resched();
		return;
	}
	local_irq_enable();
}

3, Tasklet mechanism

The tasklet mechanism is a special kind of soft interrupt.

The original meaning of the word tasklet is "small piece of task". Here it refers to a small piece of executable code, usually in the form of a function. The soft interrupt vectors HI_SOFTIRQ and TASKLET_SOFTIRQ are both implemented by the tasklet mechanism.

To some extent, the tasklet mechanism is the Linux kernel's extension of the BH mechanism. After the softirq mechanism was introduced in the 2.4 kernel, the original BH mechanism was brought into the overall softirq framework through the bridge of the tasklet mechanism. Because of this history, tasklets differ from soft interrupts in the general sense and have the following two notable characteristics:

  1. Unlike general soft interrupts, a given tasklet can run on only one CPU at a time, whereas a general soft interrupt service function (i.e. the action function pointer in the softirq_action structure) can be executed concurrently by several CPUs.
  2. Unlike the BH mechanism, different tasklets can be executed concurrently on several CPUs, whereas BH functions must be executed strictly serially (that is, only one CPU in the system can execute BH functions at any time).

3.1 tasklet descriptor

Linux uses the data structure tasklet_struct to describe a tasklet; each structure represents one independent small task. The data structure is defined in the include/linux/interrupt.h header file, as follows:

/* Tasklets --- multithreaded analogue of BHs.

   Main feature differing them of generic softirqs: tasklet
   is running only on one CPU simultaneously.

   Main feature differing them of BHs: different tasklets
   may be run simultaneously on different CPUs.

   Properties:
   * If tasklet_schedule() is called, then tasklet is guaranteed
     to be executed on some cpu at least once after this.
   * If the tasklet is already scheduled, but its execution is still not
     started, it will be executed only once.
   * If this tasklet is already running on another CPU (or schedule is called
     from tasklet itself), it is rescheduled for later.
   * Tasklet is strictly serialized wrt itself, but not
     wrt another tasklets. If client needs some intertask synchronization,
     he makes it with spinlocks.
 */

struct tasklet_struct
{
	struct tasklet_struct *next;
	unsigned long state;
	atomic_t count;
	void (*func)(unsigned long);
	unsigned long data;
};

Of which:

  • next: pointer to the next tasklet;

  • state: the current state of the tasklet. This is an unsigned long, of which only the two status bits bit[1] and bit[0] are currently used. bit[1] = 1 indicates that the tasklet is currently executing on some CPU; it is only meaningful on SMP systems, where it prevents several CPUs from executing the same tasklet at the same time. bit[0] = 1 indicates that the tasklet has been scheduled for execution.

The enumeration that defines these two status bits is as follows (interrupt.h):

enum
{
	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
};

  • count: an atomic counter holding the disable count of this tasklet.

Note: the tasklet code can be executed only when count is equal to 0, that is, when the tasklet is enabled. If count is non-zero, the tasklet is disabled. Any code that wants to execute a tasklet must first check whether its count member is 0.

  • func: points to the executable tasklet code, in the form of a function.
  • data: the argument passed to func. This is an unsigned long whose meaning is interpreted by func, for example the address of a user-defined data structure.

Linux also defines two helper macros in the interrupt.h header file for defining tasklet_struct variables:

//Define a tasklet_struct structure
#define DECLARE_TASKLET(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }

#define DECLARE_TASKLET_DISABLED(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }

As the source code above shows, a tasklet defined with the DECLARE_TASKLET macro is initially enabled because its count member is 0. In contrast, a tasklet defined with the DECLARE_TASKLET_DISABLED macro is initially disabled because its count is equal to 1.

3.2 changing the status of a tasklet

Here, tasklet status covers two aspects:

  1. the running state represented by the state member;
  2. the enabled / disabled status determined by the count member.

3.2.1 changing the running state of a tasklet

bit[0] of the state member indicates whether the tasklet has been scheduled for execution, and bit[1] indicates whether it is executing on some CPU. Changing a bit in the state variable must be an atomic operation, so the bit operations declared in the include/asm/bitops.h header file are used.
Since bit[1] (i.e. TASKLET_STATE_RUN) is only meaningful on SMP systems, Linux explicitly defines the operations on the TASKLET_STATE_RUN bit in the interrupt.h header file, as follows:

#ifdef CONFIG_SMP
static inline int tasklet_trylock(struct tasklet_struct *t)
{
	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
}
static inline void tasklet_unlock(struct tasklet_struct *t)
{
	smp_mb__before_atomic();
	clear_bit(TASKLET_STATE_RUN, &(t)->state);
}
static inline void tasklet_unlock_wait(struct tasklet_struct *t)
{
	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
}
#else
#define tasklet_trylock(t) 1
#define tasklet_unlock_wait(t) do { } while (0)
#define tasklet_unlock(t) do { } while (0)
#endif

On an SMP system, tasklet_trylock() sets bit[1] of the state member of a tasklet_struct variable to 1 and returns the negation of the bit's previous value. Therefore, if bit[1] was already 1 (another CPU is executing the tasklet code), tasklet_trylock() returns 0, meaning that locking failed. If bit[1] was 0, tasklet_trylock() returns 1, meaning that locking succeeded. On a single-CPU system, tasklet_trylock() always returns 1.
Any code that wants to execute a tasklet must first call tasklet_trylock() to lock it (that is, set the TASKLET_STATE_RUN bit), and may execute the tasklet only if locking succeeds. Recommendation: even if your code runs on a single-CPU system, call tasklet_trylock() before executing a tasklet so that the code stays portable.
On an SMP system, tasklet_unlock_wait() keeps testing the TASKLET_STATE_RUN bit until it becomes 0 (i.e. it waits for the unlock). If CPU0 is executing the code of tasklet A and CPU1 also wants to execute it, CPU1 finds the TASKLET_STATE_RUN bit of tasklet A set to 1, so it can use tasklet_unlock_wait() to wait until tasklet A is unlocked (that is, until the TASKLET_STATE_RUN bit is cleared). On a single-CPU system, this is a no-op.
tasklet_unlock() unlocks a tasklet, that is, it clears the TASKLET_STATE_RUN bit. On a single-CPU system, this is a no-op.
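The test-and-set behaviour of the RUN bit can be tried out in user space with C11 atomics. This is a minimal sketch, not the kernel API: the `demo_` names are hypothetical, `atomic_fetch_or` stands in for test_and_set_bit(), and `atomic_fetch_and` for clear_bit(). Two "CPUs" are simulated by two consecutive calls on one thread.

```c
#include <assert.h>
#include <stdatomic.h>

enum { DEMO_STATE_SCHED = 0, DEMO_STATE_RUN = 1 };  /* bit numbers */

struct demo_tasklet {
    atomic_ulong state;
};

/* Returns 1 if the RUN bit was clear and now belongs to the caller,
 * 0 if somebody else already holds it. */
int demo_trylock(struct demo_tasklet *t)
{
    unsigned long mask = 1ul << DEMO_STATE_RUN;
    unsigned long old = atomic_fetch_or(&t->state, mask);

    return !(old & mask);
}

/* Clears the RUN bit so that the next trylock can succeed. */
void demo_unlock(struct demo_tasklet *t)
{
    atomic_fetch_and(&t->state, ~(1ul << DEMO_STATE_RUN));
}

/* Returns 1 if the lock behaves as the text describes. */
int demo_trylock_scenario(void)
{
    struct demo_tasklet t;
    int first, second, third;

    atomic_init(&t.state, 0);
    first = demo_trylock(&t);    /* "CPU0" takes the lock */
    second = demo_trylock(&t);   /* "CPU1" finds RUN already set */
    demo_unlock(&t);
    third = demo_trylock(&t);    /* free again after the unlock */
    return first == 1 && second == 0 && third == 1;
}
```

The fetch-or returns the old value in the same atomic step that sets the bit, which is what makes the "lock" safe against a second caller racing in between the test and the set.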

3.2.2 enable / disable a tasklet

Enable and disable operations are usually called in pairs. The tasklet_disable() function is as follows (interrupt.h):

static inline void tasklet_disable(struct tasklet_struct *t)
{
	tasklet_disable_nosync(t);
	tasklet_unlock_wait(t);
	smp_mb();
}

The function tasklet_disable_nosync() is also a static inline function; it simply increases the value of the count member variable by 1 with an atomic operation. As follows (interrupt.h):

static inline void tasklet_disable_nosync(struct tasklet_struct *t)
{
	atomic_inc(&t->count);
	smp_mb__after_atomic();
}

The function tasklet_enable() is used to enable a tasklet, as shown below (interrupt.h):

static inline void tasklet_enable(struct tasklet_struct *t)
{
	smp_mb__before_atomic();
	atomic_dec(&t->count);
}
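Because count is a counter rather than a flag, disable and enable calls nest: a tasklet disabled twice must be enabled twice before it may run. The sketch below illustrates this in user space with C11 atomics. The `cnt_` names are hypothetical, and the waiting and memory-barrier aspects of the kernel functions are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>

struct cnt_tasklet {
    atomic_int count;              /* 0 means enabled, > 0 means disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

void cnt_disable(struct cnt_tasklet *t)
{
    atomic_fetch_add(&t->count, 1);
}

void cnt_enable(struct cnt_tasklet *t)
{
    atomic_fetch_sub(&t->count, 1);
}

/* Runs the tasklet body only while it is enabled; returns 1 if it ran. */
int cnt_try_run(struct cnt_tasklet *t)
{
    if (atomic_load(&t->count) != 0)
        return 0;
    t->func(t->data);
    return 1;
}

static unsigned long cnt_seen;     /* accumulates the data of each run */

static void cnt_body(unsigned long data)
{
    cnt_seen += data;
}

/* Returns 1 if the tasklet ran only after the second enable. */
int cnt_demo(void)
{
    struct cnt_tasklet t = { .func = cnt_body, .data = 5 };
    int r1, r2, r3;

    atomic_init(&t.count, 0);
    cnt_disable(&t);
    cnt_disable(&t);
    r1 = cnt_try_run(&t);          /* disabled twice */
    cnt_enable(&t);
    r2 = cnt_try_run(&t);          /* still disabled once */
    cnt_enable(&t);
    r3 = cnt_try_run(&t);          /* enabled again */
    return r1 == 0 && r2 == 0 && r3 == 1 && cnt_seen == 5;
}
```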

3.3 initialization and execution of tasklet descriptor

Function tasklet_init() is used to initialize a specified tasklet descriptor. Its source code is as follows (kernel/softirq.c):

void tasklet_init(struct tasklet_struct *t,
		  void (*func)(unsigned long), unsigned long data)
{
	t->next = NULL;
	t->state = 0;
	atomic_set(&t->count, 0);
	t->func = func;
	t->data = data;
}

Function tasklet_kill() is used to kill a scheduled tasklet and restore it to an unscheduled state. The source code is as follows (kernel/softirq.c):

void tasklet_kill(struct tasklet_struct *t)
{
	if (in_interrupt())
		pr_notice("Attempt to kill tasklet from interrupt\n");

	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
		do {
			yield();
		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
	}
	tasklet_unlock_wait(t);
	clear_bit(TASKLET_STATE_SCHED, &t->state);
}

3.4 tasklet queue

Multiple tasklets can be linked into a singly linked queue through the next member pointer in the tasklet descriptor. To this end, Linux defines the data structure tasklet_head (in kernel/softirq.c in Linux 5.4) to describe the head of a tasklet queue. As follows:

//In the Linux-5.4\kernel\softirq.c file
/*
 * Tasklets
 */
struct tasklet_head
{
	struct tasklet_struct *head;
	struct tasklet_struct **tail;
};

Although the tasklet mechanism is an implementation specific to the soft interrupt vectors HI_SOFTIRQ and TASKLET_SOFTIRQ, it still belongs to the overall softirq framework, so its design and implementation must also follow the idea of "who triggers, who executes". Therefore, Linux defines a tasklet queue head for each CPU in the system, representing the tasklet queue that each CPU should execute. As follows (kernel/softirq.c, with the per-CPU macros from include/linux/percpu-defs.h):

/*
 * Variant on the per-CPU variable declaration/definition theme used for
 * ordinary per-CPU variables.
 */
#define DECLARE_PER_CPU(type, name)					\
	DECLARE_PER_CPU_SECTION(type, name, "")

#define DEFINE_PER_CPU(type, name)					\
	DEFINE_PER_CPU_SECTION(type, name, "")
static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec);
static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec);

Conceptually (and literally in older kernels), this amounts to:

struct tasklet_head tasklet_vec[NR_CPUS] __cacheline_aligned;
struct tasklet_head tasklet_hi_vec[NR_CPUS] __cacheline_aligned;

Here, the tasklet_vec[] array is used for the soft interrupt vector TASKLET_SOFTIRQ, and the tasklet_hi_vec[] array for the soft interrupt vector HI_SOFTIRQ. That is, if CPUi (0 ≤ i ≤ NR_CPUS-1) triggers the soft interrupt vector TASKLET_SOFTIRQ, every tasklet in the queue tasklet_vec[i] is executed by CPUi while it services TASKLET_SOFTIRQ. Similarly, if CPUi triggers the soft interrupt vector HI_SOFTIRQ, every tasklet in the queue tasklet_hi_vec[i] is executed by CPUi while it services HI_SOFTIRQ.
How are the tasklets in the queues tasklet_vec[i] and tasklet_hi_vec[i] executed by CPUi? The key lies in the soft interrupt service routines of TASKLET_SOFTIRQ and HI_SOFTIRQ: the tasklet_action() and tasklet_hi_action() functions. Let's analyze these two functions.

3.5 soft interrupt vector TASKLET_SOFTIRQ and HI_SOFTIRQ

Linux implements dedicated trigger functions and dedicated softirq service functions for the soft interrupt vectors TASKLET_SOFTIRQ and HI_SOFTIRQ.

  • Dedicated trigger functions

The tasklet_schedule() and tasklet_hi_schedule() functions trigger the soft interrupt vectors TASKLET_SOFTIRQ and HI_SOFTIRQ, respectively, on the current CPU, and add the specified tasklet to the corresponding tasklet queue of the current CPU to wait for execution.

  • Dedicated softirq service functions

The tasklet_action() and tasklet_hi_action() functions are the softirq service functions of TASKLET_SOFTIRQ and HI_SOFTIRQ, respectively. In the initialization function softirq_init(), the action function pointers of the descriptors softirq_vec[0] (HI_SOFTIRQ) and softirq_vec[6] (TASKLET_SOFTIRQ) are initialized to point to tasklet_hi_action() and tasklet_action(), respectively.

3.5.1 tasklet_schedule(), the trigger function of soft interrupt vector TASKLET_SOFTIRQ

This function is an inline function implemented in the include/linux/interrupt.h header file; the helpers it calls live in kernel/softirq.c. The source code is as follows:

static inline void tasklet_schedule(struct tasklet_struct *t)
{
	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
		__tasklet_schedule(t);
}

/* kernel/softirq.c */
void __tasklet_schedule(struct tasklet_struct *t)
{
	__tasklet_schedule_common(t, &tasklet_vec,
				  TASKLET_SOFTIRQ);
}

static void __tasklet_schedule_common(struct tasklet_struct *t,
				      struct tasklet_head __percpu *headp,
				      unsigned int softirq_nr)
{
	struct tasklet_head *head;
	unsigned long flags;

	local_irq_save(flags);
	head = this_cpu_ptr(headp);
	t->next = NULL;
	*head->tail = t;
	head->tail = &(t->next);
	raise_softirq_irqoff(softirq_nr);
	local_irq_restore(flags);
}
tasklet_schedule() works as follows:

  • test_and_set_bit() sets bit 0 (the TASKLET_STATE_SCHED bit) of the state member of the tasklet to be scheduled to 1 and returns the bit's previous value. If the previous value was already 1, the tasklet has already been scheduled (on this or another CPU) and has not run yet. Because a tasklet may sit in only one queue at a time, tasklet_schedule() then returns without doing anything. Otherwise, scheduling continues with the following steps.

  • First, local_irq_save() disables interrupts on the current CPU so that the following steps execute atomically on the current CPU.

  • Then, the tasklet is added to the tail of the tasklet queue of the current CPU.

  • Next, raise_softirq_irqoff() raises the softirq TASKLET_SOFTIRQ on the current CPU.

  • Finally, local_irq_restore() restores the interrupt state of the current CPU.
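
The two ideas in these steps, the "already scheduled" test and the O(1) append through a tail pointer-to-pointer, can be tried out in user space. The following is only a minimal sketch, not kernel code: the toy_* names are hypothetical, a C11 atomic_flag stands in for the TASKLET_STATE_SCHED bit, and interrupt masking and the softirq raise are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for tasklet_struct and tasklet_head. */
struct toy_tasklet {
	struct toy_tasklet *next;
	atomic_flag sched;		/* models TASKLET_STATE_SCHED */
};

struct toy_head {
	struct toy_tasklet *head;
	struct toy_tasklet **tail;	/* address of the last next pointer */
};

static void toy_head_init(struct toy_head *h)
{
	h->head = NULL;
	h->tail = &h->head;		/* empty queue: tail points at head */
}

static void toy_tasklet_init(struct toy_tasklet *t)
{
	t->next = NULL;
	atomic_flag_clear(&t->sched);
}

/* Returns 1 if the tasklet was queued, 0 if it was already pending. */
static int toy_schedule(struct toy_head *h, struct toy_tasklet *t)
{
	assert(h->tail != NULL);	/* queue must have been initialized */

	/* Like test_and_set_bit(): only the first caller sees the flag clear. */
	if (atomic_flag_test_and_set(&t->sched))
		return 0;

	/* Append in O(1) without walking the list. */
	t->next = NULL;
	*h->tail = t;
	h->tail = &t->next;
	return 1;
}

/* Counts the queued tasklets; only used to inspect the sketch. */
static size_t toy_length(const struct toy_head *h)
{
	size_t n = 0;

	for (const struct toy_tasklet *t = h->head; t; t = t->next)
		n++;
	return n;
}
```

Calling toy_schedule() twice on the same tasklet queues it only once, which mirrors the behaviour that a tasklet scheduled again before it has run still runs only once.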

3.5.2 tasklet_action(), the service routine of soft interrupt vector TASKLET_SOFTIRQ

The function tasklet_action() is the link between the tasklet mechanism and the soft interrupt vector TASKLET_SOFTIRQ. It is this function that runs each tasklet in the current CPU's tasklet queue on the current CPU. It is implemented in the kernel/softirq.c file, and its source code is as follows:

static __latent_entropy void tasklet_action(struct softirq_action *a)
{
	tasklet_action_common(a, this_cpu_ptr(&tasklet_vec), TASKLET_SOFTIRQ);
}

static void tasklet_action_common(struct softirq_action *a,
				  struct tasklet_head *tl_head,
				  unsigned int softirq_nr)
{
	struct tasklet_struct *list;

	local_irq_disable();
	list = tl_head->head;
	tl_head->head = NULL;
	tl_head->tail = &tl_head->head;
	local_irq_enable();

	while (list) {
		struct tasklet_struct *t = list;

		list = list->next;

		if (tasklet_trylock(t)) {
			if (!atomic_read(&t->count)) {
				if (!test_and_clear_bit(TASKLET_STATE_SCHED,
							&t->state))
					BUG();
				t->func(t->data);
				tasklet_unlock(t);
				continue;
			}
			tasklet_unlock(t);
		}

		local_irq_disable();
		t->next = NULL;
		*tl_head->tail = t;
		tl_head->tail = &t->next;
		__raise_softirq_irqoff(softirq_nr);
		local_irq_enable();
	}
}

  • First, with interrupts disabled on the current CPU (local_irq_disable()), read the head pointer of the current CPU's tasklet queue and save it in the local variable list, then reset the queue to empty (head set to NULL, tail pointing back at head) and re-enable interrupts. In theory the current CPU now has no more tasklets to execute, but in practice this is not necessarily so, as the last step below shows.
  • Then, a while loop traverses the queue pointed to by list. Each element in the queue is a tasklet to be executed on the current CPU. The loop body works as follows:
  • The pointer t refers to the current queue element, that is, the tasklet to be executed.
  • The list pointer is updated to list->next so that it points to the next tasklet to be executed.
  • tasklet_trylock() tries to lock the tasklet pointed to by t. If locking succeeds (no other CPU is currently executing this tasklet), atomic_read() checks the value of the count member. If count is 0, the tasklet is allowed to run, so:
  1. Clear the TASKLET_STATE_SCHED bit first;
  2. Then call the tasklet's handler function func;
  3. Call tasklet_unlock() to clear the TASKLET_STATE_RUN bit;
  4. Finally, execute the continue statement to skip the remaining steps and move on to the next element in the queue. If count is not 0, the tasklet is disabled, so tasklet_unlock() is called to clear the TASKLET_STATE_RUN bit that tasklet_trylock() set.
  • If the lock could not be taken or the tasklet is disabled, the tasklet is appended back to the current CPU's queue with interrupts disabled, and __raise_softirq_irqoff() raises the softirq again so that the tasklet is retried later. This is why the queue may not be empty after all.
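
The detach, run and re-queue logic of this loop can be modelled in user space. The sketch below is only an illustration under simplifying assumptions: it is single-threaded, so the tasklet_trylock() step and the interrupt masking are omitted, and every toy_* name (including the sample handler toy_count_handler) is hypothetical rather than part of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a tasklet and its per-CPU queue. */
struct toy_tasklet {
	struct toy_tasklet *next;
	int sched;			/* models TASKLET_STATE_SCHED */
	int count;			/* non-zero means disabled */
	void (*func)(unsigned long);
	unsigned long data;
};

struct toy_head {
	struct toy_tasklet *head;
	struct toy_tasklet **tail;
};

/* Sample handler for experiments: accumulates its argument. */
static unsigned long toy_ran;

static void toy_count_handler(unsigned long data)
{
	toy_ran += data;
}

/* Runs every enabled tasklet once; returns how many were re-queued. */
static int toy_action(struct toy_head *h)
{
	struct toy_tasklet *list = h->head;
	int requeued = 0;

	/* Detach the whole list so the queue can accept new entries. */
	h->head = NULL;
	h->tail = &h->head;

	while (list) {
		struct toy_tasklet *t = list;

		list = list->next;

		if (t->count == 0) {
			assert(t->sched);	/* the kernel calls BUG() here */
			t->sched = 0;
			t->func(t->data);
			continue;
		}

		/* Disabled: append it again and leave it pending. */
		t->next = NULL;
		*h->tail = t;
		h->tail = &t->next;
		requeued++;
	}
	return requeued;
}
```

A disabled tasklet therefore survives the pass with its pending flag still set and runs on a later pass once its count drops back to 0.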

3.6 tasklet Usage Summary

1. Declare a tasklet. In most cases, a tasklet is the best choice for implementing the lower half of the handler for an ordinary hardware device. Tasklets are easy to use and fast to execute, and they can be created either statically or dynamically. The choice depends on whether you want a direct or an indirect reference to the tasklet. To create a tasklet statically (that is, to reference it directly), use one of the following two macros:
DECLARE_TASKLET(name, func, data)
DECLARE_TASKLET_DISABLED(name, func, data)
Both macros statically create a tasklet_struct structure with the given name. When the tasklet is scheduled, the given function func is executed with data as its argument. The two macros differ in the initial value of the count field: the first sets it to 0, so the tasklet is enabled; the second sets it to 1, so the tasklet is disabled. For example:
DECLARE_TASKLET(my_tasklet, my_tasklet_handler, dev);
This line of code is equivalent to
struct tasklet_struct my_tasklet = { NULL, 0, ATOMIC_INIT(0), my_tasklet_handler, dev };
This creates a tasklet named my_tasklet whose handler is my_tasklet_handler and which is already enabled. When the handler is called, dev is passed to it.

2. Write your own tasklet handler. The handler must have the following prototype:

void tasklet_handler(unsigned long data)
Since tasklets cannot sleep, semaphores and other blocking functions cannot be used in a tasklet. However, a tasklet runs with interrupts enabled, so it can be interrupted while it is running.

3. Schedule your tasklet. Call tasklet_schedule() and pass it a pointer to the corresponding tasklet_struct; the tasklet will be scheduled to run at an appropriate time:
tasklet_schedule(&my_tasklet); /* mark my_tasklet as pending */
Once scheduled, a tasklet runs as soon as possible. If the same tasklet is scheduled again before it gets a chance to run, it still runs only once.

You can call tasklet_disable() to disable a given tasklet. If the tasklet is currently executing, this function waits until it has finished before returning. Call tasklet_enable() to enable a tasklet. A tasklet created with DECLARE_TASKLET_DISABLED() must also be enabled with this function before it can run. For example:

tasklet_disable(&my_tasklet); /* the tasklet is now disabled and cannot run */

tasklet_enable(&my_tasklet); /* the tasklet is now enabled */

You can also call tasklet_kill() to remove a tasklet from the pending queue. Its argument is a pointer to the tasklet's tasklet_struct. This is useful when dealing with a tasklet that reschedules itself. The function first waits for the tasklet to finish executing and then removes it.

4. Simple usage of tasklet

The following is a simple loadable module that uses a tasklet.

#include <linux/module.h>
#include <linux/init.h>
#include <linux/interrupt.h>

static struct tasklet_struct my_tasklet;

static void tasklet_handler(unsigned long data)
{
	printk(KERN_ALERT "tasklet_handler is running.\n");
}

static int __init test_init(void)
{
	/* Initialize the tasklet and schedule it once. */
	tasklet_init(&my_tasklet, tasklet_handler, 0);
	tasklet_schedule(&my_tasklet);
	return 0;
}

static void __exit test_exit(void)
{
	tasklet_kill(&my_tasklet);
	printk(KERN_ALERT "test_exit is running.\n");
}
module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");

As this example shows, the tasklet mechanism provides a way to run lower half code: the deferred processing is implemented in tasklet_handler(), wrapped in a tasklet, and handed over to the kernel for execution.

4, Work queue mechanism for interrupt processing

A work queue is another way of deferring work, different from the tasklet discussed earlier. A work queue defers work to a kernel thread, so this lower half runs in process context. Code executed through a work queue therefore enjoys all the advantages of process context. Most importantly, work queue handlers may be rescheduled and may even sleep.
So when should you use a work queue and when a tasklet? If the deferred task needs to sleep, choose a work queue; if it does not, choose a tasklet. If you need a schedulable entity to perform the lower half processing, you should also use a work queue. It is the only lower half mechanism that runs in process context, and the only one that can sleep. This makes it very useful when you need to allocate a lot of memory, acquire semaphores, or perform blocking I/O. If you do not need a kernel thread to defer the work, consider using a tasklet.

4.1 work queue

As mentioned earlier, a deferred task is called work and is described by the data structure work_struct. Work items are organized into a queue called a work queue, described by the data structure workqueue_struct, and worker threads are responsible for executing the work in a work queue. The system's default worker thread is events; you can also create your own worker threads. A work item is represented by the work_struct structure defined in <linux/workqueue.h>:

struct work_struct {
	atomic_long_t data;
	struct list_head entry;	/* links the work item into a list of pending work */
	work_func_t func;	/* function to execute */
#ifdef CONFIG_LOCKDEP
	struct lockdep_map lockdep_map;
#endif
};

typedef void (*work_func_t)(struct work_struct *work);

These structures are linked into a list. When a worker thread wakes up, it executes all the work on its list. After a work item has been executed, its work_struct object is removed from the list. When the list is empty, the worker thread goes back to sleep. The work queue itself is represented by the workqueue_struct structure defined in kernel/workqueue.c:

/*
 * The externally visible workqueue.  It relays the issued work items to
 * the appropriate worker_pool through its pool_workqueues.
 */
struct workqueue_struct {
	struct list_head	pwqs;		/* WR: all pwqs of this wq */
	struct list_head	list;		/* PR: list of all workqueues */

	struct mutex		mutex;		/* protects this wq */
	int			work_color;	/* WQ: current work color */
	int			flush_color;	/* WQ: current flush color */
	atomic_t		nr_pwqs_to_flush; /* flush in progress */
	struct wq_flusher	*first_flusher;	/* WQ: first flusher */
	struct list_head	flusher_queue;	/* WQ: flush waiters */
	struct list_head	flusher_overflow; /* WQ: flush overflow list */

	struct list_head	maydays;	/* MD: pwqs requesting rescue */
	struct worker		*rescuer;	/* I: rescue worker */

	int			nr_drainers;	/* WQ: drain in progress */
	int			saved_max_active; /* WQ: saved pwq max_active */

	struct workqueue_attrs	*unbound_attrs;	/* PW: only for unbound wqs */
	struct pool_workqueue	*dfl_pwq;	/* PW: only for unbound wqs */

#ifdef CONFIG_SYSFS
	struct wq_device	*wq_dev;	/* I: for sysfs interface */
#endif
#ifdef CONFIG_LOCKDEP
	char			*lock_name;
	struct lock_class_key	key;
	struct lockdep_map	lockdep_map;
#endif
	char			name[WQ_NAME_LEN]; /* I: workqueue name */

	/*
	 * Destruction of workqueue_struct is RCU protected to allow walking
	 * the workqueues list without grabbing wq_pool_mutex.
	 * This is used to dump all workqueues from sysrq.
	 */
	struct rcu_head		rcu;

	/* hot fields used during command issue, aligned to cacheline */
	unsigned int		flags ____cacheline_aligned; /* WQ: WQ_* flags */
	struct pool_workqueue __percpu *cpu_pwqs; /* I: per-cpu pwqs */
	struct pool_workqueue __rcu *numa_pwq_tbl[]; /* PWR: unbound pwqs indexed by node */
};

4.2 create deferred work

4.2.1 create work_struct statically

To use a work queue, the first thing to do is to create the work that needs to be deferred. You can use DECLARE_WORK to build this structure statically at compile time, or INIT_WORK to initialize an existing structure at run time:

DECLARE_WORK(name, func);  or  INIT_WORK(_work, _func)

It is defined as follows:

#define DECLARE_WORK(n, f)					\
	struct work_struct n = __WORK_INITIALIZER(n, f)

Examples are as follows:

/* kernel/power/poweroff.c */
static void do_poweroff(struct work_struct *dummy)
{
	kernel_power_off();
}

static DECLARE_WORK(poweroff_work, do_poweroff);

That is, a file-scope static variable is created, static struct work_struct poweroff_work, already initialized with do_poweroff as its handler function.

4.2.2 create work_struct dynamically

First define a struct work_struct variable, then call INIT_WORK to initialize it when needed; after that it can be used.

#define INIT_WORK(_work, _func)					\
	do {							\
		__INIT_WORK((_work), (_func), 0);		\
	} while (0)

For example:

void __cfg80211_scan_done(struct work_struct *wk)
{
	struct cfg80211_registered_device *rdev;

	/* Recover the enclosing device from the embedded work item. */
	rdev = container_of(wk, struct cfg80211_registered_device,
			    scan_done_wk);
	___cfg80211_scan_done(rdev, false);
}

struct cfg80211_registered_device {
	/* ... */
	struct work_struct scan_done_wk;
	struct work_struct sched_scan_results_wk;
	struct work_struct conn_work;
	struct work_struct event_work;
	struct cfg80211_wowlan *wowlan;
	/* ... */
};

struct cfg80211_registered_device *rdev;

rdev = kzalloc(alloc_size, GFP_KERNEL);
/* ... */
INIT_WORK(&rdev->scan_done_wk, __cfg80211_scan_done);	/* handler: __cfg80211_scan_done */
INIT_WORK(&rdev->sched_scan_results_wk, __cfg80211_sched_scan_results);
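
The handler above receives only a pointer to the work_struct, so it relies on container_of() to get back to the structure that embeds it. The following user-space sketch shows the same pattern in isolation. It is an illustration only: toy_container_of is a simplified version of the kernel macro (without its type checking), and toy_work, toy_device and the other toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_work {
	void (*func)(struct toy_work *work);
};

/* Hypothetical device that embeds a work item, like scan_done_wk in rdev. */
struct toy_device {
	int id;
	int scans_done;
	struct toy_work scan_done_wk;
};

/* The handler only gets the work pointer and recovers the device from it. */
static void toy_scan_done(struct toy_work *wk)
{
	struct toy_device *dev =
		toy_container_of(wk, struct toy_device, scan_done_wk);

	dev->scans_done++;
}

/* Stand-in for the worker thread: it knows nothing about toy_device. */
static void toy_run_work(struct toy_work *wk)
{
	assert(wk != NULL && wk->func != NULL);
	wk->func(wk);
}
```

Embedding the work item in the device structure is what lets one handler serve many devices without any global lookup table.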

4.3 scheduling of work

Now that the work has been created, we can schedule it. To submit a work item's handler to the default events worker thread, just call: bool schedule_work(struct work_struct *work);
It puts the work into the global work queue system_wq. It is defined as follows:

/**
 * schedule_work - put work task in global workqueue
 * @work: job to be done
 *
 * Returns %false if @work was already on the kernel-global workqueue and
 * %true otherwise.
 *
 * This puts a job in the kernel-global workqueue if it was not already
 * queued and leaves it in the same position on the kernel-global
 * workqueue otherwise.
 */
static inline bool schedule_work(struct work_struct *work)
{
	return queue_work(system_wq, work);
}

/*
 * System-wide workqueues which are always present.
 *
 * system_wq is the one used by schedule[_delayed]_work[_on]().
 * Multi-CPU multi-threaded.  There are users which expect relatively
 * short queue flush time.  Don't queue works which can run for too
 * long.
 */
extern struct workqueue_struct *system_wq;

queue_work() puts a work item into a given work queue:

/**
 * queue_work - queue work on a workqueue
 * @wq: workqueue to use
 * @work: work to queue
 *
 * Returns %false if @work was already on a queue, %true otherwise.
 *
 * We queue the work to the CPU on which it was submitted, but if the CPU dies
 * it can be processed by another CPU.
 */
static inline bool queue_work(struct workqueue_struct *wq,
			      struct work_struct *work)
{
	return queue_work_on(WORK_CPU_UNBOUND, wq, work);
}

/**
 * queue_work_on - queue work on specific cpu
 * @cpu: CPU number to execute work on
 * @wq: workqueue to use
 * @work: work to queue
 *
 * We queue the work to a specific CPU, the caller must ensure it
 * can't go away.
 *
 * Return: %false if @work was already on a queue, %true otherwise.
 */
bool queue_work_on(int cpu, struct workqueue_struct *wq,
		   struct work_struct *work)
{
	bool ret = false;
	unsigned long flags;

	local_irq_save(flags);

	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
		__queue_work(cpu, wq, work);
		ret = true;
	}

	local_irq_restore(flags);
	return ret;
}

Once work is put into a work queue, it is scheduled immediately and runs as soon as a worker thread on its processor is woken up. Sometimes you do not want the work to run immediately but after some delay. In that case, it can be scheduled to run after a specified time:

/**
 * schedule_delayed_work - put work task in global workqueue after delay
 * @dwork: job to be done
 * @delay: number of jiffies to wait or 0 for immediate execution
 *
 * After waiting for a given time this puts a job in the kernel-global
 * workqueue.
 */
static inline bool schedule_delayed_work(struct delayed_work *dwork,
					 unsigned long delay)
{
	return queue_delayed_work(system_wq, dwork, delay);
}

#define DECLARE_DELAYED_WORK(n, f)					\
	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, 0)

#define INIT_DELAYED_WORK(_work, _func)					\
	__INIT_DELAYED_WORK(_work, _func, 0)
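
Note that the delay is expressed in jiffies (timer ticks), not in milliseconds. Kernel code normally converts with the msecs_to_jiffies() helper, for example schedule_delayed_work(&dwork, msecs_to_jiffies(10)). The user-space sketch below only illustrates the rounding involved; TOY_HZ = 250 is an assumed tick rate (the real HZ is a kernel configuration option), and toy_msecs_to_jiffies is a hypothetical simplification that is valid only when the tick rate divides 1000.

```c
#include <assert.h>

/* Assumed tick rate for this sketch; the real HZ depends on the kernel config. */
#define TOY_HZ 250UL

static_assert(1000UL % TOY_HZ == 0, "sketch assumes the tick rate divides 1000");

/*
 * Convert milliseconds to ticks, rounding up so that the delay is never
 * shorter than requested.
 */
static unsigned long toy_msecs_to_jiffies(unsigned long ms)
{
	const unsigned long ms_per_tick = 1000UL / TOY_HZ;

	return (ms + ms_per_tick - 1) / ms_per_tick;
}
```

With a 4 ms tick, a 10 ms request becomes 3 ticks, so the work may start up to 2 ms later than asked but never earlier.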

4.4 creating worker threads

After work is put into a work queue, worker threads execute it. A work queue is created with alloc_workqueue() or create_singlethread_workqueue(), and the kernel threads that serve it are ultimately created through kthread_create(); a thread dedicated to one work queue carries the name passed to alloc_workqueue(). For example, the kernel creates its default work queues during initialization as follows:

/**
 * workqueue_init_early - early init for workqueue subsystem
 *
 * This is the first half of two-staged workqueue subsystem initialization
 * and invoked as soon as the bare basics - memory allocation, cpumasks and
 * idr are up.  It sets up all the data structures and system workqueues
 * and allows early boot code to create workqueues and queue/cancel work
 * items.  Actual work item execution starts only after kthreads can be
 * created and scheduled right before early initcalls.
 */
int __init workqueue_init_early(void)
{
	int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
	int i, cpu;

	WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));

	BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
	cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags));

	pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);

	/* initialize CPU pools */
	for_each_possible_cpu(cpu) {
		struct worker_pool *pool;

		i = 0;
		for_each_cpu_worker_pool(pool, cpu) {
			BUG_ON(init_worker_pool(pool));
			pool->cpu = cpu;
			cpumask_copy(pool->attrs->cpumask, cpumask_of(cpu));
			pool->attrs->nice = std_nice[i++];
			pool->node = cpu_to_node(cpu);

			/* alloc pool ID */
			mutex_lock(&wq_pool_mutex);
			BUG_ON(worker_pool_assign_id(pool));
			mutex_unlock(&wq_pool_mutex);
		}
	}

	/* create default unbound and ordered wq attrs */
	for (i = 0; i < NR_STD_WORKER_POOLS; i++) {
		struct workqueue_attrs *attrs;

		BUG_ON(!(attrs = alloc_workqueue_attrs()));
		attrs->nice = std_nice[i];
		unbound_std_wq_attrs[i] = attrs;

		/*
		 * An ordered wq should have only one pwq as ordering is
		 * guaranteed by max_active which is enforced by pwqs.
		 * Turn off NUMA so that dfl_pwq is used for all nodes.
		 */
		BUG_ON(!(attrs = alloc_workqueue_attrs()));
		attrs->nice = std_nice[i];
		attrs->no_numa = true;
		ordered_wq_attrs[i] = attrs;
	}

	system_wq = alloc_workqueue("events", 0, 0);
	system_highpri_wq = alloc_workqueue("events_highpri", WQ_HIGHPRI, 0);
	system_long_wq = alloc_workqueue("events_long", 0, 0);
	system_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND,
					    WQ_UNBOUND_MAX_ACTIVE);
	system_freezable_wq = alloc_workqueue("events_freezable",
					      WQ_FREEZABLE, 0);
	system_power_efficient_wq = alloc_workqueue("events_power_efficient",
					      WQ_POWER_EFFICIENT, 0);
	system_freezable_power_efficient_wq = alloc_workqueue("events_freezable_power_efficient",
					      WQ_FREEZABLE | WQ_POWER_EFFICIENT,
					      0);
	BUG_ON(!system_wq || !system_highpri_wq || !system_long_wq ||
	       !system_unbound_wq || !system_freezable_wq ||
	       !system_power_efficient_wq ||
	       !system_freezable_power_efficient_wq);

	return 0;
}

/**
 * workqueue_init - bring workqueue subsystem fully online
 *
 * This is the latter half of two-staged workqueue subsystem initialization
 * and invoked as soon as kthreads can be created and scheduled.
 * Workqueues have been created and work items queued on them, but there
 * are no kworkers executing the work items yet.  Populate the worker pools
 * with the initial workers and enable future kworker creations.
 */
int __init workqueue_init(void)
{
	struct workqueue_struct *wq;
	struct worker_pool *pool;
	int cpu, bkt;

	/*
	 * It'd be simpler to initialize NUMA in workqueue_init_early() but
	 * CPU to node mapping may not be available that early on some
	 * archs such as power and arm64.  As per-cpu pools created
	 * previously could be missing node hint and unbound pools NUMA
	 * affinity, fix them up.
	 *
	 * Also, while iterating workqueues, create rescuers if requested.
	 */
	wq_numa_init();

	mutex_lock(&wq_pool_mutex);

	for_each_possible_cpu(cpu) {
		for_each_cpu_worker_pool(pool, cpu) {
			pool->node = cpu_to_node(cpu);
		}
	}

	list_for_each_entry(wq, &workqueues, list) {
		wq_update_unbound_numa(wq, smp_processor_id(), true);
		WARN(init_rescuer(wq),
		     "workqueue: failed to create early rescuer for %s",
		     wq->name);
	}

	mutex_unlock(&wq_pool_mutex);

	/* create the initial workers */
	for_each_online_cpu(cpu) {
		for_each_cpu_worker_pool(pool, cpu) {
			pool->flags &= ~POOL_DISASSOCIATED;
			BUG_ON(!create_worker(pool));
		}
	}

	hash_for_each(unbound_pool_hash, bkt, pool, hash_node)
		BUG_ON(!create_worker(pool));

	wq_online = true;
	wq_watchdog_init();

	return 0;
}

For example, cfg80211_wq = create_singlethread_workqueue("cfg80211"); creates a kernel thread named cfg80211.

4.5 simple application of work queue

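#include <linux/module.h>
#include <linux/init.h>
#include <linux/workqueue.h>

static struct workqueue_struct *queue;
static struct work_struct work;

static void work_handler(struct work_struct *data)
{
	printk(KERN_ALERT "work handler function.\n");
}

static int __init test_init(void)
{
	/* Create a single-threaded work queue */
	queue = create_singlethread_workqueue("helloworld");
	if (!queue)
		goto err;

	/* Bind the handler to the work item and queue it once */
	INIT_WORK(&work, work_handler);
	queue_work(queue, &work);
	return 0;
err:
	return -ENOMEM;
}

static void __exit test_exit(void)
{
	/* Waits for pending work to finish, then frees the queue */
	destroy_workqueue(queue);
}
module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");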


Tags: Linux Embedded system

Posted on Tue, 05 Oct 2021 13:54:30 -0400 by tisource