[PATCH v3 3/3] Make finish_task_switch and its subfuncs inline in context switching

Thomas Gleixner tglx at linutronix.de
Fri Nov 14 12:00:43 PST 2025


On Thu, Nov 13 2025 at 18:52, Xie Yuanbin wrote:

What are subfuncs? This is not an SMS service. Use proper words and not
made-up abbreviations.

> `finish_task_switch` is a hot path in context switching, and due to

Same comment as before about functions....

> possible mitigations inside switch_mm, performance here is greatly
> affected by function calls and branch jumps. Make it inline to optimize
> the performance.

Again you mark them __always_inline and not inline. Most of them are
already 'inline'. Can you please be precise in your wording?
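
The difference between the two markings can be shown in plain C. This is a
minimal user-space sketch, assuming GCC or Clang: the macro mirrors the
kernel's definition of __always_inline, and the function names (hint_add,
forced_add, hot_path) are hypothetical and not taken from the patch.

```c
/* User-space stand-in for the kernel macro; assumes GCC or Clang. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/*
 * Plain 'inline' is only a hint. The compiler's heuristics decide, and
 * they may emit an out-of-line copy plus a call, for example when the
 * number of call sites grows or when optimizing for size.
 */
static inline int hint_add(int a, int b)
{
	return a + b;
}

/*
 * The always_inline attribute asks the compiler to expand the body at
 * every call site regardless of those heuristics.
 */
static __always_inline int forced_add(int a, int b)
{
	return a + b;
}

/* Hypothetical caller standing in for a hot path. */
int hot_path(int x)
{
	return hint_add(x, 1) + forced_add(x, 2);
}
```

Both helpers compute the same result; only the inlining guarantee differs,
which is why a change log should say which of the two the patch applies.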

> After `finish_task_switch` is changed to an inline function, the number of
> calls to the subfunctions (called by `finish_task_switch`) increases in
> this translation unit due to the inline expansion of `finish_task_switch`.
> Due to compiler optimization strategies, these functions may transition
> from inline functions to non inline functions, which can actually lead to
> performance degradation.

I'm having a hard time understanding this word salad.

> Make the subfunctions of finish_task_switch inline to prevent
> degradation.
>
> Perf test:
> Time spent on calling finish_task_switch (rdtsc):

What does (rdtsc) mean?
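
For context, a measurement based on rdtsc typically reads the x86 time
stamp counter before and after the code of interest. The sketch below
assumes that is what was done here, which the patch does not state. The
function names (read_ticks, workload, measure_ticks) are hypothetical, and
the non-x86 fallback to clock() exists only so that the example builds
elsewhere.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>	/* provides __rdtsc() on GCC and Clang */
static inline uint64_t read_ticks(void)
{
	return __rdtsc();
}
#else
#include <time.h>
/* Fallback for non-x86 builds: processor time in clock() units. */
static inline uint64_t read_ticks(void)
{
	return (uint64_t)clock();
}
#endif

/* Hypothetical workload standing in for the code being measured. */
static uint64_t workload(uint64_t n)
{
	volatile uint64_t sum = 0;

	for (uint64_t i = 0; i < n; i++)
		sum += i;
	return sum;
}

/*
 * Returns the counter delta across one run of the workload. On x86 the
 * result is in TSC ticks, not nanoseconds, so any table built from such
 * values has to state the unit.
 */
uint64_t measure_ticks(uint64_t n)
{
	uint64_t start = read_ticks();

	workload(n);
	return read_ticks() - start;
}
```

A raw delta like this also includes the cost of reading the counter itself,
which is one more reason to spell out the method and the units.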

>  | compiler && appended cmdline | without patch   | with patch    |
>  | gcc + NA                     | 13.93 - 13.94   | 12.39 - 12.44 |

What is NA and what are the time units of this?

>  | gcc + "spectre_v2_user=on"   | 24.69 - 24.85   | 13.68 - 13.73 |
>  | clang + NA                   | 13.89 - 13.90   | 12.70 - 12.73 |
>  | clang + "spectre_v2_user=on" | 29.00 - 29.02   | 18.88 - 18.97 |

So the real benefit is observable when spectre_v2_user mitigations are
enabled. You completely fail to explain that.

> Perf test info:
> 1. kernel source:
> linux-next
> commit 9c0826a5d9aa4d52206d ("Add linux-next specific files for 20251107")
> 2. compiler:
> gcc: gcc version 15.2.0 (Debian 15.2.0-7) with
> GNU ld (GNU Binutils for Debian) 2.45
> clang: Debian clang version 21.1.4 (8) with
> Debian LLD 21.1.4 (compatible with GNU linkers)
> 3. config:
> based on default x86_64_defconfig, and setting:
> CONFIG_HZ=100
> CONFIG_DEBUG_ENTRY=n
> CONFIG_X86_DEBUG_FPU=n
> CONFIG_EXPERT=y
> CONFIG_MODIFY_LDT_SYSCALL=n
> CONFIG_CGROUPS=n
> CONFIG_BLK_DEV_NVME=y

This really can go into the comment section below the first '---'
separator. No point in having this in the change log.

> Size test:
> bzImage size:
>  | compiler | without patches | with patches  |
>  | clang    | 13722624        | 13722624      |
>  | gcc      | 12596224        | 12596224      |

bzImage size is completely irrelevant. What's interesting is how the
size of the actual function changes.

> Size test info:
> 1. kernel source && compiler: same as above
> 2. config:
> based on default x86_64_defconfig, and setting:
> CONFIG_SCHED_CORE=y
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_NO_HZ_FULL=y

And again, we all know how to build a kernel.
