[PATCH v3 3/3] Make finish_task_switch and its subfuncs inline in context switching
Thomas Gleixner
tglx at linutronix.de
Fri Nov 14 12:00:43 PST 2025
On Thu, Nov 13 2025 at 18:52, Xie Yuanbin wrote:
What are subfuncs? This is not an SMS service. Use proper words and not
made-up abbreviations.
> `finish_task_switch` is a hot path in context switching, and due to
Same comment as before about functions....
> possible mitigations inside switch_mm, performance here is greatly
> affected by function calls and branch jumps. Make it inline to optimize
> the performance.
Again you mark them __always_inline and not inline. Most of them are
already 'inline'. Can you please be precise in your wording?
> After `finish_task_switch` is changed to an inline function, the number of
> calls to the subfunctions (called by `finish_task_switch`) increases in
> this translation unit due to the inline expansion of `finish_task_switch`.
> Due to compiler optimization strategies, these functions may transition
> from inline functions to non inline functions, which can actually lead to
> performance degradation.
I'm having a hard time understanding this word salad.
> Make the subfunctions of finish_task_switch inline to prevent
> degradation.
>
> Perf test:
> Time spent on calling finish_task_switch (rdtsc):
What does (rdtsc) mean?
> | compiler && appended cmdline | without patch | with patch |
> | gcc + NA | 13.93 - 13.94 | 12.39 - 12.44 |
What is NA and what are the time units of this?
> | gcc + "spectre_v2_user=on" | 24.69 - 24.85 | 13.68 - 13.73 |
> | clang + NA | 13.89 - 13.90 | 12.70 - 12.73 |
> | clang + "spectre_v2_user=on" | 29.00 - 29.02 | 18.88 - 18.97 |
So the real benefit is observable when spectre_v2_user mitigations are
enabled. You completely fail to explain that.
> Perf test info:
> 1. kernel source:
> linux-next
> commit 9c0826a5d9aa4d52206d ("Add linux-next specific files for 20251107")
> 2. compiler:
> gcc: gcc version 15.2.0 (Debian 15.2.0-7) with
> GNU ld (GNU Binutils for Debian) 2.45
> clang: Debian clang version 21.1.4 (8) with
> Debian LLD 21.1.4 (compatible with GNU linkers)
> 3. config:
> base on default x86_64_defconfig, and setting:
> CONFIG_HZ=100
> CONFIG_DEBUG_ENTRY=n
> CONFIG_X86_DEBUG_FPU=n
> CONFIG_EXPERT=y
> CONFIG_MODIFY_LDT_SYSCALL=n
> CONFIG_CGROUPS=n
> CONFIG_BLK_DEV_NVME=y
This really can go into the comment section below the first '---'
separator. No point in having this in the change log.
> Size test:
> bzImage size:
> | compiler | without patches | with patches |
> | clang | 13722624 | 13722624 |
> | gcc | 12596224 | 12596224 |
bzImage size is completely irrelevant. What's interesting is how the
size of the actual function changes.
> Size test info:
> 1. kernel source && compiler: same as above
> 2. config:
> base on default x86_64_defconfig, and setting:
> CONFIG_SCHED_CORE=y
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_NO_HZ_FULL=y
And again, we all know how to build a kernel.