[PATCH v4 0/3] Optimize code generation during context switching

Xie Yuanbin qq570070308 at gmail.com
Sun Nov 30 04:43:04 PST 2025


On Sun, 23 Nov 2025 20:18:24 +0800, Xie Yuanbin wrote:
> This series of patches primarily make some functions called in context
> switching as always inline to optimize performance. Here is the
> performance test data for these patches:
> Time spent on calling finish_task_switch(), the unit is tsc from x86:
>  | test scenario             | old   | new   | delta          |
>  | gcc 15.2                  | 13.94 | 12.40 | 1.54  (-11.1%) |
>  | gcc 15.2 + spectre_v2     | 24.78 | 13.70 | 11.08 (-44.7%) |
>  | clang 21.1.4              | 13.90 | 12.71 | 1.19  (- 8.6%) |
>  | clang 21.1.4 + spectre_v2 | 29.01 | 18.91 | 10.1  (-34.8%) |

Hi everyone, I also ran a performance test on a Raspberry Pi 3B. I
hope this will be helpful in merging the patch.
The following is the test data:
Time spent on calling finish_task_switch(); the clocksource and unit
are cntvct_el0 on aarch64:
 | test scenario             | old  | new  | delta         |
 | gcc 15.2                  | 2.00 | 1.68 | 0.32 (-16.0%) |
 | clang 21.1.6              | 2.15 | 1.68 | 0.47 (-21.9%) |
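
For readers who want to compare these numbers with the x86 tsc data
quoted above: cntvct_el0 counts at the rate reported by cntfrq_el0, not
at the cpu clock, so the ticks have to be converted before comparing.
Below is a rough userspace sketch of that conversion. It is not part of
the patch series and not the code used for the measurement; the helper
names are made up for illustration, and the counter frequency must be
read from the actual board rather than assumed.
```c
#include <stdint.h>

/*
 * Convert generic-timer ticks to nanoseconds.
 * freq_hz is the counter frequency (the value of cntfrq_el0).
 * Whole seconds and the remainder are converted separately so that
 * large tick counts do not overflow 64 bits.
 */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	uint64_t secs = ticks / freq_hz;
	uint64_t rem = ticks % freq_hz;

	return secs * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}

#ifdef __aarch64__
/* Read the virtual counter; the isb orders the read after prior code. */
static inline uint64_t read_cntvct(void)
{
	uint64_t v;

	__asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(v) : : "memory");
	return v;
}

/* Read the counter frequency in Hz. */
static inline uint64_t read_cntfrq(void)
{
	uint64_t v;

	__asm__ volatile("mrs %0, cntfrq_el0" : "=r"(v));
	return v;
}
#endif
```
Since the averages in the table are fractional, it would be more
accurate to convert the accumulated tick total and then divide by the
number of calls, instead of converting a rounded per-call value.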

Since the Raspberry Pi 3B uses a Cortex-A53 processor, it is not
affected by the spectre v2 vulnerability, as defined in
arch/arm64/kernel/proton-pack.c:
```c
	static const struct midr_range spectre_v2_safe_list[] = {
		MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
		MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
		MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
		MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
		MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
		MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_2XX_SILVER),
		MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
		MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
		{ /* sentinel */ }
	};
```
Link: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/kernel/proton-pack.c?id=7d31f578f3230f3b7b33b0930b08f9afd8429817#n152
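
To illustrate how a sentinel-terminated list like this is typically
matched against a cpu's MIDR value, here is a simplified userspace
sketch. The sketch_* names are hypothetical stand-ins, not the kernel's
helpers, and the real check in proton-pack.c takes more into account
than this list alone. The sketch only compares the implementer,
architecture and part number fields and ignores variant and revision.
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's MIDR helpers. */
#define SKETCH_MIDR_MODEL_MASK	0xff0ffff0u	/* implementer, arch, part */
#define SKETCH_MIDR_MODEL(imp, part) \
	(((uint32_t)(imp) << 24) | (0xfu << 16) | ((uint32_t)(part) << 4))

#define SKETCH_CORTEX_A53	SKETCH_MIDR_MODEL(0x41, 0xd03)
#define SKETCH_CORTEX_A55	SKETCH_MIDR_MODEL(0x41, 0xd05)

struct sketch_midr_entry {
	uint32_t model;		/* 0 marks the sentinel */
};

static const struct sketch_midr_entry sketch_safe_list[] = {
	{ SKETCH_CORTEX_A53 },
	{ SKETCH_CORTEX_A55 },
	{ 0 },			/* sentinel */
};

/* Walk the list until the sentinel, matching on the model fields only. */
static bool sketch_midr_in_list(uint32_t midr,
				const struct sketch_midr_entry *list)
{
	for (; list->model; list++) {
		if ((midr & SKETCH_MIDR_MODEL_MASK) == list->model)
			return true;
	}
	return false;
}
```
With these definitions, a Cortex-A53 MIDR such as 0x410fd034 matches
the list, while a Cortex-A72 MIDR such as 0x410fd083 (the core used in
the Raspberry Pi 4B) does not.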

Perhaps I can test the performance with the spectre_v2 mitigation
enabled on a Raspberry Pi 4B in the future.

To make the test results stable, I fixed the cpu frequency by setting
config.txt as follows:
```config
core_freq_fixed=1
arm_freq=800
arm_freq_min=800
gpu_freq=300
core_freq=300
h264_freq=300
isp_freq=300
v3d_freq=300
hevc_freq=300
sdram_freq=400
gpu_freq_min=300
core_freq_min=300
h264_freq_min=300
isp_freq_min=300
v3d_freq_min=300
hevc_freq_min=300
sdram_freq_min=400
```

The test source is commit 7d31f578f323 ("Add linux-next specific files
for 20251128") from the linux-next branch, built with the default
defconfig plus the following settings:
CONFIG_ARM64_SVE=n
CONFIG_COMPAT=n
CONFIG_COMPAT_32BIT_TIME=n
CONFIG_ARM64_PTR_AUTH=n
CONFIG_ARM64_GCS=n
CONFIG_ARM64_MTE=n
CONFIG_SHADOW_CALL_STACK=y
CONFIG_SCHED_AUTOGROUP=n
CONFIG_CGROUPS=n
CONFIG_KVM=n
CONFIG_HZ_100=y
CONFIG_HZ=100

Thanks very much!

Xie Yuanbin


