[RFC PATCH] riscv: Optimize gcd() performance by selecting CPU_NO_EFFICIENT_FFS
Kuan-Wei Chiu
visitorckw at gmail.com
Sun Feb 16 17:37:08 PST 2025
When the Zbb extension is not supported, ffs() falls back to a software
implementation instead of leveraging the hardware ctz instruction for
fast computation. In such cases, selecting CPU_NO_EFFICIENT_FFS
optimizes the efficiency of gcd().
The implementation of gcd() depends on the CPU_NO_EFFICIENT_FFS option.
With hardware support for ffs, the binary GCD algorithm is used.
Without it, the odd-even GCD algorithm is employed for better
performance.
Co-developed-by: Yu-Chun Lin <eleanor15x at gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x at gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw at gmail.com>
---
Although selecting NO_EFFICIENT_FFS seems reasonable without ctz
instructions, this patch hasn't been tested on real hardware. We'd
greatly appreciate it if someone could help test and provide
performance numbers!
arch/riscv/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 7612c52e9b1e..2dd3699ad09b 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -91,6 +91,7 @@ config RISCV
select CLINT_TIMER if RISCV_M_MODE
select CLONE_BACKWARDS
select COMMON_CLK
+ select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
select EDAC_SUPPORT
select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
--
2.34.1
More information about the linux-riscv
mailing list