[RFC PATCH] riscv: Optimize gcd() performance by selecting CPU_NO_EFFICIENT_FFS
Alexandre Ghiti
alex at ghiti.fr
Fri Mar 28 07:07:36 PDT 2025
Hi Kuan-Wei,
First sorry for the late review.
On 17/02/2025 02:37, Kuan-Wei Chiu wrote:
> When the Zbb extension is not supported, ffs() falls back to a software
> implementation instead of leveraging the hardware ctz instruction for
> fast computation. In such cases, selecting CPU_NO_EFFICIENT_FFS
> optimizes the efficiency of gcd().
>
> The implementation of gcd() depends on the CPU_NO_EFFICIENT_FFS option.
> With hardware support for ffs, the binary GCD algorithm is used.
> Without it, the odd-even GCD algorithm is employed for better
> performance.
>
> Co-developed-by: Yu-Chun Lin <eleanor15x at gmail.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x at gmail.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw at gmail.com>
> ---
> Although selecting NO_EFFICIENT_FFS seems reasonable without ctz
> instructions, this patch hasn't been tested on real hardware. We'd
> greatly appreciate it if someone could help test and provide
> performance numbers!
>
> arch/riscv/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7612c52e9b1e..2dd3699ad09b 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -91,6 +91,7 @@ config RISCV
> select CLINT_TIMER if RISCV_M_MODE
> select CLONE_BACKWARDS
> select COMMON_CLK
> + select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
> select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
> select EDAC_SUPPORT
> select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
So your patch is correct. But a kernel built with RISCV_ISA_ZBB does not
mean the platform supports zbb and in that case, we'd still use the slow
version of gcd().
Then I would use static keys instead, can you try to come up with a
patch that does that?
Thanks,
Alex
More information about the linux-riscv
mailing list