[RFC PATCH] riscv: Optimize gcd() performance by selecting CPU_NO_EFFICIENT_FFS

Alexandre Ghiti alex at ghiti.fr
Fri Mar 28 07:07:36 PDT 2025


Hi Kuan-Wei,

First, sorry for the late review.

On 17/02/2025 02:37, Kuan-Wei Chiu wrote:
> When the Zbb extension is not supported, ffs() falls back to a software
> implementation instead of leveraging the hardware ctz instruction for
> fast computation. In such cases, selecting CPU_NO_EFFICIENT_FFS
> optimizes the efficiency of gcd().
>
> The implementation of gcd() depends on the CPU_NO_EFFICIENT_FFS option.
> With hardware support for ffs, the binary GCD algorithm is used.
> Without it, the odd-even GCD algorithm is employed for better
> performance.
>
> Co-developed-by: Yu-Chun Lin <eleanor15x at gmail.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x at gmail.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw at gmail.com>
> ---
> Although selecting NO_EFFICIENT_FFS seems reasonable without ctz
> instructions, this patch hasn't been tested on real hardware. We'd
> greatly appreciate it if someone could help test and provide
> performance numbers!
>
>   arch/riscv/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7612c52e9b1e..2dd3699ad09b 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -91,6 +91,7 @@ config RISCV
>   	select CLINT_TIMER if RISCV_M_MODE
>   	select CLONE_BACKWARDS
>   	select COMMON_CLK
> +	select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
>   	select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
>   	select EDAC_SUPPORT
>   	select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)


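Your patch is correct as far as it goes. However, a kernel built with
RISCV_ISA_ZBB does not guarantee that the platform actually supports Zbb
at runtime. In that case, CPU_NO_EFFICIENT_FFS is not selected, and we
would still use the slow, ffs-based version of gcd() on hardware that has
no ctz instruction.

So I would use static keys instead, so that the choice is made at runtime
rather than at build time. Can you try to come up with a patch that does
that?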
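
To make the idea concrete, here is a rough userspace sketch (not kernel
code) of choosing the gcd() variant at runtime. Everything in it is an
assumption made for illustration: the flag have_efficient_ffs is a
hypothetical stand-in for a static key, __builtin_ctzl() stands in for
__ffs(), and the two helpers are simplified variants, not the actual
lib/math/gcd.c implementations. In the kernel, the flag would presumably
be declared with DEFINE_STATIC_KEY_FALSE(), tested with
static_branch_likely(), and enabled once Zbb is detected at boot.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a static key; set once Zbb is detected. */
static bool have_efficient_ffs;

/* Variant for CPUs with a fast count-trailing-zeros instruction. */
static unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned long t;
	int shift;

	if (!a || !b)
		return r;

	shift = __builtin_ctzl(r);	/* common power of two */
	a >>= __builtin_ctzl(a);	/* make a odd */
	for (;;) {
		b >>= __builtin_ctzl(b);	/* make b odd */
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;			/* odd - odd is even, possibly zero */
		if (!b)
			return a << shift;
	}
}

/* Variant that strips trailing zeros one bit at a time (no ctz needed). */
static unsigned long gcd_shift(unsigned long a, unsigned long b)
{
	unsigned long t;
	unsigned int shift = 0;

	if (!a || !b)
		return a | b;

	while (!((a | b) & 1)) {	/* factor out common powers of two */
		a >>= 1;
		b >>= 1;
		shift++;
	}
	while (!(a & 1))
		a >>= 1;
	for (;;) {
		while (!(b & 1))
			b >>= 1;
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;
		if (!b)
			return a << shift;
	}
}

/* Runtime dispatch; a static key would turn this test into a patched jump. */
unsigned long gcd(unsigned long a, unsigned long b)
{
	if (have_efficient_ffs)
		return gcd_ctz(a, b);
	return gcd_shift(a, b);
}
```

Both variants return the same result for any input, so flipping the flag
only changes how fast the answer is computed.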

Thanks,

Alex



