[PATCH] riscv: lib: optimize strlen loop efficiency

Sun Jan 25 19:05:06 PST 2026

On 2026/1/24 16:14, Paul Walmsley wrote:
> On Thu, 15 Jan 2026, Feng Jiang wrote:
> 
>> On 2026/1/15 10:03, Paul Walmsley wrote:
>>> On Thu, 18 Dec 2025, Feng Jiang wrote:
>>>
>>>> Optimize the generic strlen implementation by using a pre-decrement
>>>> pointer. This reduces the loop body from 4 instructions to 3 and
>>>> eliminates the unconditional jump ('j').
>>>>
>>>> Old loop (4 instructions, 2 branches):
>>>>   1: lbu t0, 0(t1); beqz t0, 2f; addi t1, t1, 1; j 1b
>>>>
>>>> New loop (3 instructions, 1 branch):
>>>>   1: addi t1, t1, 1; lbu t0, 0(t1); bnez t0, 1b
>>>>
>>>> This change improves execution efficiency and reduces branch pressure
>>>> for systems without the Zbb extension.
>>>
>>> Looks reasonable; do you have any benchmarks on hardware that you can 
>>> share?  Any reason why this patch stands alone and isn't rolled up as part 
>>> of your "optimize string function" series?
>>
>> Thanks for the feedback.
>>
>> This patch predates the rest of the series, which is why it wasn't included
>> in the 'optimize string function' rollup. At the time, I focused on correctness
>> testing and observed the improvement through cycle counts read with rdcycle.
>>
>> Since the series still needs further refinement and may take longer to
>> complete, I was hoping this standalone optimization could be considered independently.
> 
> Ok.  Queued for v6.20.
> 
> Might be worth taking a look at David's suggestions for a followup patch?
> 

Thanks for queuing this!

I am definitely planning to study David's suggestions. He has also provided a lot
of valuable feedback on my other patch series, and I will explore further improvements
for a follow-up patch.

-- 
With Best Regards,
Feng Jiang