[PATCH] riscv: lib: optimize strlen loop efficiency

Wed Jan 14 19:23:13 PST 2026

On 2026/1/15 10:03, Paul Walmsley wrote:
> On Thu, 18 Dec 2025, Feng Jiang wrote:
> 
>> Optimize the generic strlen implementation by using a pre-decrement
>> pointer. This reduces the loop body from 4 instructions to 3 and
>> eliminates the unconditional jump ('j').
>>
>> Old loop (4 instructions, 2 branches):
>>   1: lbu t0, 0(t1); beqz t0, 2f; addi t1, t1, 1; j 1b
>>
>> New loop (3 instructions, 1 branch):
>>   1: addi t1, t1, 1; lbu t0, 0(t1); bnez t0, 1b
>>
>> This change improves execution efficiency and reduces branch pressure
>> for systems without the Zbb extension.
> 
> Looks reasonable; do you have any benchmarks on hardware that you can 
> share?  Any reason why this patch stands alone and isn't rolled up as part 
> of your "optimize string function" series?

Thanks for the feedback.

This patch predates the rest of the series, which is why it wasn't included
in the 'optimize string function' rollup. At the time, I focused on correctness
testing and observed the improvement through rdcycle instruction counts.

Since the series still needs further refinement and may take a longer time to
complete, I was hoping this standalone optimization could be considered independently.
However, I am also happy to roll it into the series if you prefer.

-- 
With Best Regards,
Feng Jiang