[PATCH 4/5] arm64: lib: Use MOPS for memcpy() routines

Kristina Martsenko kristina.martsenko at arm.com
Thu Oct 3 09:46:08 PDT 2024


On 02/10/2024 16:29, Catalin Marinas wrote:
> On Mon, Sep 30, 2024 at 05:10:50PM +0100, Kristina Martsenko wrote:
>> diff --git a/arch/arm64/lib/memcpy.S b/arch/arm64/lib/memcpy.S
>> index 4ab48d49c451..9b99106fb95f 100644
>> --- a/arch/arm64/lib/memcpy.S
>> +++ b/arch/arm64/lib/memcpy.S
>> @@ -57,7 +57,7 @@
>>     The loop tail is handled by always copying 64 bytes from the end.
>>  */
>>  
>> -SYM_FUNC_START(__pi_memcpy)
>> +SYM_FUNC_START_LOCAL(__pi_memcpy_generic)
>>  	add	srcend, src, count
>>  	add	dstend, dstin, count
>>  	cmp	count, 128
>> @@ -238,7 +238,24 @@ L(copy64_from_start):
>>  	stp	B_l, B_h, [dstin, 16]
>>  	stp	C_l, C_h, [dstin]
>>  	ret
>> +SYM_FUNC_END(__pi_memcpy_generic)
>> +
>> +#ifdef CONFIG_AS_HAS_MOPS
>> +	.arch_extension mops
>> +SYM_FUNC_START(__pi_memcpy)
>> +alternative_if_not ARM64_HAS_MOPS
>> +	b	__pi_memcpy_generic
>> +alternative_else_nop_endif
> 
> I'm fine with patching the branch but I wonder whether, for the time
> being, we should use alternative_if instead and the NOP to fall through to
> the default implementation. The hardware in the field doesn't have
> FEAT_MOPS yet and they may see a slight penalty introduced by the
> branch, especially for small memcpys. Just guessing, I haven't done any
> benchmarks.

My thinking was that this way it doesn't have to be changed again in the
future. But I'm fine with switching to alternative_if for v2.

Thanks,
Kristina