[PATCH 4/5] arm64: lib: Use MOPS for memcpy() routines
Catalin Marinas
catalin.marinas at arm.com
Wed Oct 2 08:29:13 PDT 2024
On Mon, Sep 30, 2024 at 05:10:50PM +0100, Kristina Martsenko wrote:
> diff --git a/arch/arm64/lib/memcpy.S b/arch/arm64/lib/memcpy.S
> index 4ab48d49c451..9b99106fb95f 100644
> --- a/arch/arm64/lib/memcpy.S
> +++ b/arch/arm64/lib/memcpy.S
> @@ -57,7 +57,7 @@
> The loop tail is handled by always copying 64 bytes from the end.
> */
>
> -SYM_FUNC_START(__pi_memcpy)
> +SYM_FUNC_START_LOCAL(__pi_memcpy_generic)
> add srcend, src, count
> add dstend, dstin, count
> cmp count, 128
> @@ -238,7 +238,24 @@ L(copy64_from_start):
> stp B_l, B_h, [dstin, 16]
> stp C_l, C_h, [dstin]
> ret
> +SYM_FUNC_END(__pi_memcpy_generic)
> +
> +#ifdef CONFIG_AS_HAS_MOPS
> + .arch_extension mops
> +SYM_FUNC_START(__pi_memcpy)
> +alternative_if_not ARM64_HAS_MOPS
> + b __pi_memcpy_generic
> +alternative_else_nop_endif
I'm fine with patching the branch, but I wonder whether, for the time
being, we should use alternative_if instead, so that the default path is
a NOP falling through to the generic implementation. Hardware in the
field doesn't have FEAT_MOPS yet and may see a slight penalty from the
always-taken branch, especially for small memcpys. Just guessing,
though; I haven't done any benchmarks.
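For illustration, the inverted arrangement would look roughly like the
sketch below. This is only a sketch of the suggestion, not a tested
patch: the __pi_memcpy_mops label is hypothetical (the MOPS sequence
would need its own label once it no longer sits on the fall-through
path), and the generic body shown elided would follow inline:

```asm
SYM_FUNC_START(__pi_memcpy)
alternative_if ARM64_HAS_MOPS
	/* Patched in at boot only on CPUs with FEAT_MOPS. */
	b	__pi_memcpy_mops
alternative_else_nop_endif
	/*
	 * On current hardware this is a NOP and we fall straight
	 * through to the generic copy, with no taken branch.
	 */
	add	srcend, src, count
	add	dstend, dstin, count
	...
SYM_FUNC_END(__pi_memcpy)
```

With this layout the common case on today's CPUs costs one NOP rather
than an unconditional branch, at the price of an extra branch on
FEAT_MOPS hardware.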
--
Catalin