[PATCH v2 0/5] xor/arm: Replace vectorized version with intrinsics

Ard Biesheuvel ardb+git at google.com
Tue Mar 31 00:49:40 PDT 2026


From: Ard Biesheuvel <ardb at kernel.org>

Replace the compiler vectorized XOR implementation for ARM with the
existing NEON intrinsics implementation used by arm64. This is slightly
faster, and allows some minor cleanups of the type hacks in the headers
now that intrinsics are the only C code permitted to use FP/SIMD
instructions.

Changes since v1:
- Update kernel_mode_neon.rst to state that arm_neon.h must not be
  included directly, but the new asm/neon-intrinsics.h should be used
  instead
- Avoid #include's of .c files - instead, build arm/xor-neon.c for arm64
  as a separate compilation unit, and export the symbol that is shared
  between the EOR and EOR3 implementations.

Performance (QEMU mach-virt VM running on Synquacer [Cortex-A53 @ 1 GHz]

Before:

[    3.519687] xor: measuring software checksum speed
[    3.521725]    neon            :  1660 MB/sec
[    3.524733]    32regs          :  1105 MB/sec
[    3.527751]    8regs           :  1098 MB/sec
[    3.529911]    arm4regs        :  1540 MB/sec

After:

[    3.517654] xor: measuring software checksum speed
[    3.519454]    neon            :  1896 MB/sec
[    3.522499]    32regs          :  1090 MB/sec
[    3.525560]    8regs           :  1083 MB/sec
[    3.527700]    arm4regs        :  1556 MB/sec

This applies onto Christoph's XOR cleanup series.

Cc: Christoph Hellwig <hch at lst.de>
Cc: Russell King <linux at armlinux.org.uk>
Cc: Arnd Bergmann <arnd at arndb.de>
Cc: Eric Biggers <ebiggers at kernel.org>

Ard Biesheuvel (5):
  ARM: Add a neon-intrinsics.h header like on arm64
  crypto: aegis128 - Use neon-intrinsics.h on ARM too
  xor/arm: Replace vectorized implementation with arm64's intrinsics
  xor/arm64: Use shared NEON intrinsics implementation from 32-bit ARM
  ARM: Remove hacked-up asm/types.h header

 Documentation/arch/arm/kernel_mode_neon.rst |   4 +-
 arch/arm/include/asm/neon-intrinsics.h      |  64 +++++++
 arch/arm/include/uapi/asm/types.h           |  41 -----
 crypto/aegis128-neon-inner.c                |   4 +-
 lib/raid/xor/Makefile                       |   3 +-
 lib/raid/xor/arm/xor-neon.c                 | 187 ++++++++++++++++++--
 lib/raid/xor/arm/xor-neon.h                 |   7 +
 lib/raid/xor/arm/xor_arch.h                 |   7 +-
 lib/raid/xor/arm64/xor-neon.c               | 172 +-----------------
 lib/raid/xor/xor-8regs.c                    |   2 -
 10 files changed, 251 insertions(+), 240 deletions(-)
 create mode 100644 arch/arm/include/asm/neon-intrinsics.h
 delete mode 100644 arch/arm/include/uapi/asm/types.h
 create mode 100644 lib/raid/xor/arm/xor-neon.h

-- 
2.53.0.1018.g2bb0e51243-goog




More information about the linux-arm-kernel mailing list