[PATCH 0/5] crc64: Tweak intrinsics code and enable it for ARM

Thu Apr 2 01:52:17 PDT 2026

On Wed, 1 Apr 2026, at 21:59, Eric Biggers wrote:
> On Mon, Mar 30, 2026 at 04:46:31PM +0200, Ard Biesheuvel wrote:
>> Apply some tweaks to the new arm64 crc64 NEON intrinsics code, and wire
>> it up for the 32-bit ARM build. Note that true 32-bit ARM CPUs usually
>> don't implement the prerequisite 64x64 PMULL instructions, but 32-bit
>> kernels are commonly used on 64-bit capable hardware too, which do
>> implement the 32-bit versions of the crypto instructions if they are
>> implemented for the 64-bit ISA (as per the architecture).
>> 
>> Cc: Demian Shulhan <demyansh at gmail.com>
>> Cc: Eric Biggers <ebiggers at kernel.org>
>> 
>> Ard Biesheuvel (5):
>>   lib/crc: arm64: Drop unnecessary chunking logic from crc64
>>   lib/crc: arm64: Use existing macros for kernel-mode FPU cflags
>>   ARM: Add a neon-intrinsics.h header like on arm64
>>   lib/crc: arm64: Simplify intrinsics implementation
>>   lib/crc: arm: Enable arm64's NEON intrinsics implementation of crc64
>
> I think patches 3 and 4 should be swapped, so it's cleanups first (which
> make sense regardless of the 32-bit ARM support) and then the 32-bit ARM
> support.
>

Ok.

> I do think we should be aware that even with the code mostly shared
> using the NEON intrinsics, the 32-bit ARM support (which works only on
> CPUs that support PMULL, i.e. are also 64-bit capable) doesn't come for
> free.  We should expect to deal with occasional issues related to the
> intrinsics with certain compiler versions, compiler flags, etc.
>
> I assume that "32-bit kernels on ARMv8 CPUs" is currently still a big
> enough niche to bother with this, despite that niche getting smaller
> over time.

Running a 32-bit kernel on 64-bit capable hardware is usually done to reduce the RAM footprint, and that problem hasn't gotten any smaller lately. And a 20x speedup is rather significant.

>  But as I mentioned I do think we should try to simplify it
> as much as possible, e.g. by supporting little-endian only and avoiding
> #ifdefs based on things like the compiler whenever possible.
>

Sure. The only reason I think this is worth the effort is that the same code can be used on ARM and arm64, so once that is no longer the case, I don't think we should bother.

So it makes sense to apply this reasoning to little endian as well - arm64 supports it, so we can support it on ARM too.