[PATCH 0/4] arm64: advertise availability of CRC and crypto instructions
Ard Biesheuvel
ard.biesheuvel at linaro.org
Wed Dec 18 16:57:33 EST 2013
On 18 December 2013 22:18, Nicolas Pitre <nicolas.pitre at linaro.org> wrote:
> On Wed, 18 Dec 2013, Ard Biesheuvel wrote:
>> The capabilities in question are:
>> * AES
>> * 64 bit polynomial (carry-less) multiply
>> * SHA1
>> * SHA2
>> * CRC32
>> and it is up to the implementor to choose the combination. To me,
>> there are no obviously more likely combinations, but perhaps others
>> have other ideas?
>
> In any case, I agree with Russell that this looks a bit excessive to
> have a single bit for individual instructions.
They are not in fact individual instructions:
* AES -> aese, aesd, aesmc, aesimc
* SHA1 -> sha1c, sha1h, sha1m, sha1p, sha1su0, sha1su1
* SHA2 -> sha256h, sha256h2, sha256su0, sha256su1
* CRC32 -> crc32, crc32c
The only feature that covers a single instruction is poly64 multiply.
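
To illustrate what one bit per feature group looks like from user space, here is a minimal sketch. The CAP_* names and bit positions are hypothetical placeholders, not values from this series; only getauxval() and AT_HWCAP are existing interfaces.

```c
#include <stdio.h>
#include <sys/auxv.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define CAP_AES   (1UL << 0)
#define CAP_PMULL (1UL << 1)
#define CAP_SHA1  (1UL << 2)
#define CAP_SHA2  (1UL << 3)
#define CAP_CRC32 (1UL << 4)

/* Returns non-zero when every bit in `wanted` is set in `hwcap`. */
int hwcap_has(unsigned long hwcap, unsigned long wanted)
{
    return (hwcap & wanted) == wanted;
}

/* Example policy: an AES-GCM routine needs both AES and poly64 multiply. */
int can_use_hw_gcm(unsigned long hwcap)
{
    return hwcap_has(hwcap, CAP_AES | CAP_PMULL);
}

/* Prints the five groups for the running process. The output is only
 * meaningful on a kernel that really uses the placeholder bits above. */
void print_crypto_caps(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);

    printf("aes=%d pmull=%d sha1=%d sha2=%d crc32=%d\n",
           hwcap_has(hwcap, CAP_AES), hwcap_has(hwcap, CAP_PMULL),
           hwcap_has(hwcap, CAP_SHA1), hwcap_has(hwcap, CAP_SHA2),
           hwcap_has(hwcap, CAP_CRC32));
}
```

A library would test a whole group with one mask check rather than probing each instruction separately.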
> The current hwcaps is
> certainly not suitable for that level of granularity without a way to
> extend it.
>
There is a way to extend it. It is called HWCAP2, and it is already in
use by powerpc.
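
As a rough sketch of how a second word extends the capability space, user space could treat AT_HWCAP and AT_HWCAP2 as one 64-entry index. The helper names are hypothetical, and the 32-bits-per-word split is an assumption matching a 32-bit ARM hwcap word.

```c
#include <sys/auxv.h>

#ifndef AT_HWCAP2
#define AT_HWCAP2 26 /* Linux auxv type; defined here for older libc headers */
#endif

/* Capability indices 0..31 live in hwcap, 32..63 in hwcap2.
 * Anything beyond that is reported as absent. */
int cap_index_set(unsigned long hwcap, unsigned long hwcap2, unsigned idx)
{
    if (idx < 32)
        return (int)((hwcap >> idx) & 1UL);
    if (idx < 64)
        return (int)((hwcap2 >> (idx - 32)) & 1UL);
    return 0;
}

/* Query the running process. getauxval() returns 0 for an auxv entry
 * the kernel does not supply, so a missing HWCAP2 reads as "no bits". */
int process_has_cap(unsigned idx)
{
    return cap_index_set(getauxval(AT_HWCAP), getauxval(AT_HWCAP2), idx);
}
```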
> What does the ARM ARM say about those instructions? Are they
> individually optional?
>
Yes, the features (mostly) are. The register is called ID_ISAR5, although
the way the bits map to the features is a bit odd. AES, SHA1, SHA2 and
CRC can be enabled independently; only poly64 multiply (oddly) implies
AES support as well.
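
The mapping can be sketched as a decoder. The field layout below is an assumption recalled from the ARMv8 documentation (4-bit fields: AES at bits [7:4], SHA1 at [11:8], SHA2 at [15:12], CRC32 at [19:16], with AES value 2 meaning AES plus poly64 multiply) and should be checked against the ARM ARM. The F_* names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result flags, one per feature group. */
enum {
    F_AES   = 1 << 0,
    F_PMULL = 1 << 1,
    F_SHA1  = 1 << 2,
    F_SHA2  = 1 << 3,
    F_CRC32 = 1 << 4
};

/* Extract the 4-bit ID field that starts at bit `shift`. */
static unsigned id_field(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xfu;
}

/* Decode an ID_ISAR5 value into F_* flags (assumed layout, see above). */
unsigned decode_isar5(uint32_t isar5)
{
    unsigned caps = 0;
    unsigned aes = id_field(isar5, 4);

    if (aes >= 1)
        caps |= F_AES;
    if (aes >= 2)               /* poly64 multiply shares the AES field */
        caps |= F_PMULL;
    if (id_field(isar5, 8) >= 1)
        caps |= F_SHA1;
    if (id_field(isar5, 12) >= 1)
        caps |= F_SHA2;
    if (id_field(isar5, 16) >= 1)
        caps |= F_CRC32;
    return caps;
}
```

Because poly64 multiply is encoded as a higher value of the AES field, no register value can report it without AES, which is the oddity described above.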
>> The nice thing about hwcaps is that it is already integrated into the
>> ifunc resolution done by the loader, which makes it very easy and
>> straightforward to offer alternative implementations of library
>> functions based on CPU capabilities.
>
> The library may as well implement its own ifunc that tests the
> instruction while trapping SIGILL. On those systems with the supported
> instruction there will be no trap. On those that trap, the
> alternative implementation is going to be much slower anyway.
>
True. And the trap still only occurs at load time. But I think we
agree it is essentially a poor man's hwcaps.
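
For reference, a minimal sketch of such a SIGILL probe, using only POSIX calls. The names probe_insn(), insn_present() and insn_absent() are hypothetical; the two stand-ins replace what would be inline assembly executing one instruction of the group under test.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: abandon the probed instruction and jump back. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if try_insn() ran to completion, 0 if it raised SIGILL.
 * Not thread-safe: it uses a global jump buffer and swaps the
 * process-wide SIGILL disposition while probing. */
int probe_insn(void (*try_insn)(void))
{
    struct sigaction sa, old;
    int supported;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* savemask=1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_env, 1) == 0) {
        try_insn();
        supported = 1;
    } else {
        supported = 0;
    }

    sigaction(SIGILL, &old, NULL);
    return supported;
}

/* Stand-in for an instruction the CPU implements. */
void insn_present(void)
{
}

/* Stand-in for an unimplemented instruction: fakes the trap. */
void insn_absent(void)
{
    raise(SIGILL);
}
```

A resolver would call probe_insn() once at load time and cache the result, so the trap cost is paid only once.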
>> As any kind of emulation in the kernel is likely to be slower than an
>> optimized implementation for a CPU without the feature in question,
>> trapping SIGILL to infer hwcaps is probably the only viable alternate
>> approach that does not require a new kernel interface.
>
> True. However the kernel side infrastructure to emulate any instruction
> is already there. So this is just a matter of adding an additional
> entry making the call to the existing libs. At least that would make
> things work in case the user space libs, or some inline assembly in
> application code, is not expecting the lack of hardware support.
>
I agree that this is all feasible. But I still feel that the best
solution is to allocate 5 bits in HWCAP. That leaves us with 5 spares
now and another 32 when we (trivially) enable HWCAP2 for ARM.
Regards,
Ard.