[PATCH 0/4] arm64: advertise availability of CRC and crypto instructions

Ard Biesheuvel ard.biesheuvel at linaro.org
Wed Dec 18 15:26:53 EST 2013


On 18 December 2013 20:57, Nicolas Pitre <nicolas.pitre at linaro.org> wrote:
> On Wed, 18 Dec 2013, Ard Biesheuvel wrote:
>
>> On 18 December 2013 15:27, Christopher Covington <cov at codeaurora.org> wrote:
>> >
>> > I do not think that Russell is the source of the confusion. Ard wrote, "The
>> > idea is that a binary built for ARM will have access to the extended
>> > instructions which ARM64 offers to ARM32 binaries running in 32 bit
>> > compatibility mode (such as AES, SHAx etc)." I think s/ARM64/ARMv8/ is
>> > necessary to make the statement correct, and hopefully less confusing.
>> >
>>
>> My apologies for adding to the confusion (or creating it in the first place).
>>
>> However, the bottom line is that, as the 32 bit and 64 bit kernels are
>> both able to support userland processes running in the execution state
>> that has retroactively been dubbed 'AArch32', they should both honor
>> the same contract with AArch32 userland on how to discover CPU
>> capabilities at runtime. I do understand Russell's reservations about
>> allocating 6 of the remaining 10 hwcaps bits, and I am open to
>> suggestions on a better approach.
>
> What is the reason for eating a grand total of 6 bits at once in the
> first place?
>

I wasn't entirely accurate: it's 5 bits, not 6 ...

> Are those capabilities really going to be independently integrated?  In
> other words, what is the probability for a vendor to integrate some but
> not the others?  If this probability is low then maybe a smaller set of
> wider-covering bits would be good enough in practice, and then some
> kernel emulation could be added for the odd cases.
>

The capabilities in question are:
* AES
* 64 bit polynomial (carry-less) multiply
* SHA1
* SHA2
* CRC32
and it is up to the implementor to choose the combination. To me,
there are no obviously more likely combinations, but perhaps others
have other ideas?

The nice thing about hwcaps is that it is already integrated into the
ifunc resolution done by the loader, which makes it very easy and
straightforward to offer alternative implementations of library
functions based on CPU capabilities.
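
To make that concrete, here is a rough sketch of hwcaps-based selection
between two implementations of a library function. It uses an ordinary
function pointer chosen at runtime rather than a real ifunc resolver, and
the bit value EXAMPLE_HWCAP_CRC32 and all function names are made up for
illustration; no bit had been allocated at the time of this thread. The
"accelerated" routine is only a stand-in that forwards to the generic one.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (glibc 2.16 or later) */

/* HYPOTHETICAL bit position, for illustration only. It is not an
 * allocated hwcap bit on any architecture. */
#define EXAMPLE_HWCAP_CRC32 (1UL << 31)

typedef uint32_t (*crc32_fn)(uint32_t crc, const uint8_t *buf, size_t len);

/* Portable bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_generic(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Stand-in for a version built around the CRC32 instructions. In this
 * sketch it simply forwards to the generic code so that it runs anywhere. */
uint32_t crc32_accel(uint32_t crc, const uint8_t *buf, size_t len)
{
    return crc32_generic(crc, buf, len);
}

/* Pure selection logic: pick an implementation from a hwcaps word. */
crc32_fn select_crc32(unsigned long hwcaps)
{
    return (hwcaps & EXAMPLE_HWCAP_CRC32) ? crc32_accel : crc32_generic;
}

/* Read the hwcaps word the kernel placed in the auxiliary vector. */
crc32_fn resolve_crc32(void)
{
    return select_crc32(getauxval(AT_HWCAP));
}
```

A caller would invoke resolve_crc32() once and keep the returned pointer.
With ifunc, the loader performs the equivalent selection when it binds the
symbol, so callers need no dispatch code of their own.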

Any kind of emulation in the kernel is likely to be slower than a
software implementation optimized for a CPU that lacks the feature in
question, so trapping SIGILL to infer the capabilities is probably the
only viable alternative that does not require a new kernel interface.
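
For reference, a minimal sketch of such a SIGILL probe in userland,
assuming POSIX signals. The names probe_insn, fake_supported_insn and
fake_unsupported_insn are hypothetical; the two "fake" routines only
simulate the two outcomes so that the sketch runs on any machine. A real
probe would instead execute the candidate instruction via inline assembly.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: abandon the probe and jump back into probe_insn(). */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if try_insn() ran to completion, 0 if it raised SIGILL,
 * and -1 if the handler could not be installed. */
int probe_insn(void (*try_insn)(void))
{
    struct sigaction sa, old;
    volatile int supported = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so that siglongjmp() also restores the signal mask,
     * leaving SIGILL unblocked for later probes. */
    if (sigsetjmp(probe_env, 1) == 0) {
        try_insn();
        supported = 1;
    }

    sigaction(SIGILL, &old, NULL);   /* restore the previous handler */
    return supported;
}

/* Stand-in for an instruction the CPU implements: does nothing. */
void fake_supported_insn(void)
{
}

/* Stand-in for an unimplemented instruction: simulates the trap. */
void fake_unsupported_insn(void)
{
    raise(SIGILL);
}
```

Note that the probe temporarily replaces the process-wide SIGILL handler
and uses a static jump buffer, so as written it is not safe to run from
several threads at once.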

Regards,
Ard.


