[PATCH 2/2] arm64/xor: use EOR3 instructions when available

Ard Biesheuvel <ardb at kernel.org>
Mon Dec 13 07:10:50 PST 2021


On Mon, 13 Dec 2021 at 16:05, Catalin Marinas <catalin.marinas at arm.com> wrote:
>
> On Mon, Dec 13, 2021 at 02:33:21PM +0100, Ard Biesheuvel wrote:
> > On Mon, 13 Dec 2021 at 14:25, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > On Tue, Nov 09, 2021 at 01:03:36PM +0100, Ard Biesheuvel wrote:
> > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > index 6f2d3e31fb54..14354acba5b4 100644
> > > > --- a/arch/arm64/Kconfig
> > > > +++ b/arch/arm64/Kconfig
> > > > @@ -2034,6 +2034,9 @@ config SYSVIPC_COMPAT
> > > >       def_bool y
> > > >       depends on COMPAT && SYSVIPC
> > > >
> > > > +config CC_HAVE_SHA3
> > > > +     def_bool $(cc-option, -march=armv8.2-a+sha3)
> > >
> > > Is it the compiler or the assembler that we need to support this? I
> > > think it's sufficient to only check the latter.
> > >
> > > I'd also move it to the ARMv8.2 section.
> > >
> > > > +
> > > >  menu "Power management options"
> > > >
> > > >  source "kernel/power/Kconfig"
> > > > diff --git a/arch/arm64/lib/xor-neon.c b/arch/arm64/lib/xor-neon.c
> > > > index ee4795f3e166..0415cb94c781 100644
> > > > --- a/arch/arm64/lib/xor-neon.c
> > > > +++ b/arch/arm64/lib/xor-neon.c
> > > > @@ -172,6 +172,135 @@ void xor_arm64_neon_5(unsigned long bytes, unsigned long *p1,
> > > >  }
> > > >  EXPORT_SYMBOL(xor_arm64_neon_5);
> > > >
> > > > +static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
> > > > +{
> > > > +     uint64x2_t res;
> > > > +
> > > > +     asm(".arch      armv8.2-a+sha3          \n"
> > > > +         "eor3 %0.16b, %1.16b, %2.16b, %3.16b"
> > > > +         : "=w"(res) : "w"(p), "w"(q), "w"(r));
> > > > +     return res;
> > > > +}
> > >
> > > The .arch here may confuse the compiler/assembler since it overrides any
> > > other .arch. I think this diff on top would do but I haven't extensively
> > > tested it. I can fold it in if you give it a try:
> >
> > I was going to respin this without the static_call changes, since
> > those are not going to land anytime soon,
>
> I thought the generic implementation still works, though not the most
> efficient.
>

It does work, but the existing code already uses function pointers, so
at this point the static_call conversion is just unneeded churn.
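
To illustrate the function-pointer dispatch referred to above, here is a
minimal userspace sketch. It is not the kernel code: the names xor3_fn,
xor3_generic, xor3_eor3, select_xor3 and the cpu_has_sha3 flag are all
hypothetical, and eor3_scalar() is a plain C stand-in for the EOR3
instruction (a three-way XOR), so the example builds on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for a 3-source XOR routine: p1 ^= p2 ^ p3. */
typedef void (*xor3_fn)(size_t bytes, uint64_t *p1,
                        const uint64_t *p2, const uint64_t *p3);

/* Baseline: two separate XOR operations per word. */
static void xor3_generic(size_t bytes, uint64_t *p1,
                         const uint64_t *p2, const uint64_t *p3)
{
    assert(bytes % sizeof(uint64_t) == 0);
    for (size_t i = 0; i < bytes / sizeof(uint64_t); i++) {
        p1[i] ^= p2[i];
        p1[i] ^= p3[i];
    }
}

/* Portable stand-in for EOR3: XOR of three inputs in one step. */
static inline uint64_t eor3_scalar(uint64_t p, uint64_t q, uint64_t r)
{
    return p ^ q ^ r;
}

/* Same result as xor3_generic, expressed through the three-way XOR. */
static void xor3_eor3(size_t bytes, uint64_t *p1,
                      const uint64_t *p2, const uint64_t *p3)
{
    assert(bytes % sizeof(uint64_t) == 0);
    for (size_t i = 0; i < bytes / sizeof(uint64_t); i++)
        p1[i] = eor3_scalar(p1[i], p2[i], p3[i]);
}

/*
 * Pick an implementation once; callers then go through the pointer.
 * cpu_has_sha3 is an assumed flag standing in for a CPU feature check.
 */
static xor3_fn select_xor3(int cpu_has_sha3)
{
    return cpu_has_sha3 ? xor3_eor3 : xor3_generic;
}
```

Since the caller already makes an indirect call through such a pointer,
adding an EOR3 variant only means pointing it at a different routine;
no change to the call sites is needed.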

> > and for this code, it
> > doesn't really matter anyway. I'll fold in your diff and test it as
> > well.
>
> Sounds fine to me.
>
> --
> Catalin


