[PATCH 2/2] arm64/xor: use EOR3 instructions when available

Catalin Marinas catalin.marinas at arm.com
Mon Dec 13 07:05:53 PST 2021


On Mon, Dec 13, 2021 at 02:33:21PM +0100, Ard Biesheuvel wrote:
> On Mon, 13 Dec 2021 at 14:25, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > On Tue, Nov 09, 2021 at 01:03:36PM +0100, Ard Biesheuvel wrote:
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index 6f2d3e31fb54..14354acba5b4 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -2034,6 +2034,9 @@ config SYSVIPC_COMPAT
> > >       def_bool y
> > >       depends on COMPAT && SYSVIPC
> > >
> > > +config CC_HAVE_SHA3
> > > +     def_bool $(cc-option, -march=armv8.2-a+sha3)
> >
> > Is it the compiler or the assembler that we need to support this? I
> > think it's sufficient to only check the latter.
> >
> > I'd also move it to the ARMv8.2 section.
> >
> > > +
> > >  menu "Power management options"
> > >
> > >  source "kernel/power/Kconfig"
> > > diff --git a/arch/arm64/lib/xor-neon.c b/arch/arm64/lib/xor-neon.c
> > > index ee4795f3e166..0415cb94c781 100644
> > > --- a/arch/arm64/lib/xor-neon.c
> > > +++ b/arch/arm64/lib/xor-neon.c
> > > @@ -172,6 +172,135 @@ void xor_arm64_neon_5(unsigned long bytes, unsigned long *p1,
> > >  }
> > >  EXPORT_SYMBOL(xor_arm64_neon_5);
> > >
> > > +static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
> > > +{
> > > +     uint64x2_t res;
> > > +
> > > +     asm(".arch      armv8.2-a+sha3          \n"
> > > +         "eor3 %0.16b, %1.16b, %2.16b, %3.16b"
> > > +         : "=w"(res) : "w"(p), "w"(q), "w"(r));
> > > +     return res;
> > > +}
> >
> > The .arch here may confuse the compiler/assembler since it overrides any
> > other .arch. I think this diff on top would do but I haven't extensively
> > tested it. I can fold it in if you give it a try:
> 
> I was going to respin this without the static_call changes, since
> those are not going to land anytime soon,

I thought the generic implementation still works, even if it is not the
most efficient.

> and for this code, it
> doesn't really matter anyway. I'll fold in your diff and test it as
> well.

Sounds fine to me.

-- 
Catalin
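
For reference, a minimal portable sketch of what the eor3() helper quoted
above computes: a bitwise exclusive OR of three 128-bit vectors in one
operation. The names vec128_model, eor3_model and eor_twice_model are
hypothetical and used only for illustration. The model uses two 64-bit
lanes in place of a NEON register so that it builds on any host; it is not
the kernel code and it says nothing about the follow-up diff mentioned in
the thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit NEON register: two 64-bit lanes. */
typedef struct {
	uint64_t lane[2];
} vec128_model;

/* Models the result of EOR3: res = p ^ q ^ r, computed lane by lane. */
static inline vec128_model eor3_model(vec128_model p, vec128_model q,
				      vec128_model r)
{
	vec128_model res;

	res.lane[0] = p.lane[0] ^ q.lane[0] ^ r.lane[0];
	res.lane[1] = p.lane[1] ^ q.lane[1] ^ r.lane[1];
	return res;
}

/* The same result built from two two-input XORs, as plain EOR would need. */
static inline vec128_model eor_twice_model(vec128_model p, vec128_model q,
					   vec128_model r)
{
	vec128_model tmp, res;

	/* First XOR: combine p and q. */
	tmp.lane[0] = p.lane[0] ^ q.lane[0];
	tmp.lane[1] = p.lane[1] ^ q.lane[1];

	/* Second XOR: fold r into the intermediate value. */
	res.lane[0] = tmp.lane[0] ^ r.lane[0];
	res.lane[1] = tmp.lane[1] ^ r.lane[1];
	return res;
}
```

Both functions return the same value for any inputs. The three-input form
just replaces two dependent XOR steps with one, which is where a benefit in
the xor_arm64_neon_*() style loops would be expected to come from.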
