[PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

Mark Brown broonie at kernel.org
Wed Nov 4 12:50:33 EST 2020


On Tue, Nov 03, 2020 at 06:00:32PM +0000, Dave Martin wrote:
> On Tue, Nov 03, 2020 at 03:34:27PM +0100, Ard Biesheuvel wrote:

> > First of all, I don't think it is safe at the moment to use SVE in the
> > kernel, as we don't preserve all state IIRC. My memory is a bit hazy,

> I'm not convinced that it's safe right now.  SVE in the kernel is
> unsupported, partly due to cost and partly due to the lack of a
> compelling use case.

I think at a minimum we'd want to handle the vector length explicitly
for kernel mode SVE.  Vector length independent code will work most of
the time, but at the very least it feels like a landmine waiting to
cause trouble.  If nothing else there are probably going to be cases
where it makes a difference for performance.  Other than that I'm not
currently seeing any issues, since we're handling SVE in the same paths
we handle the rest of the FPSIMD stuff.
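
As a rough illustration of what "vector length independent" means for
Adler32, here is a sketch in plain C.  It is not SVE code and not kernel
code: the function names and the "vl" parameter are hypothetical, and
"vl" merely stands in for the hardware vector length in bytes.  It is
assumed to be between 1 and 256, the architectural SVE maximum.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u	/* largest prime below 2^16 */

/* Byte-at-a-time reference Adler-32. */
static uint32_t adler32_scalar(const uint8_t *buf, size_t len)
{
	uint32_t a = 1, b = 0;

	for (size_t i = 0; i < len; i++) {
		a = (a + buf[i]) % ADLER_MOD;
		b = (b + a) % ADLER_MOD;
	}
	return (b << 16) | a;
}

/*
 * Block-wise Adler-32.  Each iteration consumes up to vl bytes, the way
 * a vector loop would consume one vector per iteration.  For a block
 * d[0..n-1] starting from sums (a, b):
 *
 *	a' = a + sum(d[i])
 *	b' = b + n * a + sum((n - i) * d[i])
 *
 * With vl <= 256 the per-block sums fit comfortably in 32 bits.
 */
static uint32_t adler32_blocked(const uint8_t *buf, size_t len, size_t vl)
{
	uint32_t a = 1, b = 0;

	/* Outside the assumed range: fall back to the reference loop. */
	if (vl == 0 || vl > 256)
		return adler32_scalar(buf, len);

	while (len > 0) {
		/* A short final block plays the role of a predicated tail. */
		size_t n = len < vl ? len : vl;
		uint32_t sum = 0, wsum = 0;

		for (size_t i = 0; i < n; i++) {
			sum += buf[i];
			wsum += (uint32_t)(n - i) * buf[i];
		}
		b = (b + (uint32_t)n * a + wsum) % ADLER_MOD;
		a = (a + sum) % ADLER_MOD;
		buf += n;
		len -= n;
	}
	return (b << 16) | a;
}
```

The checksum comes out the same for any "vl", which is the sense in
which such code is vector length independent.  What changes with "vl" is
the number of iterations and the size of the tail, which is one place
the performance difference mentioned above could come from.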

> I think it would be preferable to see this algo accelerated for NEON
> first, since all AArch64 hardware can benefit from that.

...

> much of a problem.  kernel_neon_begin() may incur a save of the full SVE
> state anyway, so in some ways it would be a good thing if we could
> actually make use of all those registers.

> SVE hardware remains rare, so as a general policy I don't think we
> should accept SVE implementations of any algorithm that does not
> already have a NEON implementation -- unless the contributor can
> explain why nobody with non-SVE hardware is going to care about the
> performance of that algo.

I tend to agree here.  My concerns are around the cost of maintaining
an SVE implementation relative to the number of users who'd benefit from
it, rather than around the basic idea of using SVE at all.  If we were
seeing substantial performance benefits over an implementation using
NEON, or had some other strong push to use SVE, like a really solid
understanding of why SVE is a good fit for the algorithm but NEON isn't,
then it'd be worth finishing up the infrastructure.  The infrastructure
itself doesn't seem fundamentally problematic.