[PATCH v7 0/3] arm64/sve: Improve performance when handling SVE access traps

Dave Martin Dave.Martin at arm.com
Wed Jul 21 07:33:54 PDT 2021


On Wed, Mar 03, 2021 at 08:11:14PM +0000, Mark Brown wrote:
> This patch series aims to improve the performance of handling SVE access
> traps.  Earlier versions were originally written by Julien Grall, but
> based on discussions of previous versions the patches have been
> substantially reworked to use a different approach.  The patches are now
> different enough that I set myself as the author; hopefully that's OK
> for Julien.
> 
> Per the syscall ABI, SVE registers will be unknown after a syscall.  In
> practice, the kernel will disable SVE and the registers will be zeroed
> (except the first 128 bits of each vector) on the next SVE instruction.
> Currently we do this by saving the FPSIMD state to memory, converting to
> the matching SVE state and then reloading the registers on return to
> userspace.  This requires a lot of memory accesses that we shouldn't
> need.  Improve this by reworking the SVE state tracking so that we track
> whether we should trap on executing SVE instructions separately from
> whether we need to save the full register state.  This allows us to
> avoid tracking the full SVE state until we need to return to userspace,
> and to convert directly in registers in the common case where the FPSIMD
> state is still in registers at that point, reducing overhead in these
> cases.
> 
> As with current mainline we disable SVE on every syscall.  This may not
> be ideal for applications that mix SVE and syscall usage.  Strategies
> such as SH's fpu_counter may perform better, but we need to assess the
> performance on a wider range of systems than are currently available
> before implementing anything; this rework will make that easier.
> 
> It is also possible to optimize the case when the SVE vector length
> is 128 bits (i.e. the same size as the FPSIMD vectors).  This could be
> explored in the future; it becomes a lot easier to do with this
> implementation.
> 
> I need to confirm whether this still needs an update in KVM to handle
> TIF_SVE_FPSIMD_REGS properly.  I'll do that as part of redoing KVM
> testing, but that'll take a little while and I felt it was important to
> get this out for review now.
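
To illustrate the FPSIMD-to-SVE conversion described in the cover letter
above (the first 128 bits of each vector are kept, the remaining bits are
zeroed), here is a minimal user-space sketch in C.  It is not the kernel's
implementation: the struct layouts and the name sketch_fpsimd_to_sve() are
hypothetical, the predicate registers and FFR are ignored, and the
conversion is done in memory, which is the slower path that the series
tries to avoid in the common case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_VREGS     32   /* V0-V31 and Z0-Z31 */
#define SKETCH_FPSIMD_BYTES  16   /* FPSIMD registers are 128 bits wide */
#define SKETCH_MAX_VL_BYTES  256  /* SVE architectural maximum: 2048 bits */

/* Hypothetical in-memory copy of the FPSIMD vector registers. */
struct sketch_fpsimd_state {
	uint8_t v[SKETCH_NUM_VREGS][SKETCH_FPSIMD_BYTES];
};

/* Hypothetical in-memory copy of the SVE vector registers. */
struct sketch_sve_state {
	unsigned int vl;  /* vector length in bytes */
	uint8_t z[SKETCH_NUM_VREGS][SKETCH_MAX_VL_BYTES];
};

/*
 * Build SVE vector state from FPSIMD state: the low 128 bits of each Zn
 * come from Vn and every bit above that is zero.  The vector length vl
 * is in bytes and must be a multiple of 16 in the range 16..256.
 * Returns 0 on success, -1 if vl is invalid.
 */
static int sketch_fpsimd_to_sve(struct sketch_sve_state *dst,
				const struct sketch_fpsimd_state *src,
				unsigned int vl)
{
	unsigned int i;

	assert(dst && src);

	if (vl < SKETCH_FPSIMD_BYTES || vl > SKETCH_MAX_VL_BYTES ||
	    vl % SKETCH_FPSIMD_BYTES != 0)
		return -1;

	/* Zero everything first so the upper bits of each Zn are clear. */
	memset(dst, 0, sizeof(*dst));
	dst->vl = vl;

	/* Copy the shared low 128 bits of every register. */
	for (i = 0; i < SKETCH_NUM_VREGS; i++)
		memcpy(dst->z[i], src->v[i], SKETCH_FPSIMD_BYTES);

	return 0;
}
```

With vl equal to 16 (a 128-bit vector length) the copy leaves nothing to
zero, which is the special case the cover letter mentions as a possible
future optimization.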

Just picking this up:

I think this was a worthwhile experiment, and the approach taken in this
series is reasonable, but my concern is that it doesn't seem to reduce
the amount of code or result in a net simplification.  From my side I
think it's probably best to stick with what we have, until someone comes
up with something that's clearly easier to understand.

So, I'd still favour the version based on Julien's code, which is more
of an incremental change to what we already had (and I think was most of
the way there in your most recent version of it).

Sorry for sending you down a rabbit-hole!

If the maintainers decide they prefer a new approach at some point
though, I'm not going to argue with that.

Cheers
---Dave

[...]
