[PATCH v5 0/2] arm64/sve: First steps towards optimizing syscalls

Mark Brown broonie at kernel.org
Fri Nov 6 14:35:51 EST 2020


This is a first attempt to optimize the syscall path when the user
application uses SVE. The patch series was originally written by Julien
Grall but has been left for a long time. I've updated it to current
kernels and tried to address the pending review feedback that I found
(which was mostly documentation issues).

Per the syscall ABI, SVE registers will be unknown after a syscall. In
practice, the kernel will disable SVE and the registers will be zeroed
(except the first 128 bits of each vector) on the next SVE instruction.
In a workload mixing SVE and syscalls, this results in two entries to
and exits from the kernel per syscall, as we trap on the first SVE
access after the syscall. This series aims to avoid the second
entry/exit by zeroing the SVE registers on syscall return, with a twist
when the task gets rescheduled.
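
To make the entry/exit accounting above concrete, here is a rough
user-space toy model. It is not kernel code: struct task_model, the
function names and the needs_flush field are all made up for this
sketch and only mimic the behaviour described in the paragraph above.
It counts kernel entries for a loop that issues one SVE instruction
followed by one syscall per iteration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one task; none of these names exist in the kernel. */
struct task_model {
	bool sve_enabled;       /* SVE instructions run without trapping */
	bool needs_flush;       /* deferred request to zero the SVE state */
	unsigned long entries;  /* kernel entries: syscalls plus SVE traps */
};

/* User space executes an SVE instruction; it traps if SVE is disabled. */
static void sve_instruction(struct task_model *t)
{
	if (!t->sve_enabled) {
		t->entries++;           /* SVE access trap */
		t->sve_enabled = true;
	}
}

/* Behaviour before the series: the syscall leaves SVE disabled. */
static void syscall_old(struct task_model *t)
{
	t->entries++;
	t->sve_enabled = false;
}

/* Behaviour the series aims for: zero on return, keep SVE enabled. */
static void syscall_new(struct task_model *t)
{
	t->entries++;
	t->needs_flush = true;   /* noted on syscall entry */
	t->needs_flush = false;  /* registers zeroed on return to user space */
}

/* Count kernel entries for `iterations` rounds of "SVE insn, syscall". */
unsigned long kernel_entries(unsigned long iterations, bool optimized)
{
	struct task_model t = { false, false, 0 };
	unsigned long i;

	for (i = 0; i < iterations; i++) {
		sve_instruction(&t);
		if (optimized)
			syscall_new(&t);
		else
			syscall_old(&t);
	}
	assert(!t.needs_flush);  /* no flush is left pending in user space */
	return t.entries;
}
```

In this model the old behaviour costs two kernel entries per iteration,
while the optimized path costs one per iteration plus a single initial
trap the first time SVE is used. The rescheduling twist is not modelled.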

This implementation will have an impact on applications that use SVE
only once. SVE will now stay turned on until the application terminates
(unless it is disabled via ptrace). Cleverer strategies for choosing
between SVE and FPSIMD context switching are possible (see fpu_counter
for SH in mainline, or [1]), but it is difficult to assess the benefit
right now. We could improve the behaviour in the future as a selection
of mature hardware platforms emerges that we can benchmark.

It is also possible to optimize the case when the SVE vector length
is 128 bits (i.e. the same size as the FPSIMD vectors). This could be
explored in the future.
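
As a back-of-the-envelope illustration of why the 128-bit case is
special, the sketch below computes how much SVE state lies outside the
low 128 bits of each vector for a given vector length. The helper names
and the FPSIMD_VREG_BITS macro are invented for this example. The
register counts (32 Z registers, 16 predicate registers plus FFR, each
predicate being one eighth of the vector length) come from the SVE
architecture. How the kernel would actually exploit this is not
something this cover letter specifies.

```c
#include <assert.h>
#include <stddef.h>

#define FPSIMD_VREG_BITS 128u  /* width of an FPSIMD V register */
#define SVE_NUM_ZREGS    32u   /* Z0-Z31 */
#define SVE_NUM_PREGS    16u   /* P0-P15, FFR is counted separately */

/* Valid SVE vector lengths are multiples of 128 bits, up to 2048. */
static void check_vl(unsigned int vl_bits)
{
	assert(vl_bits >= 128u && vl_bits <= 2048u && vl_bits % 128u == 0);
}

/* Bytes of Z register state above the low 128 bits of each vector. */
size_t z_upper_bytes(unsigned int vl_bits)
{
	check_vl(vl_bits);
	return SVE_NUM_ZREGS * ((vl_bits - FPSIMD_VREG_BITS) / 8u);
}

/* Bytes of predicate state (P0-P15 and FFR), VL/64 bytes per register. */
size_t predicate_bytes(unsigned int vl_bits)
{
	check_vl(vl_bits);
	return (SVE_NUM_PREGS + 1u) * (vl_bits / 64u);
}
```

With a 128-bit vector length z_upper_bytes() is zero, so only the small
amount of predicate state has no FPSIMD counterpart.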

If merged, this will need a renumbering of TIF_SVE_NEEDS_FLUSH and other
TIF flags due to a collision with the addition of TIF_NOTIFY_SIGNAL in
-next; this isn't an issue with the current base. The flag is used in an
immediate argument for an and instruction in entry.S, so it needs a low
number. I can provide a patch for this that I've been testing if needed.
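
To illustrate the numbering constraint, here is a small sketch under two
assumptions that this cover letter does not spell out: that the flag
ends up in a combined mask of work flags tested with a single and, and
that the simplest way to keep such a mask encodable as an AArch64
logical immediate is to keep it one unbroken run of set bits. The actual
encoding rules also allow rotated and replicated patterns. The EX_ bit
numbers below are hypothetical and are not the values in thread_info.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit numbers, chosen for illustration only. */
#define EX_TIF_SIGPENDING     0
#define EX_TIF_NEED_RESCHED   1
#define EX_TIF_NOTIFY_RESUME  2

#define EX_BIT(n) (UINT64_C(1) << (n))

/* True if mask consists of exactly one unbroken run of set bits. */
bool is_contiguous_run(uint64_t mask)
{
	uint64_t lowest;

	if (mask == 0)
		return false;
	lowest = mask & (~mask + 1);  /* isolate the lowest set bit */
	/* Adding it carries through the lowest run and clears it. */
	return ((mask + lowest) & mask) == 0;
}

/* Build the example work mask with the flush flag at bit flush_bit. */
uint64_t work_mask(unsigned int flush_bit)
{
	assert(flush_bit < 64);
	return EX_BIT(EX_TIF_SIGPENDING) | EX_BIT(EX_TIF_NEED_RESCHED) |
	       EX_BIT(EX_TIF_NOTIFY_RESUME) | EX_BIT(flush_bit);
}
```

With the flag at bit 3 the example mask stays one run of set bits. With
the flag at bit 20 it does not, which is the kind of breakage a
renumbering would have to avoid under the assumptions above.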

v5:
 - Rebase onto v5.10-rc2.
 - Explicitly support the case where TIF_SVE and TIF_SVE_NEEDS_FLUSH are
   set simultaneously, though this is not currently expected to happen.
 - Extensively revised the documentation for TIF_SVE and
   TIF_SVE_NEEDS_FLUSH to hopefully make things clearer, together with
   the above. I hope this addresses the comments on the prior version,
   but it really needs fresh eyes to tell if that's actually the case.
 - Make comments in ptrace.c more precise.
 - Remove some redundant checks for system_has_sve().
v4:
 - Rebase onto v5.9-rc2
 - Address review comments from Dave Martin, mostly documentation but
   also some refactorings to ensure we don't check capabilities multiple
   times and the addition of some WARN_ONs to make sure assumptions we
   are making about what TIF_ flags can be set when are true.
v3:
 - Rebased to current kernels.
 - Addressed review comments from v2, mostly around tweaks in the

[1] https://git.sphere.ly/dtc/kernel_moto_falcon/commit/acc207616a91a413a50fdd8847a747c4a7324167

Julien Grall (2):
  arm64/sve: Don't disable SVE on syscalls return
  arm64/sve: Rework SVE trap access to use TIF_SVE_NEEDS_FLUSH

 arch/arm64/include/asm/fpsimd.h      |   2 +
 arch/arm64/include/asm/thread_info.h |   6 +-
 arch/arm64/kernel/entry-fpsimd.S     |   5 +
 arch/arm64/kernel/fpsimd.c           | 155 ++++++++++++++++++++-------
 arch/arm64/kernel/process.c          |   1 +
 arch/arm64/kernel/ptrace.c           |  11 ++
 arch/arm64/kernel/signal.c           |  16 ++-
 arch/arm64/kernel/syscall.c          |  13 +--
 8 files changed, 159 insertions(+), 50 deletions(-)


base-commit: 3cea11cd5e3b00d91caf0b4730194039b45c5891
-- 
2.20.1
