[PATCH v5 0/2] arm64/sve: First steps towards optimizing syscalls
Mark Brown
broonie at kernel.org
Fri Nov 6 14:35:51 EST 2020
This is a first attempt to optimize the syscall path when the user
application uses SVE. The patch series was originally written by Julien
Grall but has been left for a long time. I've updated it to current
kernels and tried to address the pending review feedback that I found
(which was mostly documentation issues).
Per the syscall ABI, SVE registers will be unknown after a syscall. In
practice, the kernel will disable SVE and the registers will be zeroed
(except the first 128 bits of each vector) on the next SVE instruction.
In a workload mixing SVE and syscalls, this results in two kernel
entries/exits per syscall, as we trap on the first SVE access after the
syscall. This series aims to avoid the second entry/exit by zeroing the
SVE registers on syscall return with a twist when the task will get
rescheduled.
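To make the cost concrete, here is a toy userspace model of the two
behaviours described above. It is only a sketch: the struct, its fields
and the function names are made up for illustration and are not the code
in this series, and it deliberately ignores the rescheduling case.

```c
#include <stdbool.h>

/*
 * Toy model, not kernel code. The two booleans loosely stand in for the
 * TIF_SVE and TIF_SVE_NEEDS_FLUSH thread flags named in this series.
 */
struct task_model {
	bool sve_enabled;		/* stands in for TIF_SVE */
	bool sve_needs_flush;		/* stands in for TIF_SVE_NEEDS_FLUSH */
	unsigned long kernel_entries;	/* syscalls plus SVE access traps */
};

/* Existing behaviour: the syscall disables SVE, so the next use traps. */
static void syscall_disable_sve(struct task_model *t)
{
	t->kernel_entries++;
	t->sve_enabled = false;
}

/* Proposed behaviour: leave SVE enabled and flush on syscall return. */
static void syscall_flush_on_return(struct task_model *t)
{
	t->kernel_entries++;
	if (t->sve_enabled)
		t->sve_needs_flush = true;

	/* Return to userspace: the non-FPSIMD SVE state would be zeroed here. */
	if (t->sve_needs_flush)
		t->sve_needs_flush = false;
}

/* Userspace executes an SVE instruction; it traps if SVE is disabled. */
static void sve_instruction(struct task_model *t)
{
	if (!t->sve_enabled) {
		t->kernel_entries++;	/* SVE access trap */
		t->sve_enabled = true;
	}
}

/* Count kernel entries for a loop of "syscall, then SVE instruction". */
static unsigned long count_kernel_entries(void (*do_syscall)(struct task_model *),
					  unsigned long iterations)
{
	struct task_model t = { .sve_enabled = false };
	unsigned long i;

	sve_instruction(&t);	/* first ever SVE use always traps */
	t.kernel_entries = 0;	/* only count the steady state */

	for (i = 0; i < iterations; i++) {
		do_syscall(&t);
		sve_instruction(&t);
	}
	return t.kernel_entries;
}
```

In this model, count_kernel_entries(syscall_disable_sve, n) counts two
entries per iteration and count_kernel_entries(syscall_flush_on_return, n)
counts one, which is the saving the series is after.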
This implementation will have an impact on applications that use SVE
only once: SVE will now stay enabled until the application terminates
(unless it is disabled via ptrace). Cleverer strategies for choosing
between SVE and FPSIMD context switching are possible (see fpu_counter
for SH in mainline, or [1]), but it is difficult to assess the benefit
right now. We could improve the behaviour in the future as a selection
of mature hardware platforms emerges that we can benchmark.
It is also possible to optimize the case when the SVE vector length
is 128 bits (i.e. the same size as the FPSIMD vectors). This could be
explored in the future.
If merged, this will need a renumbering of TIF_SVE_NEEDS_FLUSH and other
TIF flags due to a collision with the addition of TIF_NOTIFY_SIGNAL in
-next; this isn't an issue with the current base. The flag is used in an
immediate argument for an and instruction in entry.S, so it needs a low
number. I can provide a patch for this that I've been testing if needed.
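One plausible reading of the "low number" constraint (an assumption on
my part, not something stated above) is that the flag ends up in a mask
that has to be encodable as an AArch64 logical immediate, which only
allows a rotated run of contiguous set bits replicated across 2, 4, 8,
16, 32 or 64-bit elements. The helper below is a standalone sketch of
that encodability check; the function names and the example masks are
illustrative and are not taken from thread_info.h or entry.S.

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones mask covering the low `width` bits (width is 2..64). */
static uint64_t low_mask(unsigned int width)
{
	return width == 64 ? ~0ULL : (1ULL << width) - 1;
}

/* Rotate the low `width` bits of x right by one position. */
static uint64_t ror1(uint64_t x, unsigned int width)
{
	return ((x >> 1) | (x << (width - 1))) & low_mask(width);
}

/* Portable population count. */
static unsigned int popcount64(uint64_t x)
{
	unsigned int n = 0;

	for (; x; x &= x - 1)
		n++;
	return n;
}

/*
 * Return true if v can be encoded as a 64-bit AArch64 logical immediate,
 * i.e. usable directly in "and xN, xM, #imm".
 */
static bool is_logical_imm64(uint64_t v)
{
	unsigned int width, shift;

	if (v == 0 || v == ~0ULL)
		return false;	/* all-zeros and all-ones are not encodable */

	for (width = 2; width <= 64; width *= 2) {
		uint64_t elem = v & low_mask(width);
		bool replicated = true;

		/* v must be the same element repeated across all 64 bits. */
		for (shift = width; shift < 64; shift += width) {
			if (((v >> shift) & low_mask(width)) != elem) {
				replicated = false;
				break;
			}
		}
		if (!replicated)
			continue;

		/*
		 * A rotated run of ones has exactly two cyclic bit
		 * transitions: one 0->1 edge and one 1->0 edge.
		 */
		if (popcount64(elem ^ ror1(elem, width)) == 2)
			return true;
	}
	return false;
}
```

With this check, a mask covering hypothetical flag bits 0 to 6 (0x7f) is
encodable, while one covering bits 0 to 5 plus bit 8 (0x13f) is not and
would need the constant loaded into a register first.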
v5:
- Rebase onto v5.10-rc2.
- Explicitly support the case where TIF_SVE and TIF_SVE_NEEDS_FLUSH are
set simultaneously, though this is not currently expected to happen.
- Extensively revised the documentation for TIF_SVE and
TIF_SVE_NEEDS_FLUSH to hopefully make things clearer, together with
the above. I hope this addresses the comments on the prior version,
but it really needs fresh eyes to tell if that's actually the case.
- Make comments in ptrace.c more precise.
- Remove some redundant checks for system_has_sve().
v4:
- Rebase onto v5.9-rc2
- Address review comments from Dave Martin, mostly documentation but
also some refactorings to ensure we don't check capabilities multiple
times, and the addition of some WARN_ONs to verify our assumptions
about which TIF_ flags can be set at which points.
v3:
- Rebased to current kernels.
- Addressed review comments from v2, mostly around tweaks in the
[1] https://git.sphere.ly/dtc/kernel_moto_falcon/commit/acc207616a91a413a50fdd8847a747c4a7324167
Julien Grall (2):
  arm64/sve: Don't disable SVE on syscalls return
  arm64/sve: Rework SVE trap access to use TIF_SVE_NEEDS_FLUSH
 arch/arm64/include/asm/fpsimd.h      |   2 +
 arch/arm64/include/asm/thread_info.h |   6 +-
 arch/arm64/kernel/entry-fpsimd.S     |   5 +
 arch/arm64/kernel/fpsimd.c           | 155 ++++++++++++++++++++-------
 arch/arm64/kernel/process.c          |   1 +
 arch/arm64/kernel/ptrace.c           |  11 ++
 arch/arm64/kernel/signal.c           |  16 ++-
 arch/arm64/kernel/syscall.c          |  13 +--
 8 files changed, 159 insertions(+), 50 deletions(-)
base-commit: 3cea11cd5e3b00d91caf0b4730194039b45c5891
--
2.20.1