[PATCH v3 0/5] arm64: vdso: getcpu() support

Mark Brown broonie at kernel.org
Wed Aug 19 08:13:13 EDT 2020


Some applications, especially tracing ones, benefit from avoiding the
syscall overhead for getcpu() so it is common for architectures to have
vDSO implementations. Add one for arm64, using TPIDRRO_EL0 to pass a
pointer to per-CPU data rather than just store the immediate value in
order to allow for future extensibility.

It is questionable if something TPIDRRO_EL0 based is worthwhile at all
on current kernels, since v4.18 we have had support for restartable
sequences which can be used to provide a sched_getcpu() implementation
with generally better performance than the vDSO approach on
architectures which have that[1]. Work is ongoing to implement this for
glibc:

    https://lore.kernel.org/lkml/20200527185130.5604-3-mathieu.desnoyers@efficios.com/

but is not yet merged and will need similar work for other userspaces.
The main advantages for the vDSO implementation are the node parameter
(though this is a static mapping to CPU number so could be looked up
separately when processing data if it's needed, it shouldn't need to be
in the hot path) and ease of implementation for users.

This is currently not compatible with KPTI due to the use of TPIDRRO_EL0
by the KPTI trampoline, this could be addressed by reinitializing that
system register in the return path but I have found it hard to justify
adding that overhead for all users for something that is essentially a
profiling optimization which is likely to get superceeded by a more
modern implementation - if there are other uses for the per-CPU data
then the balance might change here.

This builds on work done by Kristina Martsenko some time ago but is a
new implementation.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7822b1e24f2df5df98c76f0e94a5416349ff759

v3:
 - Rebase on v5.9-rc1.
 - Drop in progress portions of the series.
v2:
 - Rebase on v5.8-rc3.
 - Add further cleanup patches & a first draft of multi-page support.

Mark Brown (5):
  arm64: vdso: Provide a define when building the vDSO
  arm64: vdso: Add per-CPU data
  arm64: vdso: Initialise the per-CPU vDSO data
  arm64: vdso: Add getcpu() implementation
  selftests: vdso: Support arm64 in getcpu() test

 arch/arm64/include/asm/processor.h            | 12 +----
 arch/arm64/include/asm/vdso/datapage.h        | 54 +++++++++++++++++++
 arch/arm64/kernel/process.c                   | 26 ++++++++-
 arch/arm64/kernel/vdso.c                      | 33 +++++++++++-
 arch/arm64/kernel/vdso/Makefile               |  4 +-
 arch/arm64/kernel/vdso/vdso.lds.S             |  1 +
 arch/arm64/kernel/vdso/vgetcpu.c              | 48 +++++++++++++++++
 .../testing/selftests/vDSO/vdso_test_getcpu.c | 10 ++++
 8 files changed, 172 insertions(+), 16 deletions(-)
 create mode 100644 arch/arm64/include/asm/vdso/datapage.h
 create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c

-- 
2.20.1




More information about the linux-arm-kernel mailing list