[RFC/RFT PATCH 0/6] Improve get_random_u8() for use in randomize kstack

Ard Biesheuvel ardb+git at google.com
Thu Nov 27 01:22:27 PST 2025


From: Ard Biesheuvel <ardb at kernel.org>

Ryan reports that get_random_u16() is dominant in the performance
profiling of syscall entry when kstack randomization is enabled [0].

This is the reason many architectures rely on a counter instead, and
that, in turn, is the reason for the convoluted way the (pseudo-)entropy
is gathered and recorded in a per-CPU variable.

Let's try to make the get_random_uXX() fast path faster, and switch to
get_random_u8() so that we'll hit the slow path 2x less often. Then,
wire it up in the syscall entry path, replacing the per-CPU variable,
making the logic at syscall exit redundant.
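
To illustrate the "2x less often" arithmetic, here is a minimal user-space
sketch. It is not the code in drivers/char/random.c. It assumes each batch
holds a fixed number of bytes regardless of the width drawn from it, so a
batch serves twice as many u8 draws as u16 draws before the slow-path
refill runs. BATCH_BYTES, struct batch, toy_get_u8(), toy_get_u16() and
refills_for() are hypothetical names for this sketch, and the xorshift
generator is a non-cryptographic stand-in for the real refill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_BYTES 64 /* hypothetical batch size, chosen for illustration */

struct batch {
	uint8_t  buf[BATCH_BYTES];
	size_t   pos;      /* next unread byte; BATCH_BYTES means "empty" */
	unsigned refills;  /* how many times the slow path has run */
	uint32_t state;    /* toy generator state, NOT cryptographic */
};

static void batch_init(struct batch *b, uint32_t seed)
{
	assert(b != NULL);
	b->pos = BATCH_BYTES;          /* start empty, first draw refills */
	b->refills = 0;
	b->state = seed ? seed : 1;    /* xorshift32 must not start at 0 */
}

/* Slow path: regenerate the whole batch (xorshift32 as a stand-in). */
static void batch_refill(struct batch *b)
{
	for (size_t i = 0; i < BATCH_BYTES; i++) {
		b->state ^= b->state << 13;
		b->state ^= b->state >> 17;
		b->state ^= b->state << 5;
		b->buf[i] = (uint8_t)(b->state >> 24);
	}
	b->pos = 0;
	b->refills++;
}

/* Fast path consumes one byte per draw. */
static uint8_t toy_get_u8(struct batch *b)
{
	if (b->pos + 1 > BATCH_BYTES)
		batch_refill(b);
	return b->buf[b->pos++];
}

/* Fast path consumes two bytes per draw. */
static uint16_t toy_get_u16(struct batch *b)
{
	uint16_t v;

	if (b->pos + 2 > BATCH_BYTES)
		batch_refill(b);
	v = (uint16_t)(b->buf[b->pos] | (b->buf[b->pos + 1] << 8));
	b->pos += 2;
	return v;
}

/* Count slow-path refills needed to serve 'draws' values of one width. */
static unsigned refills_for(unsigned draws, int use_u16)
{
	struct batch b;

	batch_init(&b, 0x12345678u);
	for (unsigned i = 0; i < draws; i++) {
		if (use_u16)
			(void)toy_get_u16(&b);
		else
			(void)toy_get_u8(&b);
	}
	return b.refills;
}
```

With these assumed sizes, refills_for(1024, 1) takes the slow path twice as
often as refills_for(1024, 0) for the same number of draws.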

[0] https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/

Cc: Kees Cook <kees at kernel.org>
Cc: Ryan Roberts <ryan.roberts at arm.com>
Cc: Will Deacon <will at kernel.org>
Cc: Arnd Bergmann <arnd at arndb.de>
Cc: Jeremy Linton <jeremy.linton at arm.com>
Cc: Catalin Marinas <Catalin.Marinas at arm.com>
Cc: Mark Rutland <mark.rutland at arm.com>
Cc: Jason A. Donenfeld <Jason at zx2c4.com>

Ard Biesheuvel (6):
  hexagon: Wire up cmpxchg64_local() to generic implementation
  arc: Wire up cmpxchg64_local() to generic implementation
  random: Use u32 to keep track of batched entropy generation
  random: Use a lockless fast path for get_random_uXX()
  random: Plug race in preceding patch
  randomize_kstack: Use get_random_u8() at entry for entropy

 arch/Kconfig                       |  9 ++--
 arch/arc/include/asm/cmpxchg.h     |  3 ++
 arch/hexagon/include/asm/cmpxchg.h |  4 ++
 drivers/char/random.c              | 49 ++++++++++++++------
 include/linux/randomize_kstack.h   | 36 ++------------
 init/main.c                        |  1 -
 6 files changed, 49 insertions(+), 53 deletions(-)


base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
-- 
2.52.0.107.ga0afd4fd5b-goog



