[PATCH] randomize_kstack: Remove non-functional per-arch entropy filtering

liuyuntao (F) liuyuntao12 at huawei.com
Wed Jun 19 20:47:58 PDT 2024



On 2024/6/20 5:47, Kees Cook wrote:
> An unintended consequence of commit 9c573cd31343 ("randomize_kstack:
> Improve entropy diffusion") was that the per-architecture entropy size
> filtering reduced how many bits were being added to the mix, rather than
> how many bits were being used during the offsetting. All architectures
> fell back to the existing default of 0x3FF (10 bits), which will consume
> at most 1KiB of stack space. It seems that this is working just fine,
> so let's avoid the confusion and update everything to use the default.
>

My original intent was indeed to do this, but I regret not being more
explicit in the commit log.

Additionally, I've tested the stack entropy with the following patch
applied, and the result was still `Bits of stack entropy: 7` on arm64.
The change does not seem to affect the entropy value, so maybe removing
it is OK, or there may be some nuance of your intent that I have
overlooked.

--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -79,9 +79,7 @@ DECLARE_PER_CPU(u32, kstack_offset);
  #define choose_random_kstack_offset(rand) do {                         \
         if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
                                 &randomize_kstack_offset)) {            \
-               u32 offset = raw_cpu_read(kstack_offset);               \
-               offset = ror32(offset, 5) ^ (rand);                     \
-               raw_cpu_write(kstack_offset, offset);                   \
+               raw_cpu_write(kstack_offset, rand);                     \
         }                                                               \
  } while (0)
  #else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
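
To make the comparison concrete, here is a rough userspace sketch of the
two variants of the update step. Only the `ror32(offset, 5) ^ (rand)`
expression and the 0x3FF default come from the code and commit log quoted
here; the helper names (mix_offset, replace_offset, effective_offset) and
KSTACK_OFFSET_MASK are made up for illustration and are not kernel
symbols.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's ror32(): rotate a 32-bit word right. */
static uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* Hypothetical name for the 10-bit default (0x3FF) from the commit log. */
#define KSTACK_OFFSET_MASK 0x3FFu

/* Current behaviour: rotate the stored state and XOR the new sample in. */
static uint32_t mix_offset(uint32_t state, uint32_t rand)
{
	return ror32(state, 5) ^ rand;
}

/* Behaviour with the test patch: the new sample simply replaces the state. */
static uint32_t replace_offset(uint32_t state, uint32_t rand)
{
	(void)state;	/* previous state is discarded */
	return rand;
}

/* Only the low 10 bits of the state are used when the stack is offset. */
static uint32_t effective_offset(uint32_t state)
{
	return state & KSTACK_OFFSET_MASK;
}
```

In both variants the consumer only sees the low 10 bits, e.g.
effective_offset(mix_offset(0, 0xFFFF)) and
effective_offset(replace_offset(0, 0xFFFF)) both give 0x3FF, so an
arch-side mask on `rand` narrows what goes into the state rather than
what is used for the offset.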

> The prior intent of the per-architecture limits were:
> 
>    arm64: capped at 0x1FF (9 bits), 5 bits effective
>    powerpc: uncapped (10 bits), 6 or 7 bits effective
>    riscv: uncapped (10 bits), 6 bits effective
>    x86: capped at 0xFF (8 bits), 5 (x86_64) or 6 (ia32) bits effective
>    s390: capped at 0xFF (8 bits), undocumented effective entropy
> 
> Current discussion has led to just dropping the original per-architecture
> filters. The additional entropy appears to be safe for arm64, x86,
> and s390. Quoting Arnd, "There is no point pretending that 15.75KB is
> somehow safe to use while 15.00KB is not."
> 
> Co-developed-by: Yuntao Liu <liuyuntao12 at huawei.com>
> Signed-off-by: Yuntao Liu <liuyuntao12 at huawei.com>
> Fixes: 9c573cd31343 ("randomize_kstack: Improve entropy diffusion")
> Link: https://lore.kernel.org/r/20240617133721.377540-1-liuyuntao12@huawei.com
> Signed-off-by: Kees Cook <kees at kernel.org>
> ---
> Cc: Arnd Bergmann <arnd at arndb.de>
> Cc: Mark Rutland <mark.rutland at arm.com>
> ---
>   arch/arm64/kernel/syscall.c          | 16 +++++++---------
>   arch/s390/include/asm/entry-common.h |  2 +-
>   arch/x86/include/asm/entry-common.h  | 15 ++++++---------
>   3 files changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index ad198262b981..7230f6e20ab8 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -53,17 +53,15 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>   	syscall_set_return_value(current, regs, 0, ret);
>   
>   	/*
> -	 * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
> -	 * but not enough for arm64 stack utilization comfort. To keep
> -	 * reasonable stack head room, reduce the maximum offset to 9 bits.
> +	 * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
> +	 * bits. The actual entropy will be further reduced by the compiler
> +	 * when applying stack alignment constraints: the AAPCS mandates a
> +	 * 16-byte aligned SP at function boundaries, which will remove the
> +	 * 4 low bits from any entropy chosen here.
>   	 *
> -	 * The actual entropy will be further reduced by the compiler when
> -	 * applying stack alignment constraints: the AAPCS mandates a
> -	 * 16-byte (i.e. 4-bit) aligned SP at function boundaries.
> -	 *
> -	 * The resulting 5 bits of entropy is seen in SP[8:4].
> +	 * The resulting 6 bits of entropy is seen in SP[9:4].
>   	 */
> -	choose_random_kstack_offset(get_random_u16() & 0x1FF);
> +	choose_random_kstack_offset(get_random_u16());
>   }
>   
>   static inline bool has_syscall_work(unsigned long flags)
> diff --git a/arch/s390/include/asm/entry-common.h b/arch/s390/include/asm/entry-common.h
> index 7f5004065e8a..35555c944630 100644
> --- a/arch/s390/include/asm/entry-common.h
> +++ b/arch/s390/include/asm/entry-common.h
> @@ -54,7 +54,7 @@ static __always_inline void arch_exit_to_user_mode(void)
>   static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
>   						  unsigned long ti_work)
>   {
> -	choose_random_kstack_offset(get_tod_clock_fast() & 0xff);
> +	choose_random_kstack_offset(get_tod_clock_fast());
>   }
>   
>   #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
> diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
> index 7e523bb3d2d3..fb2809b20b0a 100644
> --- a/arch/x86/include/asm/entry-common.h
> +++ b/arch/x86/include/asm/entry-common.h
> @@ -73,19 +73,16 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
>   #endif
>   
>   	/*
> -	 * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
> -	 * but not enough for x86 stack utilization comfort. To keep
> -	 * reasonable stack head room, reduce the maximum offset to 8 bits.
> -	 *
> -	 * The actual entropy will be further reduced by the compiler when
> -	 * applying stack alignment constraints (see cc_stack_align4/8 in
> +	 * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
> +	 * bits. The actual entropy will be further reduced by the compiler
> +	 * when applying stack alignment constraints (see cc_stack_align4/8 in
>   	 * arch/x86/Makefile), which will remove the 3 (x86_64) or 2 (ia32)
>   	 * low bits from any entropy chosen here.
>   	 *
> -	 * Therefore, final stack offset entropy will be 5 (x86_64) or
> -	 * 6 (ia32) bits.
> +	 * Therefore, final stack offset entropy will be 7 (x86_64) or
> +	 * 8 (ia32) bits.
>   	 */
> -	choose_random_kstack_offset(rdtsc() & 0xFF);
> +	choose_random_kstack_offset(rdtsc());
>   }
>   #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
>   
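
The bit counts in the updated comments follow from simple arithmetic:
the 10-bit cap minus the low bits cleared by stack alignment. A small
sketch of that calculation, assuming the alignment is a power of two
(the function names are hypothetical, chosen only for this example):

```c
#include <stdint.h>

/* Number of low bits cleared by a power-of-two alignment in bytes. */
static unsigned int align_bits(uint32_t align_bytes)
{
	unsigned int bits = 0;

	while (align_bytes > 1) {
		align_bytes >>= 1;
		bits++;
	}
	return bits;
}

/* Offset bits left after the compiler applies stack alignment. */
static unsigned int effective_entropy_bits(unsigned int mask_bits,
					   uint32_t align_bytes)
{
	unsigned int lost = align_bits(align_bytes);

	return mask_bits > lost ? mask_bits - lost : 0;
}
```

With the 10-bit default this gives 6 bits for 16-byte alignment (arm64),
7 bits for 8-byte alignment (x86_64) and 8 bits for 4-byte alignment
(ia32), matching the new comments; with the old caps of 9 and 8 bits it
gives the previously documented 5, 5 and 6.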


