[PATCH v2] kselftest/arm64: Exit streaming mode after collecting signal context

Will Deacon will at kernel.org
Thu Jul 20 03:31:13 PDT 2023


Hi Mark,

On Wed, Jul 12, 2023 at 12:02:30PM +0100, Mark Brown wrote:
> When we collect a signal context with one of the SME modes enabled we will
> have enabled that mode behind the compiler and libc's back so they may
> issue some instructions not valid in streaming mode, causing spurious
> failures.
> 
> For the code prior to issuing the BRK to trigger signal handling we need to
> stay in streaming mode if we were already there since that's a part of the
> signal context the caller is trying to collect. Unfortunately this code
> includes a memset() which is likely to be heavily optimised and is likely
> to use FP instructions incompatible with streaming mode. We can avoid this
> happening by open coding the memset(), inserting a volatile assembly
> statement to avoid the compiler recognising what's being done and doing
> something in optimisation. This code is not performance critical so the
> inefficiency should not be an issue.
> 
> After collecting the context we can simply exit streaming mode, avoiding
> these issues. Use a full SMSTOP for safety to prevent any issues appearing
> with ZA.

Thanks for looking at this. Comments inline.

> diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.h b/tools/testing/selftests/arm64/signal/test_signals_utils.h
> index 222093f51b67..db28409fd44b 100644
> --- a/tools/testing/selftests/arm64/signal/test_signals_utils.h
> +++ b/tools/testing/selftests/arm64/signal/test_signals_utils.h
> @@ -60,13 +60,28 @@ static __always_inline bool get_current_context(struct tdescr *td,
>  						size_t dest_sz)
>  {
>  	static volatile bool seen_already;
> +	int i;
> +	char *uc = (char *)dest_uc;
>  
>  	assert(td && dest_uc);
>  	/* it's a genuine invocation..reinit */
>  	seen_already = 0;
>  	td->live_uc_valid = 0;
>  	td->live_sz = dest_sz;
> -	memset(dest_uc, 0x00, td->live_sz);
> +
> +	/*
> +	 * This is a memset() but we don't want the compiler to
> +	 * optimise it into either instructions or a library call
> +	 * which might be incompatible with streaming mode.
> +	 */
> +	for (i = 0; i < td->live_sz; i++) {
> +		asm volatile("nop"
> +			     : "+m" (*dest_uc)
> +			     :
> +			     : "memory");

I don't think it's save to use "+m" here, since the compiler can assume
that the address is used exactly once in the asm. If a post-indexed
addressing mode is generated, then you can end up with register corruption.

Stepping back, why not use either barrier() or OPTIMIZER_HIDE_VAR()
instead?

The most robust fix would be to write all of the streaming mode code in
asm, but I can appreciate that's a tonne of work for a testcase.

Will



More information about the linux-arm-kernel mailing list