[PATCH] arm64: abort SDEI handlers during crash

Fri Feb 10 10:19:44 PST 2023

Hi Scott,

On 04/02/2023 00:08, D Scott Phillips wrote:
> Interrupts are blocked in SDEI context, per the SDEI spec: "The client
> interrupts cannot preempt the event handler."

Firmware is supposed to do this by ensuring PSTATE.I is set when the handler runs.
See "4.3.1 Arm PE Architecture".

Unfortunately trusted-firmware is setting PMR, (which is specific to GIC, not the 'PE
architecture') to a value in the secure range, meaning no normal world interrupt can fire.

This is a bug in trusted firmware, but it looks like its firmly ingrained in their
'exception handling framework', and there is no appetite for fixing it.

So we'll have to hack round it in the kernel.

> If we crashed in the SDEI
> handler-running context (as with ACPI's AGDI) then we need to clean up the
> SDEI state before proceeding to the crash kernel so that the crash kernel
> can have working interrupts.  Try two COMPLETE_AND_RESUMEs in case both a
> normal and critical event were being handled.

There are multiple reasons the kernel might panic(), doing this in crash_smp_send_stop()
is a good call.

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 11cb99c4d298..03dc233bdaa1 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -909,13 +909,18 @@ NOKPROBE(call_on_irq_stack)
>  #include <asm/sdei.h>
>  #include <uapi/linux/arm_sdei.h>
>  
> -.macro sdei_handler_exit exit_mode
> -	/* On success, this call never returns... */
> +.macro sdei_handler_exit_fallthrough exit_mode
>  	cmp	\exit_mode, #SDEI_EXIT_SMC
>  	b.ne	99f
>  	smc	#0
> -	b	.
> +	b	100f
>  99:	hvc	#0
> +100:
> +.endm

This was a tangled mess before, but now that it can fallthrough we should try harder.
Could you look at smccc_patch_fw_mitigation_conduit for an example of how this can be done
cleanly. That would allow sdei_exit_mode to be removed.
(if that makes sense to do, please do it as a preparatory patch)

> @@ -1077,4 +1082,17 @@ alternative_else_nop_endif

> +SYM_CODE_START(sdei_handler_abort)
> +	mov_q	x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
> +	adr	x1, 1f
> +	ldr_l	x2, sdei_exit_mode
> +	sdei_handler_exit_fallthrough exit_mode=x2
> +	// either fallthrough if not in handler context, or exit the handler
> +	// and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
> +	// ELR_ELx, and SPSR_ELx.
> +1:	ret
> +SYM_CODE_END(sdei_handler_abort)
> +NOKPROBE(sdei_handler_abort)
> +
>  #endif /* CONFIG_ARM_SDE_INTERFACE */

> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index ffc5d76cf695..bc1b3000197e 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -1047,10 +1047,8 @@ void crash_smp_send_stop(void)
>  	 * If this cpu is the only one alive at this point in time, online or
>  	 * not, there are no stop messages to be sent around, so just back out.
>  	 */
> -	if (num_other_online_cpus() == 0) {
> -		sdei_mask_local_cpu();
> -		return;
> -	}
> +	if (num_other_online_cpus() == 0)
> +		goto skip_ipi;
>  
>  	cpumask_copy(&mask, cpu_online_mask);
>  	cpumask_clear_cpu(smp_processor_id(), &mask);
> @@ -1069,7 +1067,15 @@ void crash_smp_send_stop(void)
>  		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>  			cpumask_pr_args(&mask));
>  
> +skip_ipi:
>  	sdei_mask_local_cpu();
> +	/*
> +	 * The crash may have happened in a critical event handler which
> +	 * preempted a normal handler. So at most we might have two
> +	 * levels of SDEI context to exit.
> +	 */
> +	sdei_handler_abort();
> +	sdei_handler_abort();

And if SDEI wasn't supported? Before SMC-CC you couldn't probe for firmware calls, you had
to know they were implemented. Its entirely possible there are platforms out there that
corrupt more than x0-x17 when you do this.
(also, what happens on machines where the kernel runs at EL2, and there is no EL3?)

You can tell if the kernel is in the 'middle' of an SDEI event based on the stack.

As this is a firmware bug, I'd like to print a warning. Something has gone wrong already,
otherwise we wouldn't be panic()ing, I think its important to identify this extra code is
running - in case it leads to more problems. (and not run it on platforms that aren't
affected).

Could we check PMR_EL1 to see if its got a value outside the kernel's PMR range before
triggering any of this. This would also let you know if you need to do this twice.

Thanks,

James