[PATCH] arm64: abort SDEI handlers during crash
D Scott Phillips
scott at os.amperecomputing.com
Fri Feb 10 14:16:07 PST 2023
James Morse <james.morse at arm.com> writes:
> Hi Scott,
>
> On 04/02/2023 00:08, D Scott Phillips wrote:
>> Interrupts are blocked in SDEI context, per the SDEI spec: "The client
>> interrupts cannot preempt the event handler."
>
> Firmware is supposed to do this by ensuring PSTATE.I is set when the
> handler runs. See "4.3.1 Arm PE Architecture".
>
> Unfortunately trusted-firmware is setting PMR, (which is specific to
> GIC, not the 'PE architecture') to a value in the secure range,
> meaning no normal world interrupt can fire.
Hi James, in the case I see on my machine, the cause isn't an elevated
PMR, but rather an eleveated RPR. IRQs are blocked because the interrupt
that triggered the SDEI event is still active and hasn't dropped
priority on the core that's running the SDEI event handler. Specifically
see ICC_PMR_EL1 == 0xf8 (all priority bits set), but ICC_RPR_EL1 ==
0x70.
The spec does say that PSTATE.I will be set in the handler context, and
also that you "shouldn't" clear PSTATE.I while in the handler, but I
don't read that section or others in the spec as requiring firmware to
maintain the running priority, priority mask, etc such that just
clearing PSTATE.I gets you back to a working system.
Perhaps I'm reading into the "5.2 Event context" bit that says "The
client interrupts cannot preempt the event handler" more than I should,
but I'm taking that to mean, "don't count on working interrupts until
you end the handler"
In my AGDI case, the irq related to the event is entirely owned by
firmware, but in the case of irqs bound to SDEI events, the spec says
that the firmware will activate the interrupt before calling the event
handler and end the interrupt after. It's not explicit about dropping
priority, but it is explicit that the interrupt is meant to be active
during the sdei handler. So I'm not sure what the firmware is doing with
its AGDI interrupt is totally wrong here.
> This is a bug in trusted firmware, but it looks like its firmly
> ingrained in their 'exception handling framework', and there is no
> appetite for fixing it.
>
> So we'll have to hack round it in the kernel.
>
>
>> If we crashed in the SDEI
>> handler-running context (as with ACPI's AGDI) then we need to clean up the
>> SDEI state before proceeding to the crash kernel so that the crash kernel
>> can have working interrupts. Try two COMPLETE_AND_RESUMEs in case both a
>> normal and critical event were being handled.
>
> There are multiple reasons the kernel might panic(), doing this in crash_smp_send_stop()
> is a good call.
>
>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 11cb99c4d298..03dc233bdaa1 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -909,13 +909,18 @@ NOKPROBE(call_on_irq_stack)
>> #include <asm/sdei.h>
>> #include <uapi/linux/arm_sdei.h>
>>
>> -.macro sdei_handler_exit exit_mode
>> - /* On success, this call never returns... */
>> +.macro sdei_handler_exit_fallthrough exit_mode
>> cmp \exit_mode, #SDEI_EXIT_SMC
>> b.ne 99f
>> smc #0
>> - b .
>> + b 100f
>> 99: hvc #0
>> +100:
>> +.endm
>
> This was a tangled mess before, but now that it can fallthrough we should try harder.
> Could you look at smccc_patch_fw_mitigation_conduit for an example of how this can be done
> cleanly. That would allow sdei_exit_mode to be removed.
> (if that makes sense to do, please do it as a preparatory patch)
OK, will do.
>
>> @@ -1077,4 +1082,17 @@ alternative_else_nop_endif
>
>> +SYM_CODE_START(sdei_handler_abort)
>> + mov_q x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
>> + adr x1, 1f
>> + ldr_l x2, sdei_exit_mode
>> + sdei_handler_exit_fallthrough exit_mode=x2
>> + // either fallthrough if not in handler context, or exit the handler
>> + // and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
>> + // ELR_ELx, and SPSR_ELx.
>> +1: ret
>> +SYM_CODE_END(sdei_handler_abort)
>> +NOKPROBE(sdei_handler_abort)
>> +
>> #endif /* CONFIG_ARM_SDE_INTERFACE */
>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index ffc5d76cf695..bc1b3000197e 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -1047,10 +1047,8 @@ void crash_smp_send_stop(void)
>> * If this cpu is the only one alive at this point in time, online or
>> * not, there are no stop messages to be sent around, so just back out.
>> */
>> - if (num_other_online_cpus() == 0) {
>> - sdei_mask_local_cpu();
>> - return;
>> - }
>> + if (num_other_online_cpus() == 0)
>> + goto skip_ipi;
>>
>> cpumask_copy(&mask, cpu_online_mask);
>> cpumask_clear_cpu(smp_processor_id(), &mask);
>> @@ -1069,7 +1067,15 @@ void crash_smp_send_stop(void)
>> pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>> cpumask_pr_args(&mask));
>>
>> +skip_ipi:
>> sdei_mask_local_cpu();
>> + /*
>> + * The crash may have happened in a critical event handler which
>> + * preempted a normal handler. So at most we might have two
>> + * levels of SDEI context to exit.
>> + */
>> + sdei_handler_abort();
>> + sdei_handler_abort();
>
> And if SDEI wasn't supported? Before SMC-CC you couldn't probe for firmware calls, you had
> to know they were implemented. Its entirely possible there are platforms out there that
> corrupt more than x0-x17 when you do this.
> (also, what happens on machines where the kernel runs at EL2, and there is no EL3?)
>
> You can tell if the kernel is in the 'middle' of an SDEI event based on the stack.
Ah, all good points. The case I was worried about was an SDEI normal
event that itself got preempted by a critical event before it switched
sp to the sdei stack, or after it had switched back but hadn't ended,
but you're right, "just try and see" is a really a non-solution.
How about:
if (sp within sdei_shadow_call_stack_normal ||
sp within sdei_shadow_call_stack_critical)
sdei_handler_abort()
if (sp within sdei_shadow_call_stack_critical &&
(interrupted_regs.sp within sdei_shadow_call_stack_normal ||
interrupted_regs.pc within __sdei_asm_handler))
sdei_handler_abort()
> As this is a firmware bug, I'd like to print a warning. Something has
> gone wrong already, otherwise we wouldn't be panic()ing, I think its
> important to identify this extra code is running - in case it leads to
> more problems. (and not run it on platforms that aren't affected).
Yep, good idea, will do.
> Could we check PMR_EL1 to see if its got a value outside the kernel's
> PMR range before triggering any of this. This would also let you know
> if you need to do this twice.
PMR and RPR? What if the firmware fiddled with interrupts some other
way? Group enables? Maybe that's too paranoid and it's better to just
deal with firmware we've got.
>
> Thanks,
>
> James
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
More information about the linux-arm-kernel
mailing list