[PATCH] arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack()

Ada Couprie Diaz ada.coupriediaz at arm.com
Thu Jul 24 10:50:38 PDT 2025


I think this has been clarified offline (?), but answering for the list.

On 22/07/2025 06:31, Ard Biesheuvel wrote:

> (cc Sami)
>
> On Tue, 22 Jul 2025 at 00:18, Will Deacon <will at kernel.org> wrote:
>> On Fri, Jul 18, 2025 at 03:28:14PM +0100, Ada Couprie Diaz wrote:
>>> `cpu_switch_to()` and `call_on_irq_stack()` manipulate SP to change
>>> to different stacks along with the Shadow Call Stack if it is enabled.
>>> Those two stack changes cannot be done atomically and both functions
>>> can be interrupted by SErrors or Debug Exceptions which, though unlikely,
>>> is very much broken: if interrupted, we can end up with the stacks and
>>> Shadow Call Stack mismatched, leading to clobbered stacks.
>>>
>>> In `cpu_switch_to()`, it can happen when SP_EL0 points to the new task,
>>> but x18 still points to the old task's SCS. When the interrupt handler
>>> tries to save the task's SCS pointer, it will save the old task
>>> SCS pointer (x18) into the new task struct (pointed to by SP_EL0),
>>> clobbering it.
>>>
> In this case, the 'interrupt handler' is not the entry code, but
> call_on_irq_stack(), right?

That is correct, yes.

[...]

> So AIUI, the problem is that both cpu_switch_to() and
> call_on_irq_stack() itself can be interrupted by the latter, resulting
> in the shadow call stack pointer recorded in the task struct being
> incorrect: either the task struct pointer points to the wrong task, or
> the shadow stack pointer points to the wrong stack.

Exactly: that is what I was trying to fix, which is why the commit message
focuses on this.

However, as per the other thread, there are potentially other issues here
that I was not aware of at the time, for which masking DAIF is still
beneficial.

>
> Commit 59b37fe52f49955 changed this code to ensure that the shadow
> stack pointer is not preserved to/restored from the ordinary stack,
> because that turns call_on_irq_stack() into a rather useful gadget, and
> to use the task struct instead.
>
> What we could do instead is record the caller's SCS pointer on the IRQ
> shadow stack, avoiding the task struct altogether. This should be the
> only occurrence of scs_save that may interrupt cpu_switch_to(), so it
> should address both issues AFAICT.
Will has pulled the DAIF fix as per the other discussion, but I did test
with your patch below: it does indeed fix the panics that were reported
with pNMI+SCS, as far as I can tell.

Thanks for the details!
Ada

> The only minor issue is that loading the IRQ shadow stack pointer
> before assigning SP is problematic, because when call_on_irq_stack()
> is re-entered over the top of an exception that is taken at that
> point, the clobbering of scs_sp will interfere with the return from
> that exception, and so scs_sp should be left untouched until after SP
> is assigned (which is what prevents call_on_irq_stack() from
> re-entering). This creates a tiny window where we may end up servicing
> a (nested) IRQ running on the IRQ stack but on the task shadow stack,
> but I don't think that is a thing to obsess about tbh. (Sami?) If it
> is, we'll need your DAIF changes to turn that into a proper critical
> section, but I doubt whether that is necessary.
>
>
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -874,12 +874,6 @@ NOKPROBE(ret_from_fork)
>    * Calls func(regs) using this CPU's irq stack and shadow irq stack.
>    */
>   SYM_FUNC_START(call_on_irq_stack)
> -#ifdef CONFIG_SHADOW_CALL_STACK
> -       get_current_task x16
> -       scs_save x16
> -       ldr_this_cpu scs_sp, irq_shadow_call_stack_ptr, x17
> -#endif
> -
>          /* Create a frame record to save our LR and SP (implicit in FP) */
>          stp     x29, x30, [sp, #-16]!
>          mov     x29, sp
> @@ -888,7 +882,23 @@ SYM_FUNC_START(call_on_irq_stack)
>
>          /* Move to the new stack and call the function there */
>          add     sp, x16, #IRQ_STACK_SIZE
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +       mov     x16, scs_sp
> +       ldr_this_cpu scs_sp, irq_shadow_call_stack_ptr, x17
> +#ifdef CONFIG_DYNAMIC_SCS
> +       cbz     scs_sp, 0f
> +#endif
> +       str     x16, [scs_sp], #8
> +0:
> +#endif
>          blr     x1
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +#ifdef CONFIG_DYNAMIC_SCS
> +       cbz     scs_sp, 1f
> +#endif
> +       ldr     scs_sp, [scs_sp, #-8]
> +1:
> +#endif
>
>          /*
>           * Restore the SP from the FP, and restore the FP and LR from the
> @@ -896,7 +906,6 @@ SYM_FUNC_START(call_on_irq_stack)
>           */
>          mov     sp, x29
>          ldp     x29, x30, [sp], #16
> -       scs_load_current
>          ret
>   SYM_FUNC_END(call_on_irq_stack)
>   NOKPROBE(call_on_irq_stack)


