[PATCH v3 2/2] arm64: efi: Account for the EFI runtime stack in stack unwinder

Ard Biesheuvel ardb at kernel.org
Wed Jan 11 14:53:03 PST 2023


On Wed, 11 Jan 2023 at 22:18, Nathan Chancellor <nathan at kernel.org> wrote:
>
> Hi Ard,
>
> On Wed, Jan 11, 2023 at 09:45:32AM +0100, Ard Biesheuvel wrote:
> > On Tue, 10 Jan 2023 at 21:48, Nathan Chancellor <nathan at kernel.org> wrote:
> > > On Fri, Jan 06, 2023 at 06:47:03PM +0100, Ard Biesheuvel wrote:
> > > > The EFI runtime services run from a dedicated stack now, and so the
> > > > stack unwinder needs to be informed about this.
> > > >
> > > > Acked-by: Mark Rutland <mark.rutland at arm.com>
> > > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > >
> > > Apologies if this has been reported and/or fixed already, I searched
> > > lore and did not find anything but I just bisected a QEMU boot hang [1]
> > > that we see in the ClangBuiltLinux CI with Fedora's configuration [2] to
> > > this change in next-20230110 as commit a7334dc70496 ("arm64: efi:
> > > Account for the EFI runtime stack in stack unwinder").
> > >
> >
> > Thanks for the report. This is due to an oversight on my part: we
> > removed a spin_is_locked() check, and the lock in question can only be
> > in the locked state when EFI runtime services are enabled to begin
> > with.
> >
> > Without the lock check, we may end up dereferencing the uninitialized
> > efi_rt_stack_top on non-EFI boots.
> >
> > I've fixed this up in the EFI fixes tree, so the issue should
> > disappear once -next is updated. (We just missed 20230111
> > unfortunately)
>
> Thank you for the quick response! That issue appears to be fixed.
>
> Unfortunately, I am still seeing a hang while booting via EFI on either
> bare metal or KVM when CONFIG_DEBUG_PREEMPT is enabled (Fedora's rawhide
> config appears to enable several debugging options), so it appears I was
> seeing two distinct issues :/ defconfig + CONFIG_DEBUG_PREEMPT=y is
> enough for me to reproduce this problem.
>
> I see
>
>   [    0.015382] Remapping and enabling EFI services.
>
> as the last line in the console (via earlycon) with the bad kernel and
> nothing after it (I assume we deadlock somewhere or hit a BUG_ON()?), vs
>
>   [    0.015191] Remapping and enabling EFI services.
>   [    0.016725] smp: Bringing up secondary CPUs ...
>
> on the good kernel, followed by a normal boot.
>

Yeah, this is the same issue, essentially.

I have added back the spin_is_locked() check, which is a better
indicator of whether the EFI runtime stack is actually in use or not.
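
To illustrate why the lock state is the right guard, here is a minimal
user-space sketch. It is not the actual kernel code: every model_* name
and MODEL_STACK_SIZE are hypothetical stand-ins, and the boolean only
models what spin_is_locked() would report for the EFI runtime lock. The
point is that the stack top pointer is never touched unless the lock is
held, so a non-EFI boot (where the pointer was never set up) simply
reports "not on the EFI stack".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the size of the EFI runtime stack. */
#define MODEL_STACK_SIZE 16384u

/* Models the lock state; true only while a runtime service call is live. */
static bool model_efi_rt_lock_held;

/* Models the stack top pointer; stays NULL when EFI runtime is disabled. */
static uint64_t *model_efi_rt_stack_top;

/*
 * Return true if sp lies within the modelled EFI runtime stack.
 * The lock check comes first, so the pointer is only consulted
 * when the EFI runtime stack can actually be in use.
 */
static bool model_sp_on_efi_stack(uintptr_t sp)
{
	if (!model_efi_rt_lock_held)
		return false;

	/* With the lock held, the stack must have been set up. */
	assert(model_efi_rt_stack_top != NULL);

	uintptr_t high = (uintptr_t)model_efi_rt_stack_top;
	uintptr_t low = high - MODEL_STACK_SIZE;

	return sp >= low && sp < high;
}
```

With the lock not held, the function returns false regardless of the
pointer value, which is the behaviour a non-EFI boot relies on.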


