[PATCH v3 2/2] arm64: efi: Account for the EFI runtime stack in stack unwinder

Nathan Chancellor nathan at kernel.org
Tue Jan 10 12:48:02 PST 2023


Hi Ard,

On Fri, Jan 06, 2023 at 06:47:03PM +0100, Ard Biesheuvel wrote:
> The EFI runtime services run from a dedicated stack now, and so the
> stack unwinder needs to be informed about this.
> 
> Acked-by: Mark Rutland <mark.rutland at arm.com>
> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>

Apologies if this has been reported and/or fixed already, I searched
lore and did not find anything but I just bisected a QEMU boot hang [1]
that we see in the ClangBuiltLinux CI with Fedora's configuration [2] to
this change in next-20220110 as commit a7334dc70496 ("arm64: efi:
Account for the EFI runtime stack in stack unwinder").

I do not think our QEMU command is very exotic but we are not booting
via EFI (although I do see it even if we do boot via EFI):

$ qemu-system-aarch64 \
-cpu max,pauth-impdef=true \
-machine virt,gic-version=max,virtualization=true \
-kernel arch/arm64/boot/Image.gz
-append "console=ttyAMA0 earlycon" \
-display none \
-initrd images/arm64/rootfs.cpio
-m 512m \
-nodefaults \
-no-reboot \
-serial mon:stdio
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[    0.000000] Linux version 6.2.0-rc2+ (tuxmake at tuxmake) (aarch64-linux-gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP PREEMPT_DYNAMIC @1673275520
[    0.000000] random: crng init done
[    0.000000] Machine model: linux,dummy-virt
[    0.000000] efi: UEFI not found.
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] printk: bootconsole [pl11] enabled
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x5fe9e3c0-0x5feb4fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000]   Device   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] cma: Reserved 64 MiB at 0x000000005b600000
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.1 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] psci: SMC Calling Convention v1.0
[    0.000000] percpu: Embedded 32 pages/cpu s90408 r8192 d32472 u131072
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] CPU features: detected: Address authentication (IMP DEF algorithm)
[    0.000000] CPU features: detected: GIC system register CPU interface
[    0.000000] CPU features: detected: Virtualization Host Extensions
[    0.000000] CPU features: detected: Hardware dirty bit management
[    0.000000] CPU features: detected: Spectre-v4
[    0.000000] alternatives: applying boot alternatives
[    0.000000] Fallback order for Node 0: 0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 129024
[    0.000000] Policy zone: DMA
[    0.000000] Kernel command line: console=ttyAMA0 earlycon
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.000000] Memory: 384052K/524288K available (17088K kernel code, 4208K rwdata, 15032K rodata, 9536K init, 10717K bss, 74700K reserved, 65536K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] ftrace: allocating 58614 entries in 229 pages
[    0.000000] ftrace: allocated 229 pages with 5 groups
[    0.000000] trace event string verifier disabled
[    0.000000] Dynamic Preempt: voluntary
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=4096 to nr_cpu_ids=1.
[    0.000000]  Trampoline variant of Tasks RCU enabled.
[    0.000000]  Rude variant of Tasks RCU enabled.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[    0.000000] GICv3: 256 SPIs implemented
[    0.000000] GICv3: 0 Extended SPIs implemented
[    0.000000] Root IRQ handler: gic_handle_irq
[    0.000000] GICv3: GICv3 features: 16 PPIs
[    0.000000] GICv3: GICv4 features:
[    0.000000] GICv3: CPU0: found redistributor 0 region 0:0x00000000080a0000
[    0.000000] ITS [mem 0x08080000-0x0809ffff]
[    0.000000] ITS at 0x0000000008080000: Single VMOVP capable
[    0.000000] ITS at 0x0000000008080000: allocated 8192 Devices @43de0000 (indirect, esz 8, psz 64K, shr 1)
[    0.000000] ITS at 0x0000000008080000: allocated 8192 Interrupt Collections @43df0000 (flat, esz 8, psz 64K, shr 1)
[    0.000000] ITS at 0x0000000008080000: allocated 8192 Virtual CPUs @43e00000 (indirect, esz 8, psz 64K, shr 1)
[    0.000000] GICv3: using LPI property table @0x0000000043e10000
[    0.000000] ITS: Allocated DevID ffff as GICv4 proxy device (2 slots)
[    0.000000] ITS: Enabling GICv4 support
[    0.000000] GICv3: CPU0: using allocated LPI pending table @0x0000000043e20000
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 timer(s) running at 62.50MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0x1ffffffffffffff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
[    0.000069] sched_clock: 57 bits at 63MHz, resolution 16ns, wraps every 4398046511096ns
[    0.002785] kfence: initialized - using 2097152 bytes for 255 objects at 0x(____ptrval____)-0x(____ptrval____)
[    0.014680] Console: colour dummy device 80x25
[    0.022555] Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=625000)
[    0.023292] pid_max: default: 32768 minimum: 301
[    0.025346] LSM: initializing lsm=lockdown,capability,yama,integrity,selinux,bpf,landlock
[    0.026087] Yama: becoming mindful.
[    0.027053] SELinux:  Initializing.
[    0.027935] LSM support for eBPF active
[    0.028089] landlock: Up and running.
[    0.030514] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.030706] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
qemu-system-aarch64: terminating on signal 15 from pid 4112158 (timeout)

This does not appear to be clang specific, as I can reproduce it with
the same configuration using kernel.org's GCC 12.2.0 (as you'll see
above). I unfortunately do not have time to bisect configurations to see
if there is a more trivial reproducer on top of defconfig, so just this
report.

If there is any additional information I can provide, please let me
know.

Cheers,
Nathan

[1]: https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/3883388450/jobs/6628748170
[2]: https://src.fedoraproject.org/rpms/kernel/raw/rawhide/f/kernel-aarch64-fedora.config

# bad: [435bf71af3a0aa8067f3b87ff9febf68b564dbb6] Add linux-next specific files for 20230110
# good: [1fe4fd6f5cad346e598593af36caeadc4f5d4fa9] Merge tag 'xfs-6.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect start '435bf71af3a0aa8067f3b87ff9febf68b564dbb6' '1fe4fd6f5cad346e598593af36caeadc4f5d4fa9'
# bad: [57aac56e8af1628ef96055820f88ca547233b310] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
git bisect bad 57aac56e8af1628ef96055820f88ca547233b310
# bad: [68efa45676e8f0b73c896ee4f82996ac396455af] Merge branch 'riscv-soc-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git
git bisect bad 68efa45676e8f0b73c896ee4f82996ac396455af
# bad: [154e09c58f5dcd853bd54ac8b725132126d0b640] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
git bisect bad 154e09c58f5dcd853bd54ac8b725132126d0b640
# good: [0680aa5d9dee0e17d169c022f29c53ec6ad3f69d] Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git
git bisect good 0680aa5d9dee0e17d169c022f29c53ec6ad3f69d
# good: [76c76819b36d2433e2637c91dbdc4ec345bf7d92] ARM: remove CONFIG_UNUSED_BOARD_FILES
git bisect good 76c76819b36d2433e2637c91dbdc4ec345bf7d92
# bad: [d08fc57162414439a50e36eb5c4554a56eb8e121] Merge branch 'for-next' of git://git.armlinux.org.uk/~rmk/linux-arm.git
git bisect bad d08fc57162414439a50e36eb5c4554a56eb8e121
# bad: [3145d9dfed323d5a82d40670e333de58fb1a9e65] Merge branch 'for-linux-next-fixes' of git://anongit.freedesktop.org/drm/drm-misc
git bisect bad 3145d9dfed323d5a82d40670e333de58fb1a9e65
# good: [3b3d45c3b305edeb93053c05993bd7c58b9ceb94] Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect good 3b3d45c3b305edeb93053c05993bd7c58b9ceb94
# bad: [33b73800589ae5a93e3b2faf2fbac7c30d6fdd1a] Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git
git bisect bad 33b73800589ae5a93e3b2faf2fbac7c30d6fdd1a
# good: [18bba1843fc7f264f58c9345d00827d082f9c558] efi: rt-wrapper: Add missing include
git bisect good 18bba1843fc7f264f58c9345d00827d082f9c558
# bad: [a7334dc70496bb0cec7f1d3b1f148855951f41c5] arm64: efi: Account for the EFI runtime stack in stack unwinder
git bisect bad a7334dc70496bb0cec7f1d3b1f148855951f41c5
# good: [4a70773264fd7f00b8401b5f9addb72b58678fdf] arm64: efi: Avoid workqueue to check whether EFI runtime is live
git bisect good 4a70773264fd7f00b8401b5f9addb72b58678fdf
# first bad commit: [a7334dc70496bb0cec7f1d3b1f148855951f41c5] arm64: efi: Account for the EFI runtime stack in stack unwinder



More information about the linux-arm-kernel mailing list