Problem with nbcon console and amba-pl011 serial port

Mon Jun 2 20:18:23 PDT 2025

I'm seeing a problem in the panic path of an ARM64 VM on Hyper-V
in the Azure public cloud.  Here's the output from the VM's serial port:

# taskset -c 4 /bin/echo c >/proc/sysrq-trigger

[
** replaying previous printk message **
[   51.616656] sysrq: Trigger a crash
[   51.616689] Kernel panic - not syncing: sysrq triggered crash
[   51.624212] CPU: 4 UID: 0 PID: 2278 Comm: echo Tainted: G            E       6.15.0-rc7-next-20250521 #1 VOLUNTARY 
[   51.630165] Tainted: [E]=UNSIGNED_MODULE
[   51.632331] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
[   51.638771] Call trace:
[   51.640179]  show_stack+0x20/0x38 (C)
[   51.642488]  dump_stack_lvl+0xc8/0xf8
[   51.644638]  dump_stack+0x18/0x28
[   51.646654]  panic+0x384/0x478
[   51.648371]  sysrq_handle_crash+0x20/0x28
[   51.650815]  __handle_sysrq+0xdc/0x2b8
[   51.653080]  write_sysrq_trigger+0x124/0x240
[   51.655508]  proc_reg_write+0xa4/0x100
[   51.657917]  vfs_write+0xd8/0x3e0
[   51.659836]  ksys_write+0x74/0x110
[   51.661735]  __arm64_sys_write+0x24/0x38
[   51.663967]  invoke_syscall+0x6c/0xf8
[   51.666025]  el0_svc_common.constprop.0+0xc8/0xf0
[   51.668771]  do_el0_svc+0x24/0x38
[   51.670713]  el0_svc+0x40/0x198
[   51.672509]  el0t_64_sync_handler+0xc8/0xd0
[   51.675170]  el0t_64_sync+0x1b0/0x1b8
[   51.677351] SMP: stopping secondary CPUs
[   52.728175] SMP: failed to stop secondary CPUs 2
[   52.731229] Kernel Offset: 0x5706ebce0000 from 0xffff800080000000
[   52.734528] PHYS_OFFSET: 0x0
[   52.736115] CPU features: 0x2000,400007c0,02110ca1,5401faab
[   52.739275] Memory Limit: none
[   52.803615] ---[ end Kernel panic - not syncing: sysrq triggered crash ]---

The problem is the failure to stop secondary CPU 2.  (The CPU # that fails
to stop varies from run-to-run.) It is mostly reproducible, but not always. I
bisected to commit 2eb2608618ce ("serial: amba-pl011: Implement nbcon
console") in the 6.15 kernel. Further custom logging shows the details of
why the problem can happen. Here are the steps:

1. The "echo" command ends up in __handle_sysrq(), which outputs the
"sysrq: Trigger a crash" message using pr_info().  I always ran the "echo"
command on CPU 4 just for consistency in my testing/debugging -- there's
nothing special about CPU 4.

2. The "pr/ttyAMA0" kernel thread handles the outputting of the message.
nbcon_kthread_func() calls nbcon_emit_one() with the "use_atomic" parameter
set to false. nbcon_emit_one() in turn calls nbcon_emit_next_record() with
the console spin lock held and interrupts disabled. nbcon_emit_next_record()
then calls pl011_console_write_thread(). The latter has a "for" loop to output
each character of the message, and my custom logging shows that it is indeed
outputting the string "[   51.616656] sysrq: Trigger a crash".

3. While "pr/ttyAMA0" is doing its thing, __handle_sysrq() calls
sysrq_handle_crash(), which calls panic(). After some preliminaries, panic()
outputs the message "Kernel panic - not syncing: sysrq triggered crash"
using pr_emerg().

4. pr_emerg() has a high logging level, and it effectively steals the console
from the "pr/ttyAMA0" task, which I believe is intentional in the nbcon design.
Down in pl011_console_write_thread(), the "pr/ttyAMA0" task is doing
nbcon_enter_unsafe() and nbcon_exit_unsafe() around each character
that it outputs.  When pr_emerg() steals the console, nbcon_exit_unsafe()
returns 0, so the "for" loop exits. pl011_console_write_thread() then
enters a busy "while" loop waiting to reclaim the console. It's doing this
busy "while" loop with interrupts disabled, and because of the panic,
it never succeeds. Whatever CPU is running "pr/ttyAMA0" is effectively
stuck at this point.

5. Meanwhile panic() continues, calling panic_other_cpus_shutdown(). On
ARM64, other CPUs are stopped by sending them an IPI. Each CPU receives
the IPI and calls the PSCI function to stop itself. But the CPU running
"pr/ttyAMA0" is looping forever with interrupts disabled, so it never
processes the IPI and it never stops. ARM64 doesn't have a true NMI that
can override the looping with interrupts disabled, so there's no way to
stop that CPU.

6. The failure to stop the "pr/ttyAMA0" CPU then causes downstream
problems, such as when loading and running a kdump kernel.

The problem is timing dependent.  In some cases, the "pr/ttyAMA0"
thread is evidently able to get the message out before panic() calls
pr_emerg(). In my case running as a guest on Hyper-V, the pl011 device
in the guest VM is emulated by Hyper-V. Each pl011 register access
traps to the hypervisor, which slows things down and probably makes
the problem more likely. But from what I can see, the underlying
race condition exists regardless.

At this point, I just wanted to surface the problem. I don't have any
specific ideas on how to fix it, as my knowledge of nbcon is limited
to what I've learned in figuring out the failure path.

Toshiyuki -- what are your thoughts how to fix this?

Michael Kelley