[PATCH v2 00/11] arm64: debug: remove hook registration, split exception entry
Ada Couprie Diaz
ada.coupriediaz at arm.com
Tue Jun 3 09:10:31 PDT 2025
On 28/05/2025 11:38, Ada Couprie Diaz wrote:
> On 16/05/2025 12:57, Luis Claudio R. Goncalves wrote:
>> On Tue, May 13, 2025 at 04:19:26PM +0100, Ada Couprie Diaz wrote:
>>
>> This is the only test where I (consistently) hit backtraces. If I run
>> the
>> test with "gdb -x ${COMMAND_LIST_FILE} ..." I get a single backtrace,
>> every
>> time:
>>
>> [ 263.890424] BUG: sleeping function called from invalid context at
>> kernel/locking/spinlock_rt.c:48
>> [ 263.890444] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
>> 5744, name: gdb_prog1
>> [ 263.890445] preempt_count: 1, expected: 0
>> [ 263.890446] RCU nest depth: 0, expected: 0
>> [ 263.890447] 1 lock held by gdb_prog1/5744:
>> [ 263.890448] #0: ffff100028496f58 (&sighand->siglock){+.+.}-{3:3},
>> at: force_sig_info_to_task+0x30/0x150
>> [ 263.890468] Preemption disabled at:
>> [ 263.890469] [<ffff8000800391a8>] debug_exception_enter+0x18/0x78
>> [ 263.890484] CPU: 114 UID: 0 PID: 5744 Comm: gdb_prog1 Tainted:
>> G W 6.15.0-rc6-rt1__dbg #2 PREEMPT_{RT,(lazy)}
>> [ 263.890487] Tainted: [W]=WARN
>> [ 263.890488] Hardware name: Supermicro ARS-221GL-NR-01/G1SMH, BIOS
>> 2.0 07/12/2024
>> [ 263.890490] Call trace:
>> [ 263.890492] show_stack+0x30/0x88 (C)
>> [ 263.890495] dump_stack_lvl+0xa0/0xe0
>> [ 263.890498] dump_stack+0x14/0x2c
>> [ 263.890499] __might_resched+0x170/0x240
>> [ 263.890506] rt_spin_lock+0x6c/0x1a0
>> [ 263.890512] force_sig_info_to_task+0x30/0x150
>> [ 263.890513] force_sig_fault+0x68/0xa0
>> [ 263.890515] arm64_force_sig_fault+0x44/0x80
>> [ 263.890518] send_user_sigtrap+0x60/0xa8
>> [ 263.890520] do_brk64+0x40/0x88
>> [ 263.890522] el0_brk64+0x50/0x1c0
>> [ 263.890526] el0t_64_sync_handler+0x60/0xe0
>> [ 263.890528] el0t_64_sync+0x184/0x188
>>
>> Quite similar to the problem originally reported, where sending signals
>> with preemption disabled could trigger the "rtlock_might_resched();"
>> check
>> if CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
>
> Oh, indeed : I can confirm that this happens both with my series and
> on mainline tags v6.15-rc6, v6.15.
>
> I didn't see it originally, but as you point out it shows up
> consistently with CONFIG_DEBUG_ATOMIC_SLEEP enabled.
>
> [...]
>
> I am looking into fixing this in v3, I feel this series is a good
> opportunity to do it.
>
As you mentioned, the issue is very similar to the single-step one,
and we have already done most of the work for the fix.
Indeed, the handling of the software breakpoint instruction
from EL0 is very similar to that of the single-step exception.
In fact, the exact same logic applies and we can safely
enable preemption before calling `do_brk64()`, given that the only
possible path from EL0 is the uprobe handler, which sets a TIF flag
to be handled on exit, as with the single step.
So this is fixed in v3.
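
To make the reasoning concrete, here is a minimal userspace sketch of
the ordering problem. It is not the v3 patch and not kernel code:
every `model_*` name is made up for illustration, and the counter only
stands in for the kernel's preempt count and the `__might_resched()`
check from the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these are hypothetical stand-ins, not kernel APIs. */
static int model_preempt_count;     /* models the task's preempt count */
static int model_sleep_violations;  /* models "sleeping function called
                                       from invalid context" reports */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/*
 * Models sending SIGTRAP: on PREEMPT_RT the siglock is a sleeping lock,
 * so taking it with preemption disabled is reported as a bug.
 */
static void model_send_user_sigtrap(void)
{
	if (model_preempt_count != 0)
		model_sleep_violations++;
}

/* Shape seen in the backtrace: the signal is sent while atomic. */
static void model_el0_brk64_before(void)
{
	model_preempt_disable();
	model_send_user_sigtrap();
	model_preempt_enable();
}

/* Shape described above: preemption is enabled before the handler runs. */
static void model_el0_brk64_after(void)
{
	model_preempt_disable();
	model_preempt_enable();   /* assumed placement, for illustration */
	model_send_user_sigtrap();
}
```

In this model, only `model_el0_brk64_before()` increments
`model_sleep_violations`, which mirrors the splat being tied to where
preemption is re-enabled rather than to the signal itself.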
However, as the other traces you hit suggested, there is also an issue
with the hardware breakpoint and watchpoint handlers.
Here it is much more painful though.
The signal is sent from within the ptrace handler
`ptrace_hbptriggered()`, itself called from `perf_bp_event()`, which is
called in a loop in the handlers to find matching {break,watch}points.
This makes it *very* impractical to enable preemption at any point
for those debug exceptions.
Looking into how x86 handles it, they do not seem to have the issue,
as they do not send a signal in their hardware breakpoint handler;
rather, they set a "virtual register" that should be seen by userspace.
(If my understanding is correct!)
However, they use a `notify_die()` callback to call their hardware
breakpoint handler: `hw_breakpoint_exceptions_notify()`.
We do have it for arm64, but it is a stub.
We might be able to use this callback to delay the signal until we
are done operating on the hardware registers, which definitely needs
preemption disabled.
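
As a rough illustration of the "delay the signal" idea only, here is a
userspace sketch. It makes no claim about how `notify_die()` or the
arm64 handlers would actually be wired up: all `model_*` names, the
slot count, and the `hit[]` array are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_SLOTS 4   /* hypothetical number of {break,watch}point slots */

/* Userspace model only: these are stand-ins, not kernel state. */
static int model_preempt_count;
static int model_sleep_violations;
static int model_signals_sent;

/* Models the signal send, which takes a sleeping lock on PREEMPT_RT. */
static void model_send_signal(void)
{
	if (model_preempt_count != 0)
		model_sleep_violations++;
	model_signals_sent++;
}

/* Current shape: the signal is sent from inside the matching loop. */
static void model_handler_signal_in_loop(const bool hit[MODEL_NR_SLOTS])
{
	model_preempt_count++;            /* register access needs this */
	for (int i = 0; i < MODEL_NR_SLOTS; i++) {
		if (hit[i])
			model_send_signal();  /* sent while atomic */
	}
	model_preempt_count--;
}

/* Deferred shape: only record matches in the loop, signal afterwards. */
static void model_handler_signal_deferred(const bool hit[MODEL_NR_SLOTS])
{
	int pending = 0;

	model_preempt_count++;
	for (int i = 0; i < MODEL_NR_SLOTS; i++) {
		if (hit[i])
			pending++;
	}
	assert(model_preempt_count > 0);
	model_preempt_count--;            /* done with the registers */

	for (int i = 0; i < pending; i++)
		model_send_signal();      /* no longer atomic */
}
```

Both variants send the same number of signals in this model; only the
deferred one does so with the modelled preempt count back at zero.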
However, this is becoming a much larger change whose scope I am
not sure of yet, so I will not be looking into fixing it in this series.
Hopefully it can be done at a later date!
Best,
Ada