[PATCH v2 00/11] arm64: debug: remove hook registration, split exception entry
Ada Couprie Diaz
ada.coupriediaz at arm.com
Wed May 28 03:38:06 PDT 2025
On 16/05/2025 12:57, Luis Claudio R. Goncalves wrote:
> On Tue, May 13, 2025 at 04:19:26PM +0100, Ada Couprie Diaz wrote:
>> Re-sending with proper text format, apologies for the noise...
>>
>> On 13/05/2025 13:25, Luis Claudio R. Goncalves wrote:
>>
>>> On Mon, May 12, 2025 at 06:43:15PM +0100, Ada Couprie Diaz wrote:
>>>> [...]
>>>>
>>>> Single Step Exception
>>>> ===
>>>>
>> Hi Luis,
>>
>> Thanks for taking the time to test, I'm glad it seems OK for now.
>>> Is there any specific test you would like me to run on that test setup I
>>> have?
>> There are a couple of edge-cases that might be problematic if my conclusions
>> are wrong:
>> 1. Race between a step exception being taken, and the related hardware
>>    breakpoint/watchpoint being removed
>> 2. Migration of a task stepping a CPU-bound breakpoint/watchpoint
>> [...]
> I ran the two tests you listed above, along with some variations just to
> make sure I got the details right, and all those tests completed flawlessly
> on both machines, on the 4 kernel configurations I tested (all with
> PREEMPT_RT enabled, with and without LOCKDEP and assorted debug features).
Thanks a lot for taking the time to test so exhaustively! I'm happy to
hear that this part is holding up: I am confident it should be OK.
>>>> Testing examples
>>>> ===
>>>> [...]
>>>>
>>>> GDB commands (for EL0):
>>>> ~~~
>>>> [...]
> This is the only test where I (consistently) hit backtraces. If I run the
> test with "gdb -x ${COMMAND_LIST_FILE} ..." I get a single backtrace, every
> time:
>
> [ 263.890424] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [ 263.890444] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 5744, name: gdb_prog1
> [ 263.890445] preempt_count: 1, expected: 0
> [ 263.890446] RCU nest depth: 0, expected: 0
> [ 263.890447] 1 lock held by gdb_prog1/5744:
> [ 263.890448] #0: ffff100028496f58 (&sighand->siglock){+.+.}-{3:3}, at: force_sig_info_to_task+0x30/0x150
> [ 263.890468] Preemption disabled at:
> [ 263.890469] [<ffff8000800391a8>] debug_exception_enter+0x18/0x78
> [ 263.890484] CPU: 114 UID: 0 PID: 5744 Comm: gdb_prog1 Tainted: G W 6.15.0-rc6-rt1__dbg #2 PREEMPT_{RT,(lazy)}
> [ 263.890487] Tainted: [W]=WARN
> [ 263.890488] Hardware name: Supermicro ARS-221GL-NR-01/G1SMH, BIOS 2.0 07/12/2024
> [ 263.890490] Call trace:
> [ 263.890492] show_stack+0x30/0x88 (C)
> [ 263.890495] dump_stack_lvl+0xa0/0xe0
> [ 263.890498] dump_stack+0x14/0x2c
> [ 263.890499] __might_resched+0x170/0x240
> [ 263.890506] rt_spin_lock+0x6c/0x1a0
> [ 263.890512] force_sig_info_to_task+0x30/0x150
> [ 263.890513] force_sig_fault+0x68/0xa0
> [ 263.890515] arm64_force_sig_fault+0x44/0x80
> [ 263.890518] send_user_sigtrap+0x60/0xa8
> [ 263.890520] do_brk64+0x40/0x88
> [ 263.890522] el0_brk64+0x50/0x1c0
> [ 263.890526] el0t_64_sync_handler+0x60/0xe0
> [ 263.890528] el0t_64_sync+0x184/0x188
>
> Quite similar to the problem originally reported, where sending signals
> with preemption disabled could trigger the "rtlock_might_resched();" check
> if CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
Oh, indeed: I can confirm that this happens both with my series and on
mainline, at tags v6.15-rc6 and v6.15.
I didn't see it originally, but as you point out it shows up
consistently with CONFIG_DEBUG_ATOMIC_SLEEP enabled.
> If I call gdb and run manually the sequence of commands you described, I
> get the backtrace above three times. The only difference is that on the
> second backtrace I get these extra elements on the header:
>
> [48052.129422] RCU nest depth: 1, expected: 1
> [48052.129424] 2 locks held by gdb_prog1/27451:
> [48052.129425] #0: ffff8000828315c8 (rcu_read_lock){....}-{1:3}, at: breakpoint_handler+0xd8/0x318
> [48052.129439] #1: ffff00008abd92d8 (&sighand->siglock){+.+.}-{3:3}, at: force_sig_info_to_task+0x30/0x150
>
> So, when I enter manually the GDB command you suggested, the result is:
>
> start <--- Backtrace#1: preempt_count: 1
> hbreak 3
> watch target
> commands 2
> continue
> end
> commands 3
> continue
> end
> continue <--- Backtrace#2: preempt_count: 1 RCU nest depth: 1
> jump 11 <--- Backtrace#3: preempt_count: 1
> continue
> quit
>
> I hope this report is helpful.
Very much so, thanks!
I am looking into fixing this in v3, as I feel this series is a good
opportunity to do so.
> IMHO, even with these backtraces, there was a considerable enhancement when
> compared to the original scenario we reported.
>
> Best regards,
> Luis
I'm glad that the fix works well under heavier testing.
Best regards,
Ada