[PATCH] arm64: don't preempt_disable in do_debug_exception
Will Deacon
will at kernel.org
Tue Jun 23 11:59:01 EDT 2020
On Thu, Jun 18, 2020 at 01:29:29PM -0400, Paul Gortmaker wrote:
> In commit d8bb6718c4db ("arm64: Make debug exception handlers visible
> from RCU") debug_exception_enter and exit were added to deal with better
> tracking of RCU state - however, in addition to that, but not mentioned
> in the commit log, a preempt_disable/enable pair were also added.
>
> Based on the comment (being removed here) it would seem that the pair
> were not added to address a specific problem, but just as a proactive,
> preventative measure - as in "seemed like a good idea at the time".
>
> The problem is that do_debug_exception() eventually calls out to
> generic kernel code like do_force_sig_info() which takes non-raw locks
> and on -rt enabled kernels, results in things looking like the following,
> since on -rt kernels, it is noticed that preemption is still disabled.
>
> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:975
> in_atomic(): 1, irqs_disabled(): 0, pid: 35658, name: gdbtest
> Preemption disabled at:
> [<ffff000010081578>] do_debug_exception+0x38/0x1a4
> Call trace:
> dump_backtrace+0x0/0x138
> show_stack+0x24/0x30
> dump_stack+0x94/0xbc
> ___might_sleep+0x13c/0x168
> rt_spin_lock+0x40/0x80
> do_force_sig_info+0x30/0xe0
> force_sig_fault+0x64/0x90
> arm64_force_sig_fault+0x50/0x80
> send_user_sigtrap+0x50/0x80
> brk_handler+0x98/0xc8
> do_debug_exception+0x70/0x1a4
> el0_dbg+0x18/0x20
>
> The reproducer was basically an automated gdb test that set a breakpoint
> on a simple "hello world" program and then quit gdb once the breakpoint
> was hit - i.e. "(gdb) A debugging session is active. Quit anyway? "
Hmm, the debug exception handler path was definitely written with the
expectation that preemption is disabled, so this is unfortunate. For
exceptions from kernelspace, we need to keep that guarantee as we implement
things like BUG() using this path. For exceptions from userspace, it's
plausible that we could re-enable preemption, but then we should also
re-enable interrupts and debug exceptions too because we don't
context-switch pstate in switch_to() and we would end up with holes in our
kernel debug coverage (and these might be fatal if e.g. single step doesn't
work in a kprobe OOL buffer). However, that then means that any common code
when handling user and kernel debug exceptions needs to be re-entrant,
which it probably isn't at the moment (I haven't checked).
So although I'm alright with this idea for user debug exceptions, I think
it needs more work.
Will
More information about the linux-arm-kernel
mailing list