[PATCH] arm64: don't preempt_disable in do_debug_exception

Mark Rutland mark.rutland at arm.com
Tue Jun 23 12:55:57 EDT 2020


On Tue, Jun 23, 2020 at 04:59:01PM +0100, Will Deacon wrote:
> On Thu, Jun 18, 2020 at 01:29:29PM -0400, Paul Gortmaker wrote:
> > In commit d8bb6718c4db ("arm64: Make debug exception handlers visible
> > from RCU") debug_exception_enter and exit were added to deal with better
> > tracking of RCU state - however, in addition to that, but not mentioned
> > in the commit log, a preempt_disable/enable pair were also added.
> > 
> > Based on the comment (being removed here) it would seem that the pair
> > were not added to address a specific problem, but just as a proactive,
> > preventative measure - as in "seemed like a good idea at the time".
> > 
> > The problem is that do_debug_exception() eventually calls out to
> > generic kernel code like do_force_sig_info() which takes non-raw locks
> > and on -rt enabled kernels, results in things looking like the following,
> > since on -rt kernels, it is noticed that preemption is still disabled.
> > 
> >  BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:975
> >  in_atomic(): 1, irqs_disabled(): 0, pid: 35658, name: gdbtest
> >  Preemption disabled at:
> >  [<ffff000010081578>] do_debug_exception+0x38/0x1a4
> >  Call trace:
> >  dump_backtrace+0x0/0x138
> >  show_stack+0x24/0x30
> >  dump_stack+0x94/0xbc
> >  ___might_sleep+0x13c/0x168
> >  rt_spin_lock+0x40/0x80
> >  do_force_sig_info+0x30/0xe0
> >  force_sig_fault+0x64/0x90
> >  arm64_force_sig_fault+0x50/0x80
> >  send_user_sigtrap+0x50/0x80
> >  brk_handler+0x98/0xc8
> >  do_debug_exception+0x70/0x1a4
> >  el0_dbg+0x18/0x20
> > 
> > The reproducer was basically an automated gdb test that set a breakpoint
> > on a simple "hello world" program and then quit gdb once the breakpoint
> > was hit - i.e. "(gdb) A debugging session is active.  Quit anyway? "
> 
> Hmm, the debug exception handler path was definitely written with the
> expectation that preemption is disabled, so this is unfortunate. For
> exceptions from kernelspace, we need to keep that guarantee as we implement
> things like BUG() using this path. For exceptions from userspace, it's
> plausible that we could re-enable preemption, but then we should also
> re-enable interrupts and debug exceptions too because we don't
> context-switch pstate in switch_to() and we would end up with holes in our
> kernel debug coverage (and these might be fatal if e.g. single step doesn't
> work in a kprobe OOL buffer). However, that then means that any common code
> when handling user and kernel debug exceptions needs to be re-entrant,
> which it probably isn't at the moment (I haven't checked).

I'm pretty certain existing code is not reentrant, and regardless it's
going to be a mess to reason about this generally if we have to undo our
strict exception nesting rules.

I reckon we need to treat this like an NMI instead -- is that plausible?

Mark.



More information about the linux-arm-kernel mailing list