[PATCH] arm64: Fix early single-stepping

Will Deacon will at kernel.org
Wed Oct 28 04:36:44 EDT 2020


On Wed, Oct 28, 2020 at 09:28:20AM +0100, Jean-Philippe Brucker wrote:
> On Tue, Oct 27, 2020 at 10:49:22PM +0900, Masami Hiramatsu wrote:
> > On Tue, 27 Oct 2020 12:33:18 +0000
> > Will Deacon <will at kernel.org> wrote:
> > 
> > > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote:
> > > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote:
> > > > > On Tue, 27 Oct 2020 19:13:18 +0900
> > > > > Masami Hiramatsu <mhiramat at kernel.org> wrote:
> > > > > 
> > > > > > On Mon, 26 Oct 2020 18:29:09 +0100
> > > > > > Jean-Philippe Brucker <jean-philippe at linaro.org> wrote:
> > > > > > 
> > > > > > > To use debug features such as single-step, the OS lock must be unlocked
> > > > > > > in the debug registers. Currently this is done in postcore_initcall
> > > > > > > which is now too late.
> > > > > > > 
> > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled
> > > > > > > using kprobes from early_initcall, when OS lock is still locked. So when
> > > > > > > kprobe attempts to single-step a patched instruction, instead of
> > > > > > > trapping, execution continues until it throws an undef exception:
> > > > > > > 
> > > > > > > [    0.064233] Kprobe smoke test: started
> > > > > > > [    0.151133] ------------[ cut here ]------------
> > > > > > > [    0.151458] kernel BUG at arch/arm64/kernel/traps.c:406!
> > > > > > > [    0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > > > > >               ...
> > > > > > > [    0.162689] Call trace:
> > > > > > > [    0.163014]  do_undefinstr+0x1d4/0x1f4
> > > > > > > [    0.163336]  el1_sync_handler+0xbc/0x140
> > > > > > > [    0.163839]  el1_sync+0x80/0x100
> > > > > > > [    0.164154]  0xffffffc01001d004
> > > > > > > [    0.164527]  init_kprobes+0x13c/0x154
> > > > > > > [    0.164968]  do_one_initcall+0x54/0x2e0
> > > > > > > [    0.165322]  kernel_init_freeable+0xf4/0x258
> > > > > > > [    0.165783]  kernel_init+0x20/0x12c
> > > > > > > [    0.166117]  ret_from_fork+0x10/0x30
> > > > > > > [    0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000)
> > > > > > > [    0.167084] ---[ end trace 36778fdf576e9a79 ]---
> > > > > > > 
> > > > > > > To fix this, unlock the OS lock as early as possible. Do it in
> > > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that
> > > > > > > point on according to commit b322c65f8ca3 ("arm64: Call
> > > > > > > debug_traps_init() from trap_init() to help early kgdb").
> > > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall.
> > > > > > > 
> > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe at linaro.org>
> > > > > > 
> > > > > > Hi Jean,
> > > > > > 
> > > > > > How have you confirmed this fixes the issue?
> > > > > > On my environment, this doesn't fix the issue.
> > > > > 
> > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...)
> > > > > Anyway this works for me too.
> > > > 
> > > > No worries :)  Although now I've been wondering whether it would be better
> > > > to just disable the OS lock lazily, on the first call to
> > > > enable_debug_monitors(). It might add a tiny performance penalty but would
> > > > avoid this problem reappearing if one of the debugger needs to start even
> > > > earlier in the future.
> > > 
> > > I'm still uneasy about enabling KDE with the watchpoint registers in an
> > > unknown state, so I think this needs more work.
> > 
> > Hmm, how we reset it in the early stage? reset watchpoint registers first?
> 
> Yes, I think so. Same order problem as the OS lock, they need to be reset
> before enable_debug_monitors(). On CPU0 that would be before
> early_initcall and for secondaries the hotplug notifier needs to be
> installed earlier as well. I'll send a v2.

Cheers. An alternative (which I think would be better in the long run
anyway) would be to avoid using hardware step in kprobes and instead rely
on a BRK instruction to trap after running the trampoline.

Then we just need to ensure that the BRK handlers are up and running in
time  (which might require an early handler, like we have for KASAN).

Will



More information about the linux-arm-kernel mailing list