[PATCH] arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line

Masami Hiramatsu mhiramat at kernel.org
Mon Nov 2 08:52:55 EST 2020


On Mon, 2 Nov 2020 11:41:52 +0000
Will Deacon <will at kernel.org> wrote:

> On Thu, Oct 29, 2020 at 06:34:42PM +0100, Jean-Philippe Brucker wrote:
> > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled
> > using kprobes from early_initcall. Unfortunately at this point the
> > hardware debug infrastructure is not operational. The OS lock may still
> > be locked, and the hardware watchpoints may have unknown values when
> > kprobe enables debug monitors to single-step instructions.
> > 
> > Rather than using hardware single-step, append a BRK instruction after
> > the instruction to be executed out-of-line.
> > 
> > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> > Suggested-by: Will Deacon <will at kernel.org>
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe at linaro.org>
> > ---
> > 
> > An alternative fix for [1]. I haven't done uprobes yet since they don't
> > suffer from the same problem, but I could add it to the bottom of my list.
> > Lightly tested with the kprobe smoke test and the BPF selftests.
> > Interestingly, on Seattle, running the "overhead" BPF selftest, which
> > triggers a kprobe a bunch of times, shows a 20-30% improvement with
> > this patch. I'm guessing it's the cost of touching the debug sysregs?
> > 
> > [1] https://lore.kernel.org/linux-arm-kernel/20201026172907.1468294-1-jean-philippe@linaro.org/
> > ---
> >  arch/arm64/include/asm/brk-imm.h        |  2 +
> >  arch/arm64/include/asm/debug-monitors.h |  1 +
> >  arch/arm64/include/asm/kprobes.h        |  2 +-
> >  arch/arm64/kernel/probes/kprobes.c      | 56 +++++++++----------------
> >  4 files changed, 24 insertions(+), 37 deletions(-)
> > 
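For anyone following along, the handler side of this isn't quoted below:
the XOL slot now holds the copied instruction followed by BRK #0x006, and
a break_hook for the new immediate catches that BRK instead of the
hardware step handler. From my reading it looks roughly like this (a
sketch, exact names may differ from the patch):

	static int __kprobes
	kprobe_breakpoint_ss_handler(struct pt_regs *regs, unsigned int esr)
	{
		struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
		struct kprobe *cur = kprobe_running();

		/* Only handle the BRK we placed in our own XOL slot */
		if (cur && (kcb->kprobe_status & (KPROBE_HIT_SS | KPROBE_REENTER)) &&
		    instruction_pointer(regs) == (unsigned long)&cur->ainsn.api.insn[1]) {
			kprobes_restore_local_irqflag(kcb, regs);
			post_kprobe_handler(kcb, regs);
			return DBG_HOOK_HANDLED;
		}

		/* Not ours; let other debug handlers look at it */
		return DBG_HOOK_ERROR;
	}

	static struct break_hook kprobes_break_ss_hook = {
		.imm	= KPROBES_BRK_SS_IMM,
		.fn	= kprobe_breakpoint_ss_handler,
	};
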
> > diff --git a/arch/arm64/include/asm/brk-imm.h b/arch/arm64/include/asm/brk-imm.h
> > index e3d47b52161d..ec7720dbe2c8 100644
> > --- a/arch/arm64/include/asm/brk-imm.h
> > +++ b/arch/arm64/include/asm/brk-imm.h
> > @@ -10,6 +10,7 @@
> >   * #imm16 values used for BRK instruction generation
> >   * 0x004: for installing kprobes
> >   * 0x005: for installing uprobes
> > + * 0x006: for kprobe software single-step
> >   * Allowed values for kgdb are 0x400 - 0x7ff
> >   * 0x100: for triggering a fault on purpose (reserved)
> >   * 0x400: for dynamic BRK instruction
> > @@ -19,6 +20,7 @@
> >   */
> >  #define KPROBES_BRK_IMM			0x004
> >  #define UPROBES_BRK_IMM			0x005
> > +#define KPROBES_BRK_SS_IMM		0x006
> >  #define FAULT_BRK_IMM			0x100
> >  #define KGDB_DYN_DBG_BRK_IMM		0x400
> >  #define KGDB_COMPILED_DBG_BRK_IMM	0x401
> > diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
> > index 0b298f48f5bf..657c921fd784 100644
> > --- a/arch/arm64/include/asm/debug-monitors.h
> > +++ b/arch/arm64/include/asm/debug-monitors.h
> > @@ -53,6 +53,7 @@
> >  
> >  /* kprobes BRK opcodes with ESR encoding  */
> >  #define BRK64_OPCODE_KPROBES	(AARCH64_BREAK_MON | (KPROBES_BRK_IMM << 5))
> > +#define BRK64_OPCODE_KPROBES_SS	(AARCH64_BREAK_MON | (KPROBES_BRK_SS_IMM << 5))
> >  /* uprobes BRK opcodes with ESR encoding  */
> >  #define BRK64_OPCODE_UPROBES	(AARCH64_BREAK_MON | (UPROBES_BRK_IMM << 5))
> >  
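As a sanity check on the encoding: the BRK immediate lands in bits [20:5]
of the instruction, and AARCH64_BREAK_MON is the base BRK #0 opcode
0xd4200000, so the new opcode should work out to:

	BRK64_OPCODE_KPROBES_SS == 0xd4200000 | (0x006 << 5) == 0xd42000c0
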
> > diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> > index 97e511d645a2..8699ce30f587 100644
> > --- a/arch/arm64/include/asm/kprobes.h
> > +++ b/arch/arm64/include/asm/kprobes.h
> > @@ -16,7 +16,7 @@
> >  #include <linux/percpu.h>
> >  
> >  #define __ARCH_WANT_KPROBES_INSN_SLOT
> > -#define MAX_INSN_SIZE			1
> > +#define MAX_INSN_SIZE			2
> >  
> >  #define flush_insn_slot(p)		do { } while (0)
> >  #define kretprobe_blacklist_size	0
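
kprobe_opcode_t is u32 on arm64, so this grows each XOL slot from 4 to 8
bytes, conceptually:

	kprobe_opcode_t slot[MAX_INSN_SIZE] = {
		p->opcode,			/* copied probed instruction */
		BRK64_OPCODE_KPROBES_SS,	/* BRK #0x006, traps back to kprobes */
	};
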
> > diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> > index deba738142ed..ec1446ceacc9 100644
> > --- a/arch/arm64/kernel/probes/kprobes.c
> > +++ b/arch/arm64/kernel/probes/kprobes.c
> > @@ -36,21 +36,25 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> >  static void __kprobes
> >  post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
> >  
> > -static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opcode)
> > +static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opc1, u32 opc2)
> >  {
> > -	void *addrs[1];
> > -	u32 insns[1];
> > +	void *addrs[2];
> > +	u32 insns[2];
> >  
> >  	addrs[0] = addr;
> > -	insns[0] = opcode;
> > +	insns[0] = opc1;
> > +	if (opc2) {
> > +		addrs[1] = addr + 1;
> > +		insns[1] = opc2;
> > +	}
> >  
> > -	return aarch64_insn_patch_text(addrs, insns, 1);
> > +	return aarch64_insn_patch_text(addrs, insns, opc2 ? 2 : 1);
> >  }
> >  
> >  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >  {
> >  	/* prepare insn slot */
> > -	patch_text(p->ainsn.api.insn, p->opcode);
> > +	patch_text(p->ainsn.api.insn, p->opcode, BRK64_OPCODE_KPROBES_SS);
> >  
> >  	flush_icache_range((uintptr_t) (p->ainsn.api.insn),
> >  			   (uintptr_t) (p->ainsn.api.insn) +
> >  			   MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> > @@ -128,13 +132,13 @@ void *alloc_insn_page(void)
> >  /* arm kprobe: install breakpoint in text */
> >  void __kprobes arch_arm_kprobe(struct kprobe *p)
> >  {
> > -	patch_text(p->addr, BRK64_OPCODE_KPROBES);
> > +	patch_text(p->addr, BRK64_OPCODE_KPROBES, 0);
> >  }
> >  
> >  /* disarm kprobe: remove breakpoint from text */
> >  void __kprobes arch_disarm_kprobe(struct kprobe *p)
> >  {
> > -	patch_text(p->addr, p->opcode);
> > +	patch_text(p->addr, p->opcode, 0);
> 
> I don't think patch_text() is offering an awful lot to these callers. Why
> don't we just call aarch64_insn_patch_text() directly, which I think will
> make the code easier to read too?

Agreed. I guess it's just that way for historical reasons.
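
Dropping it would look something like this (untested sketch):

	static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
	{
		kprobe_opcode_t *addr = p->ainsn.api.insn;
		void *addrs[] = { addr, addr + 1 };
		u32 insns[] = { p->opcode, BRK64_OPCODE_KPROBES_SS };

		/* prepare insn slot: probed insn followed by BRK #0x006 */
		aarch64_insn_patch_text(addrs, insns, 2);

		/* icache flush and api.restore setup stay as they are */
	}

and arch_arm_kprobe()/arch_disarm_kprobe() each become a single
aarch64_insn_patch_text() call with one address/opcode pair.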

> >  void __kprobes arch_remove_kprobe(struct kprobe *p)
> > @@ -163,20 +167,14 @@ static void __kprobes set_current_kprobe(struct kprobe *p)
> >  }
> >  
> >  /*
> > - * Interrupts need to be disabled before single-step mode is set, and not
> > - * reenabled until after single-step mode ends.
> > - * Without disabling interrupt on local CPU, there is a chance of
> > - * interrupt occurrence in the period of exception return and  start of
> > - * out-of-line single-step, that result in wrongly single stepping
> > - * into the interrupt handler.
> > + * Keep interrupts disabled while executing the instruction out-of-line. Since
> > + * the kprobe state is per-CPU, we can't afford to be migrated.
> >   */
> >  static void __kprobes kprobes_save_local_irqflag(struct kprobe_ctlblk *kcb,
> >  						struct pt_regs *regs)
> >  {
> >  	kcb->saved_irqflag = regs->pstate & DAIF_MASK;
> >  	regs->pstate |= PSR_I_BIT;
> > -	/* Unmask PSTATE.D for enabling software step exceptions. */
> > -	regs->pstate &= ~PSR_D_BIT;
> 
> Can we set the D bit now? Would be one less thing to worry about.

Hmm, I think we just leave the D bit untouched above. If someone (kgdb?)
has set the D flag, there may be no reason to prohibit that on the
trampoline. (Of course they must handle this case.)
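
If we do decide to mask it, the save path would just be something like:

	kcb->saved_irqflag = regs->pstate & DAIF_MASK;
	/* Mask interrupts, and also debug exceptions while in the slot */
	regs->pstate |= PSR_I_BIT | PSR_D_BIT;

with the restore path putting the saved DAIF bits back as it does today.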

Thank you,

-- 
Masami Hiramatsu <mhiramat at kernel.org>


