[PATCH v15 04/10] arm64: Kprobes with single stepping support

Masami Hiramatsu mhiramat at kernel.org
Mon Aug 8 15:19:19 PDT 2016


On Thu, 4 Aug 2016 00:47:27 -0400
David Long <dave.long at linaro.org> wrote:

> On 07/29/2016 05:01 AM, Daniel Thompson wrote:
> > On 28/07/16 15:40, Catalin Marinas wrote:
> >> On Wed, Jul 27, 2016 at 06:13:37PM -0400, David Long wrote:
> >>> On 07/27/2016 07:50 AM, Daniel Thompson wrote:
> >>>> On 25/07/16 23:27, David Long wrote:
> >>>>> On 07/25/2016 01:13 PM, Catalin Marinas wrote:
> >>>>>> The problem is that the original design was done on x86 for its PCS and
> >>>>>> it doesn't always fit other architectures. So we could either ignore the
> >>>>>> problem, hoping that no probed function requires argument passing on
> >>>>>> stack or we copy all the valid data on the kernel stack:
> >>>>>>
> >>>>>> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> >>>>>> index 61b49150dfa3..157fd0d0aa08 100644
> >>>>>> --- a/arch/arm64/include/asm/kprobes.h
> >>>>>> +++ b/arch/arm64/include/asm/kprobes.h
> >>>>>> @@ -22,7 +22,7 @@
> >>>>>>
> >>>>>>  #define __ARCH_WANT_KPROBES_INSN_SLOT
> >>>>>>  #define MAX_INSN_SIZE            1
> >>>>>> -#define MAX_STACK_SIZE            128
> >>>>>> +#define MAX_STACK_SIZE            THREAD_SIZE
> >>>>>>
> >>>>>>  #define flush_insn_slot(p)        do { } while (0)
> >>>>>>  #define kretprobe_blacklist_size    0
> >>>>>
> >>>>> I doubt the ARM PCS is unusual.  At any rate I'm certain there are other
> >>>>> architectures that pass aggregate parameters on the stack. I suspect
> >>>>> other RISC(-ish) architectures have similar PCS issues and I think this
> >>>>> is at least a big part of where this simple copy with a 64/128 limit
> >>>>> comes from, or at least why it continues to exist.  That said, I'm not
> >>>>> enthusiastic about researching that assertion in detail as it could be
> >>>>> time consuming.
> >>>>
> >>>> Given Mark shared a test program I *was* curious enough to take a look
> >>>> at this.
> >>>>
> >>>> The only architecture I can find that behaves like arm64 with the
> >>>> implicit pass-by-reference described by Catalin/Mark is sparc64.
> >>>>
> >>>> In contrast alpha, arm (32-bit), hppa64, mips64 and powerpc64 all use a
> >>>> hybrid approach where the first fragments of the structure are passed in
> >>>> registers and the remainder on the stack.
> >>>
> >>> That's interesting.  It also looks like sparc64 does not copy any stack for
> >>> jprobes. I guess that approach at least makes it clear what will and won't
> >>> work.
> >>
> >> I suggest we do the same for arm64 - avoid the copying entirely as it's
> >> not safe anyway. We don't know how much to copy, nor can we be sure it
> >> is safe (see Dave's DMA to the stack example). This would need to be
> >> documented in the kprobes.txt file and MAX_STACK_SIZE removed from the
> >> arm64 kprobes support.
> >>
> >> There is also the case that Daniel was talking about - passing more than
> >> 8 arguments. I don't think it's worth handling this
> > 
> > It's actually quite hard to document the (architecture specific) "no big 
> > structures" *and* the "8 argument" limits. It ends up as something like:
> > 
> >    Structures/unions >16 bytes must not be passed by value and the
> >    size of all arguments, after padding each to an 8 byte boundary, must
> >    be no more than 64 bytes.
> > 
> > We cannot avoid tackling big structures through documentation but when 
> > we impose additional limits like "only 8 arguments" we are swapping an 
> > architecture neutral "gotcha" that affects almost all jprobes uses (and 
> > can be inferred from the documentation) with an architecture specific one!
> > 
> 
> See new patch below.  The documentation change in it could use some scrutiny.
> I've tested with one-off jprobes functions in a test module and I've
> verified NET_TCPPROBE doesn't cause misbehavior.
> 
> > 
> >> but we should at
> >> least add a warning and skip the probe:
> >>
> >> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> >> index bf9768588288..84e02606ec3d 100644
> >> --- a/arch/arm64/kernel/probes/kprobes.c
> >> +++ b/arch/arm64/kernel/probes/kprobes.c
> >> @@ -491,6 +491,10 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
> >>      struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> >>      long stack_ptr = kernel_stack_pointer(regs);
> >>
> >> +    /* do not allow arguments passed on the stack */
> >> +    if (WARN_ON_ONCE(regs->sp != regs->regs[29]))
> >> +        return 0;
> >> +
> > 
> > I don't really understand this test.
> > 
> > If we could reliably assume that the frame record was at the lowest 
> > address within a stack frame then we could exploit that to store the 
> > stacked arguments without risking overwriting volatile variables on the 
> > stack.
> > 
> > 
> > Daniel.
> > 
> 
> I'm assuming the consensus is to not use the above snippet of code.
> 
> Thanks,
> -dl
> 
> ----------cut here--------
> 
> 
> From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
> From: "David A. Long" <dave.long at linaro.org>
> Date: Thu, 4 Aug 2016 00:35:33 -0400
> Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
> 
> Because the arm64 calling standard allows stacked function arguments to be
> anywhere in the stack frame, do not attempt to duplicate the stack frame for
> jprobes handler functions.
> 
> Signed-off-by: David A. Long <dave.long at linaro.org>

Looks good to me.

Acked-by: Masami Hiramatsu <mhiramat at kernel.org>

Thanks,

> ---
>  Documentation/kprobes.txt          |  7 +++++++
>  arch/arm64/include/asm/kprobes.h   |  2 --
>  arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
>  3 files changed, 12 insertions(+), 28 deletions(-)
> 
> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> index 1f9b3e2..bd01839 100644
> --- a/Documentation/kprobes.txt
> +++ b/Documentation/kprobes.txt
> @@ -103,6 +103,13 @@ Note that the probed function's args may be passed on the stack
>  or in registers.  The jprobe will work in either case, so long as the
>  handler's prototype matches that of the probed function.
>  
> +Note that in some architectures (e.g.: arm64) the stack copy is not
> +done, as the actual location of stacked parameters may be outside of
> +a reasonable MAX_STACK_SIZE value and because that location cannot be
> +determined by the jprobes code. In this case the jprobes user must be
> +careful to make certain the calling signature of the function does
> +not cause parameters to be passed on the stack.
> +
>  1.3 Return Probes
>  
>  1.3.1 How Does a Return Probe Work?
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 61b4915..1737aec 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -22,7 +22,6 @@
>  
>  #define __ARCH_WANT_KPROBES_INSN_SLOT
>  #define MAX_INSN_SIZE			1
> -#define MAX_STACK_SIZE			128
>  
>  #define flush_insn_slot(p)		do { } while (0)
>  #define kretprobe_blacklist_size	0
> @@ -47,7 +46,6 @@ struct kprobe_ctlblk {
>  	struct prev_kprobe prev_kprobe;
>  	struct kprobe_step_ctx ss_ctx;
>  	struct pt_regs jprobe_saved_regs;
> -	char jprobes_stack[MAX_STACK_SIZE];
>  };
>  
>  void arch_remove_kprobe(struct kprobe *);
> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index bf97685..c6b0f40 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -41,18 +41,6 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>  static void __kprobes
>  post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
>  
> -static inline unsigned long min_stack_size(unsigned long addr)
> -{
> -	unsigned long size;
> -
> -	if (on_irq_stack(addr, raw_smp_processor_id()))
> -		size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
> -	else
> -		size = (unsigned long)current_thread_info() + THREAD_START_SP - addr;
> -
> -	return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
> -}
> -
>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  {
>  	/* prepare insn slot */
> @@ -489,20 +477,15 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
>  {
>  	struct jprobe *jp = container_of(p, struct jprobe, kp);
>  	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> -	long stack_ptr = kernel_stack_pointer(regs);
>  
>  	kcb->jprobe_saved_regs = *regs;
>  	/*
> -	 * As Linus pointed out, gcc assumes that the callee
> -	 * owns the argument space and could overwrite it, e.g.
> -	 * tailcall optimization. So, to be absolutely safe
> -	 * we also save and restore enough stack bytes to cover
> -	 * the argument area.
> +	 * Since we can't be sure where in the stack frame "stacked"
> +	 * pass-by-value arguments are stored we just don't try to
> +	 * duplicate any of the stack. Do not use jprobes on functions that
> +	 * use more than 64 bytes (after padding each to an 8 byte boundary)
> +	 * of arguments, or pass individual arguments larger than 16 bytes.
>  	 */
> -	kasan_disable_current();
> -	memcpy(kcb->jprobes_stack, (void *)stack_ptr,
> -	       min_stack_size(stack_ptr));
> -	kasan_enable_current();
>  
>  	instruction_pointer_set(regs, (unsigned long) jp->entry);
>  	preempt_disable();
> @@ -554,10 +537,6 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
>  	}
>  	unpause_graph_tracing();
>  	*regs = kcb->jprobe_saved_regs;
> -	kasan_disable_current();
> -	memcpy((void *)stack_addr, kcb->jprobes_stack,
> -	       min_stack_size(stack_addr));
> -	kasan_enable_current();
>  	preempt_enable_no_resched();
>  	return 1;
>  }
> -- 
> 2.5.0
> 
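For illustration only, the limit stated in the new setjmp_pre_handler()
comment (no more than 64 bytes of arguments after padding each to an 8 byte
boundary, and no individual argument larger than 16 bytes) can be written
down as a small user-space check. This is a rough sketch and not part of the
patch: pad8() and jprobe_args_fit_in_regs() are hypothetical names, and the
check assumes integer-class arguments only, ignoring floating-point/SIMD
registers and any 16-byte alignment rules.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: round a size up to the next 8-byte boundary. */
static inline size_t pad8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Hypothetical check mirroring the rule in the patch comment:
 *  - no single argument may be larger than 16 bytes, and
 *  - the padded sizes of all arguments may not exceed 64 bytes
 *    (eight 8-byte general-purpose argument registers).
 * Returns true when, under these simplifying assumptions, nothing
 * would need to be passed on the stack.
 */
bool jprobe_args_fit_in_regs(const size_t *arg_sizes, size_t nargs)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nargs; i++) {
		if (arg_sizes[i] > 16)
			return false;	/* big aggregate: passed via memory */
		total += pad8(arg_sizes[i]);
	}
	return total <= 64;
}
```

For example, a function taking eight longs passes the check, while one
taking nine longs, or a single 24-byte structure by value, does not.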


-- 
Masami Hiramatsu <mhiramat at kernel.org>


