[PATCH v5] arm64: kprobe: Enable OPTPROBE for arm64
Jianhua Liu
jianhua.ljh at gmail.com
Thu Dec 16 04:18:09 PST 2021
On Wed, Dec 15, 2021 at 7:48 AM Song Bao Hua (Barry Song)
<song.bao.hua at hisilicon.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jianhua Liu [mailto:jianhua.ljh at gmail.com]
> > Sent: Wednesday, December 15, 2021 4:56 AM
> > To: Masami Hiramatsu <mhiramat at kernel.org>
> > Cc: Will Deacon <will at kernel.org>; liuqi (BA) <liuqi115 at huawei.com>; Catalin
> > Marinas <catalin.marinas at arm.com>; linux-arm-kernel at lists.infradead.org;
> > Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>; Zengtao (B)
> > <prime.zeng at hisilicon.com>; robin.murphy at arm.com; Linuxarm
> > <linuxarm at huawei.com>; linux-kernel at vger.kernel.org
> > Subject: Re: [PATCH v5] arm64: kprobe: Enable OPTPROBE for arm64
> >
> > On Tue, Dec 14, 2021 at 8:27 AM Masami Hiramatsu <mhiramat at kernel.org> wrote:
> > >
> > > On Mon, 13 Dec 2021 18:38:52 +0000
> > > Will Deacon <will at kernel.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > [+Janet Liu]
> > > >
> > > > On Tue, Dec 07, 2021 at 08:40:02PM +0800, Qi Liu wrote:
> > > > > This patch introduce optprobe for ARM64. In optprobe, probed
> > > > > instruction is replaced by a branch instruction to trampoline.
> > > > >
> > > > > Performance of optprobe on Hip08 platform is test using kprobe
> > > > > example module[1] to analyze the latency of a kernel function,
> > > > > and here is the result:
> > > > >
> > > > > common kprobe:
> > > > > [280709.846380] do_empty returned 0 and took 1530 ns to execute
> > > > > [280709.852057] do_empty returned 0 and took 550 ns to execute
> > > > > [280709.857631] do_empty returned 0 and took 440 ns to execute
> > > > > [280709.863215] do_empty returned 0 and took 380 ns to execute
> > > > > [280709.868787] do_empty returned 0 and took 360 ns to execute
> > > > > [280709.874362] do_empty returned 0 and took 340 ns to execute
> > > > > [280709.879936] do_empty returned 0 and took 320 ns to execute
> > > > > [280709.885505] do_empty returned 0 and took 300 ns to execute
> > > > > [280709.891075] do_empty returned 0 and took 280 ns to execute
> > > > > [280709.896646] do_empty returned 0 and took 290 ns to execute
> > > > >
> > > > > optprobe:
> > > > > [ 2965.964572] do_empty returned 0 and took 90 ns to execute
> > > > > [ 2965.969952] do_empty returned 0 and took 80 ns to execute
> > > > > [ 2965.975332] do_empty returned 0 and took 70 ns to execute
> > > > > [ 2965.980714] do_empty returned 0 and took 60 ns to execute
> > > > > [ 2965.986128] do_empty returned 0 and took 80 ns to execute
> > > > > [ 2965.991507] do_empty returned 0 and took 70 ns to execute
> > > > > [ 2965.996884] do_empty returned 0 and took 70 ns to execute
> > > > > [ 2966.002262] do_empty returned 0 and took 80 ns to execute
> > > > > [ 2966.007642] do_empty returned 0 and took 70 ns to execute
> > > > > [ 2966.013020] do_empty returned 0 and took 70 ns to execute
> > > > > [ 2966.018400] do_empty returned 0 and took 70 ns to execute
> > > > >
> > > > > As the result shows, optprobe can greatly reduce the latency. Big
> > > > > latency of common kprobe will significantly impact the real result
> > > > > while doing performance analysis or debugging performance issues
> > > > > in lab, so optprobe is useful in this scenario.
> > > > >
> > > > > Acked-by: Masami Hiramatsu <mhiramat at kernel.org>
> > > > > Signed-off-by: Qi Liu <liuqi115 at huawei.com>
> > > > >
> > > > > Note:
> > > > > As branch instruction in Arm64 has a 128M range limitation, optprobe
> > > > > could only used when offset between probe point and trampoline
> > > > > is less than 128M, otherwise kernel will choose common kprobe
> > > > > automaticly.
> > > > >
> > > > > Limitation caused by branch isn't unique to Arm64, but also to
> > > > > x86/arm/powerpc.
> > > > >
> > > > > In fact, Module PLT has been tried to get rid of limiation, but
> > > > > destination of PLT must be a fixed value, and we need to modify
> > > > > the destination (as each optprobe has its own trampoline).
> > > > >
> > > > > As discussed with Masami[2], we can start with core-kernel point
> > > > > (within 128M) as the first step, like other architectures.
> > > > >
> > > > > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sa
> > mples/kprobes/kretprobe_example.c
> > > > > [2]
> > https://lore.kernel.org/lkml/20211201105001.5164f98ba783e7207df1229c@kerne
> > l.org/
> > > > > ---
> > > > > arch/arm64/Kconfig | 1 +
> > > > > arch/arm64/include/asm/kprobes.h | 21 ++
> > > > > arch/arm64/kernel/probes/Makefile | 2 +
> > > > > arch/arm64/kernel/probes/opt_arm64.c | 199 ++++++++++++++++++
> > > > > .../arm64/kernel/probes/optprobe_trampoline.S | 97 +++++++++
> > > > > include/linux/kprobes.h | 2 +
> > > > > kernel/kprobes.c | 22 ++
> > > > > 7 files changed, 344 insertions(+)
> > > > > create mode 100644 arch/arm64/kernel/probes/opt_arm64.c
> > > > > create mode 100644 arch/arm64/kernel/probes/optprobe_trampoline.S
> > > >
> > > > I've not looked at these changes in detail, but it looks like there is an
> > > > independent patch from Janet Liu trying to do the same thing:
> > > >
> > > >
> > https://lore.kernel.org/r/1635858706-27320-1-git-send-email-jianhua.ljh@gm
> > ail.com
> > > >
> > >
> > > Thanks for noticing. I missed it.
> > >
> > > > The patch here from Qi Liu looks like it's a bit further along, but it
> > > > would be good for Janet to at least test it out and confirm that it works
> > > > for them.
> > >
> > > Yeah, it's now v5.
> > > But it seems Janet's one also has good points. I would like Janet's sharing
> > > save_all_base_regs macro and the comment about the branch instruction.
> > >
> > > >
> > > > Cheers,
> > > >
> > > > Will
> > > >
> > > > [Kept diff inline for Janet]
> > >
> > > Janet, please feel free to review and test it. It is important that you confirm
> > > this can work with your envionment too.
> > > I will review your KPROBE_ON_FTRACE patch.
> > >
> > I have tested these patch on UNISOC s9863a platform before sending.
> >
> > The test case from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sa
> > mples/kprobes/
> >
> > And I do the following change for kprobe_example.c before testing:
> > 1. delete function handler_post,
> > kprobe_init does "kp.post_handler = handler_post; --->
> > p.post_handler = NULL;"
> > 2. handler_pre calls dump_stack.
> >
> > Thanks for the review.
>
> Hello, Jianhua. I guess Will and Masami meant you may
> test liuqi's optprobe patch on your hardware and make
> sure it can work. At the same time, Masami will also
> take care of your approach.
>
> Thanks
> Barry
>
Thanks for your explanation. I have left UNISOC, today I have asked
old colleage to help this test.
Thanks
Jianhua
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cleardot.gif
Type: image/gif
Size: 43 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20211216/88cfaab6/attachment.gif>
More information about the linux-arm-kernel
mailing list