[PATCH] iommu/arm-smmu-v3: add tracepoints for cmdq_issue_cmdlist

Song Bao Hua (Barry Song) song.bao.hua at hisilicon.com
Fri Aug 28 07:58:35 EDT 2020



> -----Original Message-----
> From: Robin Murphy [mailto:robin.murphy at arm.com]
> Sent: Friday, August 28, 2020 11:18 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>; Will Deacon
> <will at kernel.org>
> Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org;
> joro at 8bytes.org; Linuxarm <linuxarm at huawei.com>
> Subject: Re: [PATCH] iommu/arm-smmu-v3: add tracepoints for
> cmdq_issue_cmdlist
> 
> On 2020-08-28 12:02, Song Bao Hua (Barry Song) wrote:
> >
> >
> >> -----Original Message-----
> >> From: Will Deacon [mailto:will at kernel.org]
> >> Sent: Friday, August 28, 2020 10:29 PM
> >> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
> >> Cc: iommu at lists.linux-foundation.org;
> linux-arm-kernel at lists.infradead.org;
> >> robin.murphy at arm.com; joro at 8bytes.org; Linuxarm
> <linuxarm at huawei.com>
> >> Subject: Re: [PATCH] iommu/arm-smmu-v3: add tracepoints for
> >> cmdq_issue_cmdlist
> >>
> >> On Thu, Aug 27, 2020 at 09:33:51PM +1200, Barry Song wrote:
> >>> cmdq_issue_cmdlist() is the hotspot that uses a lot of time. This patch
> >>> adds tracepoints for it to help debug.
> >>>
> >>> Signed-off-by: Barry Song <song.bao.hua at hisilicon.com>
> >>> ---
> >>>   * can furthermore develop an eBPF program to benchmark using this trace
> >>
> >> Hmm, don't these things have a history of becoming ABI? If so, I don't
> >> really want them in the driver at all, sorry. Do other drivers overcome
> >> this somehow?
> >
> > This kind of tracepoint mainly works as a low-overhead probe point for debugging.
> > I don't think any application would depend on it. There are also lots of
> > tracepoints in other drivers, even in the iommu core and the intel_iommu driver :-)
> >
> > Developers use it in one of the ways below:
> >
> > 1. Get the trace output from the ring buffer by reading debugfs:
> > root@ubuntu:/sys/kernel/debug/tracing/events/arm_smmu_v3# echo 1 > enable
> > # cat /sys/kernel/debug/tracing/trace_pipe
> >            <idle>-0     [058] ..s1 125444.768083: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768084: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768085: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768165: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768168: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768169: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768171: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            <idle>-0     [058] ..s1 125444.768259: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
> >            ...
> >
> > This can replace printk with much lower overhead.
> >
> > 2. Add a hook function to the tracepoint to measure latency and collect timing
> > statistics, just like the eBPF example I gave after the commit log.
> >
> > Using it, I can get the histogram of the execution time of cmdq_issue_cmdlist():
> >     nsecs               : count     distribution
> >           0 -> 1          : 0        |                                        |
> >           2 -> 3          : 0        |                                        |
> >           4 -> 7          : 0        |                                        |
> >           8 -> 15         : 0        |                                        |
> >          16 -> 31         : 0        |                                        |
> >          32 -> 63         : 0        |                                        |
> >          64 -> 127        : 0        |                                        |
> >         128 -> 255        : 0        |                                        |
> >         256 -> 511        : 0        |                                        |
> >         512 -> 1023       : 58       |                                        |
> >        1024 -> 2047       : 22763    |****************************************|
> >        2048 -> 4095       : 13238    |***********************                 |
> >
> > I feel it is very common to do this kind of thing when analyzing a performance
> > issue. For example, to ease the analysis of softirq latency, softirq.c has the
> > code below:
> >
> > asmlinkage __visible void __softirq_entry __do_softirq(void)
> > {
> > 	...
> > 		trace_softirq_entry(vec_nr);
> > 		h->action(h);
> > 		trace_softirq_exit(vec_nr);
> > 	...
> > }
> 
> If you only want to measure entry and exit of one specific function,
> though, can't the function graph tracer already do that?

Function graph can do this specific thing, but it does not help developers who want
to use BPF code to do various kinds of analysis. Another disadvantage of function graph
is that it adds the ftrace overhead of the child functions to the parent function:
a()
{
	b();
	c();
}
b()
{
	d();
}
We have some ftrace overhead for b(), c() and d(), and all of that overhead is added to a(). That can make
the execution time of a() look much longer than it really is.
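
To put rough numbers on this, here is a minimal userspace sketch of the a()/b()/c()/d() example above. It is only a model, not ftrace code: the struct, the function names and the 100 ns per-call overhead are assumptions chosen for illustration, and the model charges the tracer's entry/exit cost of each traced callee to its caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed per-call cost of the tracer's entry/exit hooks (made-up figure). */
#define TRACE_OVERHEAD_NS 100u

/* Hypothetical model of a function: its own body time plus its callees. */
struct func_model {
	uint64_t self_ns;                        /* time spent in the body itself */
	const struct func_model *const *callees; /* NULL-terminated list, or NULL */
};

/* Call tree from the example above: a() calls b() and c(); b() calls d(). */
static const struct func_model fn_d = { 10, NULL };
static const struct func_model fn_c = { 20, NULL };
static const struct func_model *const fn_b_callees[] = { &fn_d, NULL };
static const struct func_model fn_b = { 30, fn_b_callees };
static const struct func_model *const fn_a_callees[] = { &fn_b, &fn_c, NULL };
static const struct func_model fn_a = { 50, fn_a_callees };

/*
 * Duration observed for f when every traced callee costs overhead_ns extra.
 * With overhead_ns == 0 this is the real, untraced duration.
 */
static uint64_t duration_ns(const struct func_model *f, uint64_t overhead_ns)
{
	uint64_t total = f->self_ns;

	for (size_t i = 0; f->callees && f->callees[i]; i++)
		total += overhead_ns + duration_ns(f->callees[i], overhead_ns);
	return total;
}
```

In this model duration_ns(&fn_a, 0) is 110 ns while duration_ns(&fn_a, TRACE_OVERHEAD_NS) is 410 ns, because a() absorbs the overhead of all three traced descendants.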

On the other hand, in my original plan, the tracepoints in the smmu-v3 driver would not be only at the entry and
exit of this function. They would also be in some other places, such as:
	before and after step 1, lock contention
	before and after step 5, waiting for the completion of cmd_sync

and in some other critical code paths, which can help analyze the latency of arm-smmu-v3.
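
As a rough illustration of what a hook attached to such paired tracepoints could compute, here is a small userspace sketch. It is not driver code and not the patch itself: the phase names, the phase_enter()/phase_exit() helpers and the stats table are hypothetical, and timestamps are passed in explicitly instead of being read from a clock.

```c
#include <stdint.h>

/* Hypothetical phases, named after the two steps mentioned above. */
enum cmdq_phase {
	PHASE_LOCK_CONTENTION,	/* step 1 */
	PHASE_WAIT_SYNC,	/* step 5 */
	PHASE_COUNT
};

/* Accumulated latency statistics for one phase. */
struct phase_stats {
	uint64_t start_ns;	/* timestamp of the last "before" probe */
	uint64_t total_ns;	/* sum of all measured intervals */
	uint64_t max_ns;	/* longest single interval */
	uint64_t count;		/* number of completed intervals */
};

static struct phase_stats phase_table[PHASE_COUNT];

/* What a hook on the "before" tracepoint would do: remember the start time. */
static void phase_enter(enum cmdq_phase p, uint64_t now_ns)
{
	phase_table[p].start_ns = now_ns;
}

/* What a hook on the "after" tracepoint would do: account the interval. */
static void phase_exit(enum cmdq_phase p, uint64_t now_ns)
{
	uint64_t delta = now_ns - phase_table[p].start_ns;

	phase_table[p].total_ns += delta;
	if (delta > phase_table[p].max_ns)
		phase_table[p].max_ns = delta;
	phase_table[p].count++;
}
```

A real hook would need per-CPU or per-task state, since several CPUs can issue commands at the same time; the single global table here is only to keep the sketch short.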

I was using the two tracepoints to start the discussion. It happens that these two can also be implemented
by function graph.
> 
> Otherwise, pursuing optprobes sounds like a worthwhile thing to do since
> that should benefit everyone, rather than just the 6 people on the
> planet who might care about arm_smmu_issue_cmdlist(). As long as it
> doesn't involve whole new ISA extensions like the RISC-V proposal ;)
>

It is a bit sad that only 6 people care about the function. I don't know where the other
ARM64 server guys have gone :-)

It seems optprobes/kprobes and tracepoints will work side by side. Neither is trying
to replace the other, since both have their own advantages and disadvantages.

If both you and Jean think optprobes is a good direction for arm64, I am happy to
start a feasibility study.

> Robin.

Thanks
Barry

