[PATCH] iommu/arm-smmu-v3: add tracepoints for cmdq_issue_cmdlist

Song Bao Hua (Barry Song) song.bao.hua at hisilicon.com
Fri Aug 28 07:02:04 EDT 2020



> -----Original Message-----
> From: Will Deacon [mailto:will at kernel.org]
> Sent: Friday, August 28, 2020 10:29 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
> Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org;
> robin.murphy at arm.com; joro at 8bytes.org; Linuxarm <linuxarm at huawei.com>
> Subject: Re: [PATCH] iommu/arm-smmu-v3: add tracepoints for
> cmdq_issue_cmdlist
> 
> On Thu, Aug 27, 2020 at 09:33:51PM +1200, Barry Song wrote:
> > cmdq_issue_cmdlist() is the hotspot that uses a lot of time. This patch
> > adds tracepoints for it to help debug.
> >
> > Signed-off-by: Barry Song <song.bao.hua at hisilicon.com>
> > ---
> >  * can furthermore develop an eBPF program to benchmark using this trace
> 
> Hmm, don't these things have a history of becoming ABI? If so, I don't
> really want them in the driver at all, sorry. Do other drivers overcome
> this somehow?

Tracepoints like these mainly serve as low-overhead probe points for debugging. I don't think any
application would depend on them. And there are lots of tracepoints in other drivers,
even in the iommu core and the intel_iommu driver :-)

Developers use them in one of the following ways:

1. Get the trace output from the ring buffer by reading debugfs:
root at ubuntu:/sys/kernel/debug/tracing/events/arm_smmu_v3# echo 1 > enable
# cat /sys/kernel/debug/tracing/trace_pipe
          <idle>-0     [058] ..s1 125444.768083: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768084: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768085: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768165: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768168: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768169: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768171: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
          <idle>-0     [058] ..s1 125444.768259: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
          ...

This can replace printk-based debugging at much lower overhead.
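
As a rough illustration of offline post-processing (not part of the patch), the entry/exit
lines from such a trace can be paired per CPU to recover the duration of each call. The
Python sketch below assumes the trace_pipe line layout shown in this mail and assumes that
calls do not nest on one CPU; LINE_RE, to_us() and pair_durations() are names made up for
this example.

```python
import re

# Assumed line layout (taken from the trace output in this mail):
#   <task>-<pid> [cpu] flags seconds.micros: issue_cmdlist_{entry,exit}: <device> ...
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+\S+\s+(?P<ts>\d+\.\d+):\s+"
    r"issue_cmdlist_(?P<kind>entry|exit):\s+(?P<dev>\S+)"
)


def to_us(stamp):
    """Convert a 'seconds.fraction' timestamp string to integer microseconds."""
    sec, frac = stamp.split(".")
    return int(sec) * 1_000_000 + int(frac.ljust(6, "0")[:6])


def pair_durations(lines):
    """Return (device, duration_us) for every entry/exit pair seen on the same CPU.

    Assumption: calls do not nest on a CPU, so one pending entry per
    (cpu, device) is enough. Unmatched exits and trailing entries are skipped.
    """
    pending = {}
    durations = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        key = (int(m["cpu"]), m["dev"])
        ts_us = to_us(m["ts"])
        if m["kind"] == "entry":
            pending[key] = ts_us
        elif key in pending:
            durations.append((m["dev"], ts_us - pending.pop(key)))
    return durations


SAMPLE = """\
<idle>-0     [058] ..s1 125444.768083: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768084: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768085: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768165: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768168: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768169: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768171: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
<idle>-0     [058] ..s1 125444.768259: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
""".splitlines()

if __name__ == "__main__":
    for dev, us in pair_durations(SAMPLE):
        print(f"{dev}: {us} us")
```

The trace timestamps only have microsecond resolution, so this is good for a quick look
rather than for precise numbers.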

2. Add a hook function to the tracepoint to measure latency and collect timing statistics, just like the eBPF example
I gave after the commit log.

Using it, I can get the histogram of the execution time of cmdq_issue_cmdlist():
   nsecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 58       |                                        |
      1024 -> 2047       : 22763    |****************************************|
      2048 -> 4095       : 13238    |***********************                 |
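
For readers unfamiliar with this kind of table: each row is a power-of-two bucket, and the
bar length is the row's count scaled against the largest count. The Python sketch below
shows one way such bucketing could be done on a list of nanosecond durations. It is only an
illustration under those assumptions, not the tool that produced the table above, and
log2_bucket(), log2_histogram() and format_histogram() are hypothetical names.

```python
from collections import Counter


def log2_bucket(value):
    """Return the (low, high) bounds of the power-of-two bucket holding value.

    Assumes value is a non-negative integer. 0 and 1 share the first bucket.
    """
    if value < 2:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, 2 * low - 1)


def log2_histogram(values_ns):
    """Count how many values fall into each power-of-two bucket, sorted by bucket."""
    counts = Counter(log2_bucket(v) for v in values_ns)
    return dict(sorted(counts.items()))


def format_histogram(hist, width=40):
    """Render one text row per bucket, scaling bars to the largest count."""
    peak = max(hist.values(), default=0)
    rows = []
    for (low, high), count in hist.items():
        stars = "*" * (count * width // peak) if peak else ""
        rows.append(f"{low:>10} -> {high:<10} : {count:<8} |{stars:<{width}}|")
    return rows


if __name__ == "__main__":
    # Made-up durations, chosen only to exercise three buckets.
    demo = [600, 1500, 1500, 1500, 3000, 3000]
    for row in format_histogram(log2_histogram(demo)):
        print(row)
```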

I feel it is very common to do this kind of thing when analyzing performance issues. For example, to ease the analysis
of softirq latency, softirq.c has the code below:

asmlinkage __visible void __softirq_entry __do_softirq(void)
{
	...
		trace_softirq_entry(vec_nr);
		h->action(h);
		trace_softirq_exit(vec_nr);
	...
}

> 
> Will

Thanks
Barry



