[PATCH] iommu/arm-smmu-v3: add tracepoints for cmdq_issue_cmdlist

Robin Murphy robin.murphy at arm.com
Fri Aug 28 07:18:17 EDT 2020


On 2020-08-28 12:02, Song Bao Hua (Barry Song) wrote:
> 
> 
>> -----Original Message-----
>> From: Will Deacon [mailto:will at kernel.org]
>> Sent: Friday, August 28, 2020 10:29 PM
>> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
>> Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org;
>> robin.murphy at arm.com; joro at 8bytes.org; Linuxarm <linuxarm at huawei.com>
>> Subject: Re: [PATCH] iommu/arm-smmu-v3: add tracepoints for
>> cmdq_issue_cmdlist
>>
>> On Thu, Aug 27, 2020 at 09:33:51PM +1200, Barry Song wrote:
>>> cmdq_issue_cmdlist() is the hotspot that uses a lot of time. This patch
>>> adds tracepoints for it to help debug.
>>>
>>> Signed-off-by: Barry Song <song.bao.hua at hisilicon.com>
>>> ---
>>>   * can furthermore develop an eBPF program to benchmark using this trace
>>
>> Hmm, don't these things have a history of becoming ABI? If so, I don't
>> really want them in the driver at all, sorry. Do other drivers overcome
>> this somehow?
> 
> This kind of tracepoint mainly works as a low-overhead probe point for debugging purposes. I don't think any
> application would depend on it. It is for debugging. And there are lots of tracepoints in other drivers,
> even in the iommu driver core and the intel_iommu driver :-)
> 
> Developers use it in one of the following ways:
> 
> 1. Get the trace output from the ring buffer by reading debugfs:
> root at ubuntu:/sys/kernel/debug/tracing/events/arm_smmu_v3# echo 1 > enable
> # cat /sys/kernel/debug/tracing/trace_pipe
>            <idle>-0     [058] ..s1 125444.768083: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768084: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768085: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768165: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768168: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768169: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768171: issue_cmdlist_exit: arm-smmu-v3.2.auto cmd number=1 sync=1
>            <idle>-0     [058] ..s1 125444.768259: issue_cmdlist_entry: arm-smmu-v3.2.auto cmd number=1 sync=1
>            ...
> 
> This can replace printk with much lower overhead.
> 
> 2. Attach a hook function to the tracepoint to measure latency and collect timing statistics, just like the eBPF example
> I gave after the commit log.
> 
> Using it, I can get the histogram of the execution time of cmdq_issue_cmdlist():
>     nsecs               : count     distribution
>           0 -> 1          : 0        |                                        |
>           2 -> 3          : 0        |                                        |
>           4 -> 7          : 0        |                                        |
>           8 -> 15         : 0        |                                        |
>          16 -> 31         : 0        |                                        |
>          32 -> 63         : 0        |                                        |
>          64 -> 127        : 0        |                                        |
>         128 -> 255        : 0        |                                        |
>         256 -> 511        : 0        |                                        |
>         512 -> 1023       : 58       |                                        |
>        1024 -> 2047       : 22763    |****************************************|
>        2048 -> 4095       : 13238    |***********************                 |
> 
> I feel it is very common to do this kind of thing when analyzing performance issues. For example, to ease the analysis
> of softirq latency, softirq.c has the following code:
> 
> asmlinkage __visible void __softirq_entry __do_softirq(void)
> {
> 	...
> 		trace_softirq_entry(vec_nr);
> 		h->action(h);
> 		trace_softirq_exit(vec_nr);
> 	...
> }
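
[Editorial sketch, not part of the original message.] The entry/exit pairing and power-of-two histogram described in the quoted text can be approximated in userspace from the trace_pipe lines shown above. This is a minimal sketch under stated assumptions: the event names `issue_cmdlist_entry` and `issue_cmdlist_exit` and the line layout are taken from the sample output, timestamps are assumed to have exactly six fractional digits (microseconds), and `struct lat_hist`, `log2_bucket()` and `lat_hist_feed()` are hypothetical names introduced here. Because trace_pipe prints microseconds, it cannot reproduce the nanosecond resolution of the eBPF histogram.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CPUS 1024
#define NBUCKETS 32

/* Hypothetical helper state: one pending entry timestamp per CPU. */
struct lat_hist {
	unsigned long long pending[MAX_CPUS]; /* entry time in us, plus 1; 0 = none */
	unsigned long count[NBUCKETS];        /* bucket 0: 0..1 us, bucket i: 2^i..2^(i+1)-1 us */
};

/* floor(log2(v)) for v >= 1, and 0 for v == 0, clamped to the last bucket. */
static int log2_bucket(unsigned long long v)
{
	int b = 0;

	while (v > 1 && b < NBUCKETS - 1) {
		v >>= 1;
		b++;
	}
	return b;
}

/*
 * Feed one trace line. Returns 1 if the line was an exit event that
 * completed an entry/exit pair on its CPU, 0 otherwise.
 */
static int lat_hist_feed(struct lat_hist *h, const char *line)
{
	const char *ev, *p;
	unsigned long long sec, usec, ts, delta;
	int cpu, is_exit;

	assert(h != NULL && line != NULL);

	if ((ev = strstr(line, "issue_cmdlist_entry:")) != NULL)
		is_exit = 0;
	else if ((ev = strstr(line, "issue_cmdlist_exit:")) != NULL)
		is_exit = 1;
	else
		return 0;

	/* Walk back from the event name to the nearest "[cpu]" field. */
	for (p = ev; p > line && *p != '['; p--)
		;
	/* Expected layout: "[058] ..s1 125444.768083:" (assumed from the sample). */
	if (*p != '[' || sscanf(p, "[%d] %*s %llu.%llu:", &cpu, &sec, &usec) != 3)
		return 0;
	if (cpu < 0 || cpu >= MAX_CPUS)
		return 0;

	ts = sec * 1000000ULL + usec;
	if (!is_exit) {
		h->pending[cpu] = ts + 1;
		return 0;
	}
	if (h->pending[cpu] == 0 || ts + 1 < h->pending[cpu])
		return 0; /* exit without a usable matching entry */

	delta = ts - (h->pending[cpu] - 1);
	h->pending[cpu] = 0;
	h->count[log2_bucket(delta)]++;
	return 1;
}
```

A caller would zero a `struct lat_hist`, pass each line read from trace_pipe to `lat_hist_feed()`, and print `count[]` afterwards. Pairing is done per CPU because entry and exit events from different CPUs interleave in the ring buffer.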

If you only want to measure entry and exit of one specific function, 
though, can't the function graph tracer already do that?
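
[Editorial sketch, not part of the original message.] For readers unfamiliar with the function graph tracer, it is driven through the tracefs control files `set_graph_function`, `current_tracer` and `tracing_on`. The sketch below writes those three files from C; `write_ctl()` and `graph_trace_function()` are hypothetical helper names, and the tracing directory and function name are left as parameters because the exact symbol is an assumption (a static function that the compiler inlines will not be traceable, in which case the write to `set_graph_function` is expected to fail).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a short string to <dir>/<file>. Returns 0 on success, -1 on error. */
static int write_ctl(const char *dir, const char *file, const char *val)
{
	char path[512];
	FILE *f;
	int n, ok;

	assert(dir != NULL && file != NULL && val != NULL);

	n = snprintf(path, sizeof(path), "%s/%s", dir, file);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = fputs(val, f) >= 0;
	/* stdio buffers the write, so a rejected value shows up at fclose(). */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Restrict the function graph tracer to one function and switch it on.
 * tracing_dir is the tracefs directory; func is the symbol to trace.
 */
static int graph_trace_function(const char *tracing_dir, const char *func)
{
	if (write_ctl(tracing_dir, "set_graph_function", func) != 0)
		return -1;
	if (write_ctl(tracing_dir, "current_tracer", "function_graph") != 0)
		return -1;
	return write_ctl(tracing_dir, "tracing_on", "1");
}
```

On a system like the one in the quoted output, this would be called as root with `/sys/kernel/debug/tracing` as `tracing_dir`; the per-call durations then appear in the `trace` file of the same directory, with no tracepoint added to the driver.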

Otherwise, pursuing optprobes sounds like a worthwhile thing to do since 
that should benefit everyone, rather than just the 6 people on the 
planet who might care about arm_smmu_issue_cmdlist(). As long as it 
doesn't involve whole new ISA extensions like the RISC-V proposal ;)

Robin.


