[PATCH 0/8] io-pgtable lock removal
Ray Jui
ray.jui at broadcom.com
Wed Jul 5 16:24:22 PDT 2017
Hi Will,
On 7/5/17 1:41 AM, Will Deacon wrote:
> On Tue, Jul 04, 2017 at 06:45:17PM -0700, Ray Jui wrote:
>> Hi Will/Robin,
>>
>> Has anything functionally changed between PATCH v2 and v1? I'm seeing a
>> very different L2 throughput with v2 (in general a lot worse with v2 vs.
>> v1); however, I'm currently unable to reproduce the TLB sync timed out
>> issue with v2 (without the patch from Will's email).
>>
>> It could also be something else that has changed in my setup, but so far
>> I have not yet been able to spot anything wrong in the setup.
>
> There were fixes, and that initially involved a DSB that was found to be
> expensive. The patches queued in -next should have that addressed, so please
> use those (or my for-joerg/arm-smmu/updates branch).
>
> Will
>
That was my bad yesterday. I was in a rush and the setup was incorrect.
I redid my Ethernet performance test with both PATCH v1 and v2 today, and
can confirm the performance is consistent between v1 and v2, as expected.
I also made sure the following message can still be reproduced with
patch set v2:
arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
Then I proceeded to apply your patch that attempts to fix the deadlock
issue. I also added a print to ensure I'm running the correct build with
your fix patch applied:
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index cd8d7aa..01a6fa8 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -60,6 +60,7 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
 	iop->cfg = *cfg;
 	atomic_set(&iop->tlb_sync_pending, 0);
+	pr_err("tlb sync pending cleared\n");
 	return &iop->ops;
 }
root@bcm958742k:~# dmesg | grep tlb
[ 6.495754] tlb sync pending cleared
[ 6.509934] tlb sync pending cleared
[ 6.510067] tlb sync pending cleared
[ 6.510207] tlb sync pending cleared
[ 9.864543] tlb sync pending cleared
[ 9.874019] tlb sync pending cleared
[ 9.979311] tlb sync pending cleared
[ 39.616465] tlb sync pending cleared
However, with the fix patch, I can still see the deadlock message when I
have > 32 iperf TX threads active in the system:
root@bcm958742k:~# iperf -c 192.168.1.20 -P64
------------------------------------------------------------
Client connecting to 192.168.1.20, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 66] local 192.168.1.10 port 48802 connected with 192.168.1.20 port 5001
[ 6] local 192.168.1.10 port 48680 connected with 192.168.1.20 port 5001
[ 22] local 192.168.1.10 port 48710 connected with 192.168.1.20 port 5001
[ 50] local 192.168.1.10 port 48770 connected with 192.168.1.20 port 5001
[ 32] local 192.168.1.10 port 48734 connected with 192.168.1.20 port 5001
[ 23] local 192.168.1.10 port 48716 connected with 192.168.1.20 port 5001
[ 21] local 192.168.1.10 port 48712 connected with 192.168.1.20 port 5001
[ 10] local 192.168.1.10 port 48688 connected with 192.168.1.20 port 5001
[ 56] local 192.168.1.10 port 48782 connected with 192.168.1.20 port 5001
[ 31] local 192.168.1.10 port 48732 connected with 192.168.1.20 port 5001
[ 63] local 192.168.1.10 port 48796 connected with 192.168.1.20 port 5001
[ 58] local 192.168.1.10 port 48786 connected with 192.168.1.20 port 5001
[ 19] local 192.168.1.10 port 48706 connected with 192.168.1.20 port 5001
[ 47] local 192.168.1.10 port 48764 connected with 192.168.1.20 port 5001
[ 25] local 192.168.1.10 port 48720 connected with 192.168.1.20 port 5001
[ 34] local 192.168.1.10 port 48738 connected with 192.168.1.20 port 5001
[ 64] local 192.168.1.10 port 48798 connected with 192.168.1.20 port 5001
[ 52] local 192.168.1.10 port 48774 connected with 192.168.1.20 port 5001
[ 59] local 192.168.1.10 port 48788 connected with 192.168.1.20 port 5001
[ 30] local 192.168.1.10 port 48730 connected with 192.168.1.20 port 5001
[ 65] local 192.168.1.10 port 48800 connected with 192.168.1.20 port 5001
[ 17] local 192.168.1.10 port 48702 connected with 192.168.1.20 port 5001
[ 20] local 192.168.1.10 port 48708 connected with 192.168.1.20 port 5001
[ 44] local 192.168.1.10 port 48758 connected with 192.168.1.20 port 5001
[ 55] local 192.168.1.10 port 48780 connected with 192.168.1.20 port 5001
[ 33] local 192.168.1.10 port 48736 connected with 192.168.1.20 port 5001
[ 62] local 192.168.1.10 port 48794 connected with 192.168.1.20 port 5001
[ 60] local 192.168.1.10 port 48790 connected with 192.168.1.20 port 5001
[ 14] local 192.168.1.10 port 48696 connected with 192.168.1.20 port 5001
[ 28] local 192.168.1.10 port 48726 connected with 192.168.1.20 port 5001
[ 53] local 192.168.1.10 port 48776 connected with 192.168.1.20 port 5001
[ 42] local 192.168.1.10 port 48754 connected with 192.168.1.20 port 5001
[ 16] local 192.168.1.10 port 48700 connected with 192.168.1.20 port 5001
[ 3] local 192.168.1.10 port 48678 connected with 192.168.1.20 port 5001
[ 29] local 192.168.1.10 port 48728 connected with 192.168.1.20 port 5001
[ 27] local 192.168.1.10 port 48724 connected with 192.168.1.20 port 5001
[ 38] local 192.168.1.10 port 48746 connected with 192.168.1.20 port 5001
[ 13] local 192.168.1.10 port 48694 connected with 192.168.1.20 port 5001
[ 12] local 192.168.1.10 port 48692 connected with 192.168.1.20 port 5001
[ 41] local 192.168.1.10 port 48752 connected with 192.168.1.20 port 5001
[ 26] local 192.168.1.10 port 48722 connected with 192.168.1.20 port 5001
[ 11] local 192.168.1.10 port 48690 connected with 192.168.1.20 port 5001
[ 24] local 192.168.1.10 port 48718 connected with 192.168.1.20 port 5001
[ 15] local 192.168.1.10 port 48698 connected with 192.168.1.20 port 5001
[ 37] local 192.168.1.10 port 48744 connected with 192.168.1.20 port 5001
[ 36] local 192.168.1.10 port 48742 connected with 192.168.1.20 port 5001
[ 43] local 192.168.1.10 port 48756 connected with 192.168.1.20 port 5001
[ 48] local 192.168.1.10 port 48766 connected with 192.168.1.20 port 5001
[ 45] local 192.168.1.10 port 48760 connected with 192.168.1.20 port 5001
[ 35] local 192.168.1.10 port 48740 connected with 192.168.1.20 port 5001
[ 7] local 192.168.1.10 port 48672 connected with 192.168.1.20 port 5001
[ 39] local 192.168.1.10 port 48748 connected with 192.168.1.20 port 5001
[ 40] local 192.168.1.10 port 48750 connected with 192.168.1.20 port 5001
[ 8] local 192.168.1.10 port 48682 connected with 192.168.1.20 port 5001
[ 18] local 192.168.1.10 port 48704 connected with 192.168.1.20 port 5001
[ 4] local 192.168.1.10 port 48674 connected with 192.168.1.20 port 5001
[ 46] local 192.168.1.10 port 48762 connected with 192.168.1.20 port 5001
[ 5] local 192.168.1.10 port 48676 connected with 192.168.1.20 port 5001
[ 49] local 192.168.1.10 port 48768 connected with 192.168.1.20 port 5001
[ 54] local 192.168.1.10 port 48778 connected with 192.168.1.20 port 5001
[ 57] local 192.168.1.10 port 48784 connected with 192.168.1.20 port 5001
[ 51] local 192.168.1.10 port 48772 connected with 192.168.1.20 port 5001
[ 9] local 192.168.1.10 port 48686 connected with 192.168.1.20 port 5001
[ 61] local 192.168.1.10 port 48792 connected with 192.168.1.20 port 5001
[ 698.284709] arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
[ 699.386010] arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
[ 702.064900] arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
[ ID] Interval Transfer Bandwidth
[ 26] 0.0-10.0 sec 544 MBytes 456 Mbits/sec
[ 6] 0.0-10.0 sec 382 MBytes 320 Mbits/sec
[ 22] 0.0-10.1 sec 667 MBytes 556 Mbits/sec
[ 50] 0.0-10.1 sec 245 MBytes 204 Mbits/sec
[ 21] 0.0-10.1 sec 291 MBytes 242 Mbits/sec
[ 56] 0.0-10.1 sec 256 MBytes 213 Mbits/sec
[ 19] 0.0-10.0 sec 17.0 MBytes 14.2 Mbits/sec
[ 47] 0.0-10.0 sec 357 MBytes 299 Mbits/sec
[ 52] 0.0-10.1 sec 121 MBytes 101 Mbits/sec
[ 59] 0.0-10.0 sec 364 MBytes 304 Mbits/sec
[ 30] 0.0-10.0 sec 469 MBytes 391 Mbits/sec
[ 20] 0.0-10.0 sec 435 MBytes 364 Mbits/sec
[ 44] 0.0-10.0 sec 379 MBytes 317 Mbits/sec
[ 33] 0.0-10.0 sec 468 MBytes 392 Mbits/sec
[ 60] 0.0-10.0 sec 178 MBytes 149 Mbits/sec
[ 14] 0.0-10.1 sec 539 MBytes 449 Mbits/sec
[ 28] 0.0-10.1 sec 60.6 MBytes 50.5 Mbits/sec
[ 42] 0.0-10.1 sec 365 MBytes 304 Mbits/sec
[ 3] 0.0-10.1 sec 109 MBytes 90.5 Mbits/sec
[ 29] 0.0-10.1 sec 473 MBytes 395 Mbits/sec
[ 38] 0.0-10.0 sec 254 MBytes 212 Mbits/sec
[ 13] 0.0-10.0 sec 523 MBytes 438 Mbits/sec
[ 12] 0.0-10.1 sec 182 MBytes 152 Mbits/sec
[ 11] 0.0-10.1 sec 130 MBytes 109 Mbits/sec
[ 15] 0.0-10.1 sec 174 MBytes 145 Mbits/sec
[ 43] 0.0-10.1 sec 399 MBytes 333 Mbits/sec
[ 48] 0.0-10.1 sec 543 MBytes 452 Mbits/sec
[ 45] 0.0-10.1 sec 69.1 MBytes 57.6 Mbits/sec
[ 35] 0.0-10.1 sec 54.0 MBytes 45.0 Mbits/sec
[ 4] 0.0-10.0 sec 116 MBytes 97.4 Mbits/sec
[ 46] 0.0-10.1 sec 300 MBytes 250 Mbits/sec
[ 51] 0.0-10.1 sec 49.8 MBytes 41.5 Mbits/sec
[ 61] 0.0-10.1 sec 102 MBytes 85.0 Mbits/sec
[ 23] 0.0-10.1 sec 1.64 GBytes 1.39 Gbits/sec
[ 10] 0.0-10.1 sec 210 MBytes 174 Mbits/sec
[ 31] 0.0-10.1 sec 1.16 GBytes 988 Mbits/sec
[ 63] 0.0-10.1 sec 468 MBytes 389 Mbits/sec
[ 25] 0.0-10.1 sec 457 MBytes 381 Mbits/sec
[ 34] 0.0-10.1 sec 332 MBytes 276 Mbits/sec
[ 64] 0.0-10.1 sec 280 MBytes 233 Mbits/sec
[ 17] 0.0-10.1 sec 425 MBytes 354 Mbits/sec
[ 62] 0.0-10.1 sec 616 MBytes 513 Mbits/sec
[ 53] 0.0-10.1 sec 289 MBytes 241 Mbits/sec
[ 16] 0.0-10.1 sec 661 MBytes 550 Mbits/sec
[ 27] 0.0-10.1 sec 298 MBytes 249 Mbits/sec
[ 41] 0.0-10.1 sec 11.5 MBytes 9.57 Mbits/sec
[ 37] 0.0-10.1 sec 945 MBytes 786 Mbits/sec
[ 36] 0.0-10.1 sec 164 MBytes 136 Mbits/sec
[ 40] 0.0-10.1 sec 782 MBytes 650 Mbits/sec
[ 8] 0.0-10.1 sec 883 MBytes 734 Mbits/sec
[ 18] 0.0-10.1 sec 140 MBytes 117 Mbits/sec
[ 5] 0.0-10.1 sec 366 MBytes 305 Mbits/sec
[ 49] 0.0-10.1 sec 229 MBytes 191 Mbits/sec
[ 54] 0.0-10.1 sec 884 MBytes 736 Mbits/sec
[ 57] 0.0-10.1 sec 56.6 MBytes 47.1 Mbits/sec
[ 9] 0.0-10.1 sec 72.8 MBytes 60.4 Mbits/sec
[ 66] 0.0-10.1 sec 170 MBytes 141 Mbits/sec
[ 32] 0.0-10.1 sec 201 MBytes 167 Mbits/sec
[ 58] 0.0-10.1 sec 381 MBytes 317 Mbits/sec
[ 65] 0.0-10.1 sec 373 MBytes 310 Mbits/sec
[ 55] 0.0-10.1 sec 98.0 MBytes 81.5 Mbits/sec
[ 24] 0.0-10.1 sec 292 MBytes 243 Mbits/sec
[ 7] 0.0-10.1 sec 1.08 GBytes 918 Mbits/sec
[ 39] 0.0-10.1 sec 95.8 MBytes 79.6 Mbits/sec
[SUM] 0.0-10.1 sec 23.2 GBytes 19.7 Gbits/sec
I played with it a bit and can confirm that if I set the affinity of all
interrupts to CPU0, I no longer see this issue. This suggests that there
still seems to be a race somewhere when multiple CPUs are involved?
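For anyone who wants to repeat the CPU0-only experiment, a minimal sketch of
one way to pin the interrupts is below. It assumes the standard
/proc/irq/<N>/smp_affinity interface; the function name pin_irqs and its
optional second argument (an alternate root directory, there only so the
sketch can be tried against a scratch tree) are made up for illustration, and
this is not necessarily the exact procedure used for the test above.

```shell
#!/bin/sh
# Hypothetical helper: write a hex CPU mask into every IRQ's smp_affinity file.
#   $1 = hex CPU bitmask (1 selects CPU0 only)
#   $2 = root directory to scan (assumption: defaults to /proc)
pin_irqs() {
    mask=$1
    root=${2:-/proc}
    for f in "$root"/irq/*/smp_affinity; do
        # Skip the unexpanded glob when no IRQ directories exist.
        [ -e "$f" ] || continue
        # Some IRQs (per-CPU ones, for example) reject the write; ignore those.
        { echo "$mask" > "$f"; } 2>/dev/null || true
    done
}
```

Running "pin_irqs 1" as root would route every IRQ that accepts the write to
CPU0. A running irqbalance daemon, if present, may rewrite the masks later, so
it would need to be stopped first.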
Regards,
Ray