[PATCH 0/3] IOVA allocation improvements for iommu-dma
Sunil Kovvuri
sunil.kovvuri at gmail.com
Thu Mar 16 06:18:57 PDT 2017
On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy <robin.murphy at arm.com> wrote:
> Hi all,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.
Thanks for working on this.
As you said, this does push the lock contention down from the iova rbtree
lock to the pgtable lock, but now, more than the lock, the issue I see is the
CPU being yielded while waiting for tlb_sync. Below are some numbers.
I tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. basically removed
cpu_relax() and udelay() to make it a busy loop.
Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps
If we reduce pgtable contention a bit:
With your patches + busy loop in tlb_sync + iperf threads reduced from 16
to 8: ~9 Gbps
So it looks like, along with the pgtable lock, some optimization can be done
to the tlb_sync code as well.
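
To illustrate the tweak, here is a rough user-space sketch of the two polling
styles. It is not the actual driver change: 'struct fake_smmu',
read_sync_status(), fake_udelay(), SYNC_ACTIVE and the timeout value are all
hypothetical stand-ins, and only the udelay() part of the original loop is
modelled (cpu_relax() has no equivalent here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values, chosen only for this sketch. */
#define SYNC_ACTIVE      0x1u
#define TLB_LOOP_TIMEOUT 1000000u

/* Hypothetical stand-in for the SMMU and its sync status register. */
struct fake_smmu {
	uint32_t polls_until_idle; /* status reads left before sync completes */
	uint64_t delay_us;         /* total simulated delay spent in the wait */
};

/* Simulated status read: reports "active" until the poll budget runs out. */
static uint32_t read_sync_status(struct fake_smmu *hw)
{
	if (hw->polls_until_idle == 0)
		return 0;
	hw->polls_until_idle--;
	return SYNC_ACTIVE;
}

/* Simulated udelay(): only accounts for the time, does not sleep. */
static void fake_udelay(struct fake_smmu *hw, unsigned int us)
{
	hw->delay_us += us;
}

/*
 * Poll until the sync completes or the loop times out.
 * busy_loop == false: delay 1us between polls (models the original style).
 * busy_loop == true:  re-read the status immediately (models the tweak).
 * Returns true on completion, false on timeout.
 */
static bool tlb_sync_wait(struct fake_smmu *hw, bool busy_loop)
{
	unsigned int count = 0;

	assert(hw != NULL);
	while (read_sync_status(hw) & SYNC_ACTIVE) {
		if (++count == TLB_LOOP_TIMEOUT)
			return false;
		if (!busy_loop)
			fake_udelay(hw, 1);
	}
	return true;
}
```

In this model, every poll that still sees the sync as active costs an extra
1us in the delayed variant and nothing in the busy-loop variant, which is the
per-sync overhead the numbers above suggest is worth looking at.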
Thanks,
Sunil.