[PATCH v2 0/5] Convert riscv to use the generic iommu page table
Robin Murphy
robin.murphy at arm.com
Mon Feb 2 06:00:07 PST 2026
On 2026-01-31 12:27 am, Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 11:21:53AM +0000, Robin Murphy wrote:
>
>> The driver does not advertise IOMMU_CAP_DEFERRED_FLUSH, as the existing
>> pagetable code has never implemented the conditional TLB maintenance
>> optimisation that makes it meaningful. Mind you, I don't see any reference
>> to iommu_iotlb_gather_queued() in generic_pt either, so I have no idea how
>> that's trying to do it :/
>
> Well, correct me if I've got it wrong..
>
> DMA-FQ always does flush all, so the goal is to eliminate redundant
> flushes from within the page table logic itself.
>
> DMA-FQ requires two functionalities from the page table:
> 1) use gather->freelist to avoid a HW UAF (iommupt always does this)
Nope, correct DMA API usage would almost never unmap an entire table, so
synchronous non-leaf maintenance in that path still doesn't hurt DMA-FQ
either (e.g. io-pgtable-arm).
If a pagetable implementation wanted to refcount and eagerly free empty
tables upon leaf unmaps, then yes it would need deferred freeing, but
frankly it would be better off just not doing that at all for DMA-FQ
anyway (as IOVA caching would make it likely to need to repopulate the
same level of table soon).
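To illustrate the point, a purely hypothetical leaf-unmap path for a
pagetable that *did* want to eagerly reclaim empty tables - the pt_*
helpers here are all made up, only the gather call is real:

static size_t pt_unmap_leaf(struct pt_table *tbl, unsigned long iova,
                            size_t size, struct iommu_iotlb_gather *gather)
{
        pt_clear_leaf(tbl, iova, size);
        iommu_iotlb_gather_add_range(gather, iova, size);

        if (pt_refcount_dec_and_test(tbl)) {
                /* Unhook the now-empty table from its parent... */
                pt_clear_parent(tbl);
                /*
                 * ...but don't free it inline: the walker may still hold
                 * a cached copy of the parent entry, so the page has to
                 * go on gather->freelist (pt_defer_free() standing in for
                 * whatever does that) and only be freed once the eventual
                 * flush - or the flush queue, for DMA-FQ - has completed.
                 */
                pt_defer_free(gather, tbl);
        }
        return size;
}

That deferral is the cost of eager reclaim, which is why I'd argue it's
not worth bothering with for DMA-FQ in the first place.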
> 2) avoid internal calls to iommu_iotlb_sync()
More like avoid issuing any kind of inline TLB maintenance at all, e.g.
for SMMUv2 the significant work is done up-front in
io_pgtable_tlb_add_page(), so only skipping iommu_iotlb_sync() would
have very little benefit.
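E.g. io-pgtable-arm's leaf-unmap case is roughly the below (going from
memory, so don't quote me on the exact form):

        /*
         * Per-page invalidation is pointless for a queued (DMA-FQ)
         * unmap, since the core will issue a flush-all before the
         * IOVA can be reused anyway.
         */
        if (!iommu_iotlb_gather_queued(gather))
                io_pgtable_tlb_add_page(&data->iop, gather, iova, size);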
> When the gather reaches __iommu_dma_unmap() it discards any iova range
> inside it, queues the freelist, and queues a flush all. So any flush
> implied by the gather is removed by the core code.
Pretty much - the freelist is handed off to the FQ, which will only
release the IOVA range and free any pages after completion of a
flush_iotlb_all() at some point in the future (iommu-dma still knows the
whole IOVA range it unmapped - gather->iova is only for the actual
gathering of TLB ops by the IOMMU implementation itself).
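I.e., paraphrasing the __iommu_dma_unmap() path from memory, so treat
the exact details as a sketch rather than gospel:

        struct iommu_iotlb_gather gather;
        size_t unmapped;

        iommu_iotlb_gather_init(&gather);
        gather.queued = READ_ONCE(cookie->fq_domain);   /* DMA-FQ? */

        unmapped = iommu_unmap_fast(domain, dma_addr, size, &gather);
        WARN_ON(unmapped != size);

        if (!gather.queued)
                iommu_iotlb_sync(domain, &gather);      /* strict: flush now */
        /*
         * iommu_dma_free_iova() either returns the IOVA to the caches
         * right away, or for the queued case pushes the range plus
         * gather.freelist onto the flush queue, to be released only
         * after a later flush_iotlb_all() completes.
         */
        iommu_dma_free_iova(cookie, dma_addr, size, &gather);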
> iommu_iotlb_gather_queued() should be used in the page table to
> suppress any internal flushes.
Yes.
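It's nothing more than a check of the queued flag which the core sets
up-front for an FQ domain - from memory, roughly:

static inline bool iommu_iotlb_gather_queued(struct iommu_iotlb_gather *gather)
{
        return gather && gather->queued;
}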
> iommupt doesn't have any calls because it doesn't have any internal
> flushes. It calls iommu_iotlb_gather_add_range() which blindly updates
> the iova and never flushes.
For PT_FEAT_FLUSH_RANGE, I guess so.
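For reference, iommu_iotlb_gather_add_range() really is just range
accumulation - roughly this, again going from memory:

static inline void
iommu_iotlb_gather_add_range(struct iommu_iotlb_gather *gather,
                             unsigned long iova, size_t size)
{
        unsigned long end = iova + size - 1;

        /* Grow the accumulated range; no flushing, no disjointness checks */
        if (gather->start > iova)
                gather->start = iova;
        if (gather->end < end)
                gather->end = end;
}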
> The one call to iommu_iotlb_sync() is only for the para-virtualization
> optimization of narrowing invalidations. It would be nonsensical for a
> driver to enable this optimization and offer IOMMU_CAP_DEFERRED_FLUSH.
Not necessarily - in the PV case it can be desirable to minimise
over-invalidation *if* you're trapping for targeted invalidations in
strict mode. However, depending on the usage pattern it may also be
beneficial to have non-strict mode let the FQ mechanism batch up work to
minimise the number of traps taken - s390, for instance, is in exactly
this situation, which is precisely why we added
IOMMU_DMA_OPTS_SINGLE_QUEUE to help optimise for that.
> But it is a good point that riscv pagetable may not have supported it
> before, but it does now, so we should probably add
> IOMMU_CAP_DEFERRED_FLUSH. I'll send a patch next cycle.
Sounds good.
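For reference, I'd expect that to be little more than something like the
below, wired up as .capable in the driver's iommu_ops - a rough sketch,
since the names and the exact shape of the riscv driver's ops are
assumptions on my part:

static bool riscv_iommu_capable(struct device *dev, enum iommu_cap cap)
{
        switch (cap) {
        case IOMMU_CAP_DEFERRED_FLUSH:
                /*
                 * OK now that unmap defers table freeing via the gather
                 * freelist and skips inline invalidation for queued
                 * (DMA-FQ) unmaps.
                 */
                return true;
        default:
                return false;
        }
}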
Thanks,
Robin.