[PATCH RFC v2 8/8] dma-iommu: Support DMA sync batch mode for iommu_dma_sync_sg_for_{cpu, device}

Robin Murphy robin.murphy at arm.com
Tue Jan 6 11:42:10 PST 2026


On 2025-12-27 8:59 pm, Barry Song wrote:
> On Sun, Dec 28, 2025 at 9:16 AM Leon Romanovsky <leon at kernel.org> wrote:
>>
>> On Sat, Dec 27, 2025 at 11:52:48AM +1300, Barry Song wrote:
>>> From: Barry Song <baohua at kernel.org>
>>>
>>> Apply batched DMA synchronization to iommu_dma_sync_sg_for_cpu() and
>>> iommu_dma_sync_sg_for_device(). For all buffers in an SG list, only
>>> a single flush operation is needed.
>>>
>>> I do not have the hardware to test this, so the patch is marked as
>>> RFC. I would greatly appreciate any testing feedback.
>>>
>>> Cc: Leon Romanovsky <leon at kernel.org>
>>> Cc: Marek Szyprowski <m.szyprowski at samsung.com>
>>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>>> Cc: Will Deacon <will at kernel.org>
>>> Cc: Ada Couprie Diaz <ada.coupriediaz at arm.com>
>>> Cc: Ard Biesheuvel <ardb at kernel.org>
>>> Cc: Marc Zyngier <maz at kernel.org>
>>> Cc: Anshuman Khandual <anshuman.khandual at arm.com>
>>> Cc: Ryan Roberts <ryan.roberts at arm.com>
>>> Cc: Suren Baghdasaryan <surenb at google.com>
>>> Cc: Robin Murphy <robin.murphy at arm.com>
>>> Cc: Joerg Roedel <joro at 8bytes.org>
>>> Cc: Tangquan Zheng <zhengtangquan at oppo.com>
>>> Signed-off-by: Barry Song <baohua at kernel.org>
>>> ---
>>>   drivers/iommu/dma-iommu.c | 15 +++++++--------
>>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index ffa940bdbbaf..b68dbfcb7846 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -1131,10 +1131,9 @@ void iommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
>>>                        iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
>>>                                                      sg->length, dir);
>>>        } else if (!dev_is_dma_coherent(dev)) {
>>> -             for_each_sg(sgl, sg, nelems, i) {
>>> +             for_each_sg(sgl, sg, nelems, i)
>>>                        arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
>>> -                     arch_sync_dma_flush();
>>> -             }
>>> +             arch_sync_dma_flush();
>>
>> This and previous patches should be squashed into the one which
>> introduced arch_sync_dma_flush().
> 
> Hi Leon,
> 
> The series is structured to first introduce no functional change by
> replacing all arch_sync_dma_for_* calls with arch_sync_dma_for_* plus
> arch_sync_dma_flush(). Subsequent patches then add batching for
> different scenarios as separate changes.
> 
> Another issue is that I was unable to find a board that both runs
> mainline and exercises the IOMMU paths affected by these changes.
> As a result, patches 7 and 8 are marked as RFC, while the other
> patches have been tested on a real board running mainline + changes.

FWIW, if you can get your hands on an M.2 NVMe for the Rock5, then that 
has an SMMU in front of PCIe. It could also work to test non-coherent 
SWIOTLB, with the SMMU in bypass and either some fake restrictive 
dma-ranges in the DT or a hack to reduce the DMA mask in the NVMe driver.

Cheers,
Robin.
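
For readers following the thread, the change in the quoted hunk can be
sketched as a userspace mock. This is not kernel code: struct mock_sg,
the mock_* functions and the counters are hypothetical stand-ins that
only count calls, the direction argument is dropped, and it is an
assumption here that arch_sync_dma_flush() is the point where completion
of the preceding maintenance is waited for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatterlist entry (not struct scatterlist). */
struct mock_sg {
	unsigned long phys;
	size_t length;
};

static unsigned int maint_ops;	/* per-buffer maintenance requests issued */
static unsigned int flush_ops;	/* flush calls issued */

static void reset_counters(void)
{
	maint_ops = 0;
	flush_ops = 0;
}

/* Mock of arch_sync_dma_for_cpu(): only counts the request. */
static void mock_arch_sync_dma_for_cpu(unsigned long phys, size_t len)
{
	(void)phys;
	(void)len;
	maint_ops++;
}

/* Mock of arch_sync_dma_flush(): only counts the call. */
static void mock_arch_sync_dma_flush(void)
{
	flush_ops++;
}

/* Shape of the loop before the patch: one flush per SG entry. */
static void sync_sg_unbatched(const struct mock_sg *sgl, int nelems)
{
	assert(sgl != NULL || nelems == 0);
	for (int i = 0; i < nelems; i++) {
		mock_arch_sync_dma_for_cpu(sgl[i].phys, sgl[i].length);
		mock_arch_sync_dma_flush();
	}
}

/* Shape of the loop after the patch: one flush for the whole list. */
static void sync_sg_batched(const struct mock_sg *sgl, int nelems)
{
	assert(sgl != NULL || nelems == 0);
	for (int i = 0; i < nelems; i++)
		mock_arch_sync_dma_for_cpu(sgl[i].phys, sgl[i].length);
	mock_arch_sync_dma_flush();
}
```

With an N-entry list, both variants issue N maintenance requests, but
the batched form issues a single flush instead of N.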


