[PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
Simon Xue
xxm at rock-chips.com
Wed Apr 1 03:22:55 PDT 2026
Hi Jonas,
On 2026/4/1 16:41, Jonas Karlman wrote:
> Hi Simon,
>
> On 4/1/2026 9:48 AM, Simon wrote:
>> Hi Midgy,
>>
>> On 2026/3/31 15:50, Midgy BALON wrote:
>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
>>> available memory for IOMMU v2") removed GFP_DMA32 from
>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
>>> supports up to 40-bit physical addresses for page tables. However, the
>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
>>> physical addresses above 4 GB regardless of the address encoding range.
>>>
>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
>>> GFP_DMA32 causes two distinct failure modes:
>>>
>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>> memory above 0x100000000. The hardware page-table walker issues a
>>> bus error trying to dereference those addresses, causing an IOMMU
>>> fault on the first DMA transaction.
>> Which IP block is hitting this? We'd like to take a look on our end.
> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
> MMU hw revision or has some hw errata?
>
> From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
> able to handle 40-bit addressable DTE/PTE, hence the original commit
> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
> memory for IOMMU v2").
>
> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
> limitation or errata and may need to use a different compatible.
Yes, we are checking internally whether different IOMMU versions are
integrated.
I will share what we find once we have results.
> [1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
>
> Regards,
> Jonas
>
>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>>> above the SWIOTLB window. dma_map_single() with DMA_BIT_MASK(32)
>>> then bounces them into a buffer below 4 GB. rk_dte_get_page_table()
>>> returns phys_to_virt() of the bounce buffer address; PTEs are written
>>> there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>>> original (zero) data back over the bounce buffer, silently erasing the
>>> freshly written PTEs. The IOMMU faults because every PTE reads as zero.
>> This probably needs a separate patch. One way to fix it would be to track the
>> original L2 page table base addresses in struct rk_iommu_domain,
>> then have rk_dte_get_page_table() return the tracked address instead of
>> deriving it from the DTE.
>>
>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
>>> currently only serves "rockchip,rk3568-iommu" in mainline.
>>>
>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>>> - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>>> - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>>> - No IOMMU faults, correct inference results
>>>
>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
>>> Cc: stable at vger.kernel.org
>>> Cc: Jonas Karlman <jonas at kwiboo.se>
>>> Signed-off-by: Midgy BALON <midgy971 at gmail.com>
>>> ---
>>> drivers/iommu/rockchip-iommu.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
>>> index 85f3667e797..8b45db29471 100644
>>> --- a/drivers/iommu/rockchip-iommu.c
>>> +++ b/drivers/iommu/rockchip-iommu.c
>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>>> .pt_address = &rk_dte_pt_address_v2,
>>> .mk_dtentries = &rk_mk_dte_v2,
>>> .mk_ptentries = &rk_mk_pte_v2,
>>> - .dma_bit_mask = DMA_BIT_MASK(40),
>>> - .gfp_flags = 0,
>>> + .dma_bit_mask = DMA_BIT_MASK(32),
>>> + .gfp_flags = GFP_DMA32,
>>> };
>>>
>>> static const struct of_device_id rk_iommu_dt_ids[] = {
>