[PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
Simon Xue
xxm at rock-chips.com
Thu Apr 2 21:40:27 PDT 2026
On 2026/4/1 18:22, Simon Xue wrote:
> Hi Jonas,
>
> On 2026/4/1 16:41, Jonas Karlman wrote:
>> Hi Simon,
>>
>> On 4/1/2026 9:48 AM, Simon wrote:
>>> Hi Midgy,
>>>
>>> On 2026/3/31 15:50, Midgy BALON wrote:
>>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
>>>> available memory for IOMMU v2") removed GFP_DMA32 from
>>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
>>>> supports up to 40-bit physical addresses for page tables. However, the
>>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
>>>> physical addresses above 4 GB regardless of the address encoding
>>>> range.
>>>>
>>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
>>>> GFP_DMA32 causes two distinct failure modes:
>>>>
>>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>>> memory above 0x100000000. The hardware page-table walker issues a
>>>> bus error trying to dereference those addresses, causing an IOMMU
>>>> fault on the first DMA transaction.
>>> Which IP block is hitting this? We'd like to take a look on our end.
>> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
>> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
>> MMU hw revision or has some hw errata?
>>
>> From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
>> able to handle 40-bit addressable DTE/PTE, hence the original commit
>> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
>> memory for IOMMU v2").
>>
>> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
>> limitation or errata and may need to use a different compatible.
>
> Yes, we are checking internally whether different IOMMU versions
> are integrated.
>
> I will share what we find once we have results.
>
We have confirmed internally that the RK356x SoCs integrate two different
IOMMU versions (v1.0 and v2.0); the NPU and ISP, for example, use the v1.0
IOMMU. Both versions can map 40-bit physical pages, but v1.0 does not
support placing the first-level page table above 4 GB.

To fix this, I think we need to land this patch first:
https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
Then, on top of that, we can add a new compatible string to distinguish
the IOMMU versions.
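
As a rough illustration of the idea only, the per-version limit could be
keyed off the compatible string along the lines below. This is a
user-space sketch, not driver code: struct dt_limits, dt_limits_lookup(),
dt_addr_ok() and the "example,rk3568-iommu-v1" string are all hypothetical
names, and which version keeps the existing compatible is an assumption
that still has to be decided.

```c
/* Illustrative user-space sketch; NOT the rockchip-iommu driver code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-version limit on where the first-level table may be placed. */
struct dt_limits {
	const char *compatible;
	unsigned int dt_addr_bits;	/* address bits usable for the table */
};

static const struct dt_limits dt_limits_table[] = {
	/* Assumption: the existing compatible keeps the 40-bit behaviour. */
	{ "rockchip,rk3568-iommu", 40 },
	/* HYPOTHETICAL compatible for the v1.0 block (e.g. NPU, ISP). */
	{ "example,rk3568-iommu-v1", 32 },
};

/* Return the limits for a compatible string, or NULL if unknown. */
static const struct dt_limits *dt_limits_lookup(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(dt_limits_table) / sizeof(dt_limits_table[0]); i++) {
		if (strcmp(dt_limits_table[i].compatible, compatible) == 0)
			return &dt_limits_table[i];
	}
	return NULL;
}

/* True if a first-level table at @phys is reachable for this version. */
static bool dt_addr_ok(const struct dt_limits *lim, uint64_t phys)
{
	return (phys >> lim->dt_addr_bits) == 0;
}
```

With such a table, an address at or above 0x100000000 would be rejected
for the v1.0 entry and accepted for the 40-bit entry, which is the
distinction a new compatible would let the driver make.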
>> [1]
>> https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
>>
>> Regards,
>> Jonas
>>
>>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>>>> above the SWIOTLB window. dma_map_single() with DMA_BIT_MASK(32)
>>>> then bounces them into a buffer below 4 GB. rk_dte_get_page_table()
>>>> returns phys_to_virt() of the bounce buffer address; PTEs are written
>>>> there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>>>> original (zero) data back over the bounce buffer, silently erasing the
>>>> freshly written PTEs. The IOMMU faults because every PTE reads as zero.
>>> This probably needs a separate patch. One way to fix it would be to
>>> track the original L2 page table base addresses in struct
>>> rk_iommu_domain, then have rk_dte_get_page_table() return the tracked
>>> address instead of deriving it from the DTE.
>>>
>>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
>>>> currently only serves "rockchip,rk3568-iommu" in mainline.
>>>>
>>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>>>> - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>>>> - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>>>> - No IOMMU faults, correct inference results
>>>>
>>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
>>>> Cc: stable at vger.kernel.org
>>>> Cc: Jonas Karlman <jonas at kwiboo.se>
>>>> Signed-off-by: Midgy BALON <midgy971 at gmail.com>
>>>> ---
>>>> drivers/iommu/rockchip-iommu.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
>>>> index 85f3667e797..8b45db29471 100644
>>>> --- a/drivers/iommu/rockchip-iommu.c
>>>> +++ b/drivers/iommu/rockchip-iommu.c
>>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>>>> .pt_address = &rk_dte_pt_address_v2,
>>>> .mk_dtentries = &rk_mk_dte_v2,
>>>> .mk_ptentries = &rk_mk_pte_v2,
>>>> - .dma_bit_mask = DMA_BIT_MASK(40),
>>>> - .gfp_flags = 0,
>>>> + .dma_bit_mask = DMA_BIT_MASK(32),
>>>> + .gfp_flags = GFP_DMA32,
>>>> };
>>>> static const struct of_device_id rk_iommu_dt_ids[] = {
>>