[PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU

Midgy Balon midgy971 at gmail.com
Fri Apr 3 07:02:02 PDT 2026


 From: Midgy BALON <midgy971 at gmail.com>
 To: Simon Xue <xxm at rock-chips.com>
 Cc: Jonas Karlman <jonas at kwiboo.se>, iommu at lists.linux.dev,
     joro at 8bytes.org, will at kernel.org, robin.murphy at arm.com,
     heiko at sntech.de, linux-arm-kernel at lists.infradead.org,
     linux-rockchip at lists.infradead.org, linux-kernel at vger.kernel.org,
     stable at vger.kernel.org
 In-Reply-To: <5663593b-2c53-4632-ad2c-db9efa8e9ab2 at rock-chips.com>
 References: <20260331075010.1463-1-midgy971 at gmail.com>
             <0f285782-b12a-4abd-bca7-b6c549bed59f at rock-chips.com>
             <e622cc9e-8fb0-454a-b88e-dc13cf2ff507 at kwiboo.se>
             <89ed223d-1a2c-447d-9f21-76969e668855 at rock-chips.com>
             <5663593b-2c53-4632-ad2c-db9efa8e9ab2 at rock-chips.com>
 Subject: Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU

 On 4/3/2026, Simon Xue wrote:
 > We internally checked that the RK356x SoCs integrate two different
 > IOMMU versions (v1.0 and v2.0); for example, the NPU and ISP use the
 > v1.0 IOMMU.
 >
 > Both versions can map 40-bit physical pages, but v1.0 does not support
 > placing the first-level page table above 4 GB.
 >
 > To fix this, I think we need to land this patch first:
 > https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
 >
 > Then on top of that, we can add a new compatible string to distinguish
 > the IOMMU versions.

 Thank you Simon and Jonas for the internal investigation. This explains
 exactly what I observed.

 To answer Simon's earlier question: the IP block hitting both failure
 modes is the NPU IOMMU (rknpu_mmu, at 0xfde4b000), currently bound
 to "rockchip,rk3568-iommu" in rk356x-base.dtsi. Both the downstream
 rknpu driver and the upstream Rocket accel driver (drivers/accel/rocket/)
 use this IOMMU.

 The v1.0 first-level page table constraint explains both failure modes.
 On boards with more than 4 GB of RAM the DTE table can be allocated
 above 0x100000000, and the v1.0 hardware silently truncates or errors
 on that address. The SWIOTLB bounce-buffer path is a consequence of
 the same root cause. With DMA_BIT_MASK(32) on the NPU device, a page
 table above 4 GB is bounced into a buffer below 4 GB. phys_to_virt()
 of the bounce address is then used as the PTE write target, and the
 subsequent dma_sync_single_for_device(DMA_TO_DEVICE) overwrites those
 PTEs with zeros from the original buffer.
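
 To make the poisoning sequence concrete, here is a minimal userspace
 model of it. This is only a sketch, not kernel code: struct pt_model,
 model_sync_for_device() and model_surviving_ptes() are hypothetical
 names, and the memcpy() stands in for what a DMA_TO_DEVICE sync does
 when the mapping goes through a bounce buffer.

```c
#include <stdint.h>
#include <string.h>

#define NUM_PTES 4  /* tiny table, for illustration only */

/*
 * Hypothetical model: "orig" is the page table as allocated (above
 * 4 GB in the real failure), "bounce" is the SWIOTLB copy below 4 GB.
 */
struct pt_model {
	uint32_t orig[NUM_PTES];
	uint32_t bounce[NUM_PTES];
};

/* Models a DMA_TO_DEVICE sync: the original is copied over the bounce copy. */
static void model_sync_for_device(struct pt_model *m)
{
	memcpy(m->bounce, m->orig, sizeof(m->bounce));
}

/*
 * Returns how many PTEs are still non-zero after the sync when they
 * were written to the bounce copy instead of the original table.
 */
int model_surviving_ptes(void)
{
	struct pt_model m;
	int i, alive = 0;

	memset(&m, 0, sizeof(m));	/* freshly zeroed page table */

	for (i = 0; i < NUM_PTES; i++)	/* PTEs land in the bounce copy */
		m.bounce[i] = (0x1000u * (uint32_t)(i + 1)) | 1u;

	model_sync_for_device(&m);	/* zeros from orig overwrite them */

	for (i = 0; i < NUM_PTES; i++)
		alive += (m.bounce[i] != 0);

	return alive;
}
```

 In this model no PTE survives the sync, which matches the symptom of
 every PTE reading as zero.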

 Please consider my original patch withdrawn. Modifying iommu_data_ops_v2
 was too broad and would have incorrectly constrained VOP2 MMU and all
 other v2 IOMMU users.

 I agree fully with the two-step approach. On top of your per-device-ops
 patch [1], I plan to send:

   [1/2] iommu/rockchip: Add "rockchip,rk3568-iommu-v1" compatible
         for IOMMU v1.0 blocks (NPU, ISP/VICAP) on RK3568
         — ops with .gfp_flags = GFP_DMA32,
                .dma_bit_mask = DMA_BIT_MASK(40)
         (v1.0 can still map 40-bit physical pages; only the DTE
         table base must be below 4 GB)
   [2/2] arm64: dts: rockchip: rk356x: Use "rockchip,rk3568-iommu-v1"
         for rknpu_mmu (0xfde4b000) and vicap_mmu (0xfdfe0800)
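
 For illustration, the pairing of the new compatible with its own ops
 could look roughly like the userspace sketch below. The names
 rk_iommu_ops_sketch, compat_entry and sketch_match_compatible() are
 placeholders, and the GFP_DMA32 value is a stand-in so that it compiles
 outside the kernel. The real change would add an entry to
 rk_iommu_dt_ids[] on top of [1]. The v2 values mirror the current
 iommu_data_ops_v2.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins so the sketch builds outside the kernel tree. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
#define GFP_DMA32 0x04u	/* placeholder value, not the kernel's */

/* Hypothetical, reduced version of the per-variant ops. */
struct rk_iommu_ops_sketch {
	uint64_t dma_bit_mask;
	unsigned int gfp_flags;
};

/* v2.0 blocks: tables may come from anywhere in the 40-bit range. */
static const struct rk_iommu_ops_sketch ops_v2 = {
	.dma_bit_mask = DMA_BIT_MASK(40),
	.gfp_flags = 0,
};

/* v1.0 blocks on RK3568: 40-bit mappings, tables kept below 4 GB. */
static const struct rk_iommu_ops_sketch ops_rk3568_v1 = {
	.dma_bit_mask = DMA_BIT_MASK(40),
	.gfp_flags = GFP_DMA32,
};

struct compat_entry {
	const char *compatible;
	const struct rk_iommu_ops_sketch *ops;
};

static const struct compat_entry compat_table[] = {
	{ "rockchip,rk3568-iommu",    &ops_v2 },
	{ "rockchip,rk3568-iommu-v1", &ops_rk3568_v1 },
};

/* Returns the ops for a compatible string, or NULL if it is unknown. */
const struct rk_iommu_ops_sketch *sketch_match_compatible(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++)
		if (strcmp(compat_table[i].compatible, compatible) == 0)
			return compat_table[i].ops;

	return NULL;
}
```

 With this split, only nodes switched to the v1 compatible get
 GFP_DMA32, so VOP2 MMU and the other v2 users keep their current
 behaviour.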

 One note on the SWIOTLB issue: with GFP_DMA32 in the new ops, page
 table allocations never reach SWIOTLB, so the "track L2 base addresses"
 approach you suggested should not be necessary — GFP_DMA32 prevents the
 bounce-buffer poisoning at the source. Happy to be corrected if there
 is another path where it is still needed.

 I am happy to add Tested-by to your per-device-ops patch [1].

 [1] https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/

 Regards,
 Midgy BALON

On Fri, Apr 3, 2026 at 06:40, Simon Xue <xxm at rock-chips.com> wrote:
>
>
> On 2026/4/1 18:22, Simon Xue wrote:
> > Hi Jonas,
> >
> > On 2026/4/1 16:41, Jonas Karlman wrote:
> >> Hi Simon,
> >>
> >> On 4/1/2026 9:48 AM, Simon wrote:
> >>> Hi Midgy,
> >>>
> >>> On 2026/3/31 15:50, Midgy BALON wrote:
> >>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> >>>> available memory for IOMMU v2") removed GFP_DMA32 from
> >>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> >>>> supports up to 40-bit physical addresses for page tables. However, the
> >>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> >>>> physical addresses above 4 GB regardless of the address encoding
> >>>> range.
> >>>>
> >>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> >>>> GFP_DMA32 causes two distinct failure modes:
> >>>>
> >>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
> >>>>      memory above 0x100000000.  The hardware page-table walker issues
> >>>>      a bus error trying to dereference those addresses, causing an
> >>>>      IOMMU fault on the first DMA transaction.
> >>> Which IP block is hitting this? We'd like to take a look on our end.
> >> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
> >> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
> >> MMU hw revision or has some hw errata?
> >>
> >> From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
> >> able to handle 40-bit addressable DTE/PTE, hence the original commit
> >> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
> >> memory for IOMMU v2").
> >>
> >> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
> >> limitation or errata and may need to use a different compatible.
> >
> > Yes, we are checking internally whether different IOMMU versions
> > are integrated.
> >
> > I will share what we find once we have results.
> >
> We internally checked that the RK356x SoCs integrate two different IOMMU
> versions (v1.0 and v2.0); for example, the NPU and ISP use the v1.0 IOMMU.
>
> Both versions can map 40-bit physical pages, but v1.0 does not support
> placing the first-level page table above 4 GB.
>
> To fix this, I think we need to land this patch first:
> https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
>
> Then on top of that, we can add a new compatible string to distinguish
> the IOMMU versions.
>
> >> [1]
> >> https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
> >>
> >> Regards,
> >> Jonas
> >>
> >>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables
> >>>>      land above the SWIOTLB window.  dma_map_single() with
> >>>>      DMA_BIT_MASK(32) then bounces them into a buffer below 4 GB.
> >>>>      rk_dte_get_page_table() returns phys_to_virt() of the bounce
> >>>>      buffer address; PTEs are written there; the next
> >>>>      dma_sync_single_for_device(DMA_TO_DEVICE) copies the original
> >>>>      (zero) data back over the bounce buffer, silently erasing the
> >>>>      freshly written PTEs.  The IOMMU faults because every PTE reads
> >>>>      as zero.
> >>> This probably needs a separate patch. One way to fix it would be to
> >>> track the original L2 page table base addresses in struct
> >>> rk_iommu_domain, then have rk_dte_get_page_table() return the
> >>> tracked address instead of deriving it from the DTE.
> >>>
> >>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> >>>> currently only serves "rockchip,rk3568-iommu" in mainline.
> >>>>
> >>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
> >>>>     - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
> >>>>     - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
> >>>>     - No IOMMU faults, correct inference results
> >>>>
> >>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> >>>> Cc: stable at vger.kernel.org
> >>>> Cc: Jonas Karlman <jonas at kwiboo.se>
> >>>> Signed-off-by: Midgy BALON <midgy971 at gmail.com>
> >>>> ---
> >>>>    drivers/iommu/rockchip-iommu.c | 4 ++--
> >>>>    1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> >>>> index 85f3667e797..8b45db29471 100644
> >>>> --- a/drivers/iommu/rockchip-iommu.c
> >>>> +++ b/drivers/iommu/rockchip-iommu.c
> >>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
> >>>>        .pt_address = &rk_dte_pt_address_v2,
> >>>>        .mk_dtentries = &rk_mk_dte_v2,
> >>>>        .mk_ptentries = &rk_mk_pte_v2,
> >>>> -    .dma_bit_mask = DMA_BIT_MASK(40),
> >>>> -    .gfp_flags = 0,
> >>>> +    .dma_bit_mask = DMA_BIT_MASK(32),
> >>>> +    .gfp_flags = GFP_DMA32,
> >>>>    };
> >>>>
> >>>>    static const struct of_device_id rk_iommu_dt_ids[] = {
> >>


