[PATCH v8 07/15] iommupt: Add map_pages op
Jason Gunthorpe
jgg at nvidia.com
Wed Jan 28 05:32:58 PST 2026
On Wed, Jan 28, 2026 at 12:42:08PM +1100, Alexey Kardashevskiy wrote:
> > > Nah, it is quite easy to force 2MB on swiotlb (just do it once and
> > > forget) but currently any guest page can be converted to shared and
> > > DMA-mapped and this skips swiotlb.
> >
> > Upstream Linux doesn't support that, only SWIOTLB or special DMA
> > coherent memory can be DMA mapped in CC systems. You can't take a
> > random page, make it shared and then DMA map it.
>
> Well, my test device driver calls dma_alloc_coherent() which does that - alloc + convert 4K.
Yes, and there is no reason that can't come from the same allocator as
SWIOTLB and use 2M aligned blocks.
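
As a rough illustration of that idea, here is a toy pool that converts memory to shared only in 2M-aligned blocks and hands out 4K pages from them. This is a sketch, not the kernel's SWIOTLB or DMA API: `SharedPool`, `alloc_page`, `free_page` and the `converted` list are hypothetical names, and the "conversion" is only recorded, not performed.

```python
SZ_4K = 4 * 1024
SZ_2M = 2 * 1024 * 1024


class SharedPool:
    """Toy allocator: shares memory in 2M blocks, hands out 4K pages."""

    def __init__(self, base):
        # Assumption: the pool starts on a 2M boundary.
        assert base % SZ_2M == 0
        self.next_block = base
        self.free = []        # free 4K page addresses
        self.converted = []   # 2M blocks recorded as converted to shared

    def _grow(self):
        # Convert one whole 2M block at a time, never a lone 4K page.
        block = self.next_block
        self.next_block += SZ_2M
        self.converted.append(block)
        self.free.extend(range(block, block + SZ_2M, SZ_4K))

    def alloc_page(self):
        if not self.free:
            self._grow()
        return self.free.pop()

    def free_page(self, addr):
        # The page returns to the pool; the 2M block stays shared.
        self.free.append(addr)
```

With this shape a driver-visible 4K allocation never forces a 4K-granular conversion, since every shared region is a full 2M-aligned block.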
> > What happens if you don't have a VIOMMU, have a single translation
> > stage and only use the S1 (AMDv2) page table in the hypervisor? Then
> > does the HW fix it? Or does it only fix it with two stages enabled?
>
> The HW translates a DMA handle to a host pfn, and then RMP checks if
> that [pfn..pfn+size] is assigned to the correct ASID and the page
> size matches and the gfn matches.
>
> RMP does not check S1 translations inside the guest, only S2. RMP is
> not fixing page sizes or anything, it says yes/no to the access.
Your explanation doesn't make a lot of sense.

If we have a vIOMMU and the guest has a 4K IOPTE in S1 then it goes

  S1[4k] -> S2[2M] -- [4k] --> RMP[2M] ==> OK 4k IOTLB entry

While if we have no vIOMMU, the same effective scenario:

  S2[4k] ------- [4k] -------> RMP[2M] ==> FAIL

It makes no sense at all. Why build something like that?

It is not a "firewall", it is a huge software obstacle.
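
The two cases can be put into a toy model of the check as it was described in this thread. This is only a sketch under that description: it assumes the RMP compares its page size against the S2 IOPTE size alone and never looks at S1, and it leaves out the ASID and gfn checks. `Walk`, `rmp_allows` and `iotlb_entry_size` are hypothetical names, not hardware or kernel interfaces.

```python
from dataclasses import dataclass
from typing import Optional

SZ_4K = 4 * 1024
SZ_2M = 2 * 1024 * 1024


@dataclass(frozen=True)
class Walk:
    s1_size: Optional[int]  # guest S1 IOPTE size, None when there is no vIOMMU
    s2_size: int            # hypervisor S2 IOPTE size


def rmp_allows(walk, rmp_size):
    # Assumption from the thread: only the S2 page size is compared.
    return walk.s2_size == rmp_size


def iotlb_entry_size(walk, rmp_size):
    """Return the resulting IOTLB entry size, or None if the RMP refuses."""
    if not rmp_allows(walk, rmp_size):
        return None
    sizes = [walk.s2_size]
    if walk.s1_size is not None:
        sizes.append(walk.s1_size)
    # The combined translation is only as large as its smallest stage.
    return min(sizes)
```

Under these assumptions `iotlb_entry_size(Walk(SZ_4K, SZ_2M), SZ_2M)` yields a 4K entry, while `iotlb_entry_size(Walk(None, SZ_4K), SZ_2M)` is refused, even though both end up as the same 4K translation into a 2M RMP region.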
Maybe your answer is that the entity building the RMP also has to
build a matching S2 IOTLB as one unit, and we somehow just plumb the
page table pointer and invalidations into the IOMMU driver.

Such a messy design.
> > > > iommufd won't deal with memory maps for IO, the secure world will
> > > > handle that through KVM.
> > >
> > > Is QEMU going to skip on IOMMU mapping entirely? So when the device
> > > is transitioned from untrusted (when everything mapped via VFIO or
> > > IOMMU) to trusted - QEMU will unmap everything and then the guest
> > > will map everything but this time via KVM and bypassing QEMU
> > > entirely? Thanks,
> >
> > On ARM there are different S2s for the IOMMU, one for T=1 and one for
> > T=0 traffic. The T=1 is fully controlled by the secure world and is equal
> > to the CPU S2. The T=0 one is fully controlled by qemu and acts like a
> > normal system. The T=0 can only access guest shared memory.
>
> Does the T=0 table still have all the guest memory mapped (with the
> expectation that what is not allowed - won't be accessed using that
> table)? Thanks,
I'm not sure what the plan is. I think ARM can do it both ways: map all
guest physical memory and rely on the GPT to prevent access, or
dynamically map only shared pages.
Jason