[PATCH v2 0/3] arm-smmu: select suitable IOVA
Shyam Saini
shyamsaini at linux.microsoft.com
Thu May 29 11:22:19 PDT 2025
On Tue, May 27, 2025 at 09:04:25PM -0300, Jason Gunthorpe wrote:
> On Tue, May 27, 2025 at 01:54:28PM -0700, Shyam Saini wrote:
> > > The above is the only place that creates an IOMMU_RESV_SW_MSI so it is
> > > definitely called and used, right? If not where does your
> > > IOMMU_RESV_SW_MSI come from?
> >
> > Code tracing and printks in that code path suggest that
> > iommu_dma_get_resv_regions() is called by the vfio-pci driver.
>
> Yes, I know it is, that is how it sets up the SW_MSI.
>
> > > As above, I've asked a few times now if your resv_regions() is
> > > correct, meaning there is a reserved range covering the address space
> > > that doesn't have working translation. That means
> > > iommu_get_resv_regions() returns such a range.
> >
> > Sorry about missing that, I see the MSI IOVA being reserved:
> >
> > cat /sys/kernel/iommu_groups/*/reserved_regions
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > [output trimmed]
>
> But this does not seem correct, you should have a "reserved" region
> covering 0x8000000 as well because you say your platform cannot do DMA
> to 0x8000000 and this is why you are doing all this.
>
> All IOVA that the platform cannot DMA from should be reported in the
> reserved_regions file as "reserved". You must make your platform
> achieve this.
So should it be reported for all the IOMMU groups?

	no_dma_mem {
		reg = <0x0 0x8000000 0x0 0x100000>;
		no-map;
	};

I think that's how we reserve memory in general (e.g. ramoops), but this
doesn't show up in:

	/sys/kernel/iommu_groups/*/reserved_regions

> > Yes, I tried that.
> >
> > This is what my dts node looked like:
> >
> > 	reserved-memory {
> > 		faulty_iova: resv_faulty {
> > 			iommu-addresses = <&pcieX 0x8000000 0x100000>;
> > 		};
> > 		..
> > 		..
> > 	};
> >
> > 	&pcieX {
> > 		memory-region = <&faulty_iova>;
> > 	};
> >
> > I see it working for the devices which are calling
> > iommu_get_resv_regions(), e.g. if I specify faulty_iova for the DMA
> > controller dts node then I see an additional entry in the related
> > group
>
> Exactly, it has to flow from the DT into the reserved_regions, that is
> essential.
> So what is the problem if you have figured out how to fix up
> /sys/kernel/iommu_groups/Y/reserved_regions?
Sorry, I haven't yet.
> If you found some cases where you can't get /sys/../reserved_regions
> to report the right things from the DT then that needs to be addressed
> first before you think about fixing SW_MSI.
>
> I very vaguely recall we have some gaps on OF where the DMA-API code
> is understanding parts of the DT that don't get mapped into
> reserved_regions and nobody has cared to fix it because it only
> affects VFIO. You may have landed in the seat that has to fix it :)
I think this is the case we are dealing with?
> But I still don't have a clear sense of what your actual problem is as
> you are showing DT that seems reasonable and saying that
> /sys/../reserved_regions is working..
/sys/../reserved_regions is working for certain devices like the DMA
controller, but it doesn't work for PCIe devices. There it is the vfio-pci
driver that calls iommu_get_resv_regions(), and we don't have a dts node for
vfio.

I have confirmed this for PCIe on two different platforms. It seems to be the
OF DMA-API gap that you hinted at above, and I am happy to work on that :)
It would be great if you could share any reference discussions on that
problem.

When I specify this for the DMA controller:

	faulty_iova: resv_faulty {
		iommu-addresses = <&dmaX 0x8000000 0x100000>;
	};

	&dmaX {
		memory-region = <&faulty_iova>;
	};

I see the following:

	$ cat /sys/kernel/iommu_groups/y/reserved_regions
	0x0000000008000000 0x00000000080fffff reserved
	0x00000000a0000000 0x00000000a00fffff msi

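
To make the "is the bad range actually reported as reserved" check repeatable across groups, something like the following could be used. It is only a rough sketch, not part of the series: the helper names parse_reserved_regions() and covered_by() are made up for illustration, and the sample text is simply the output shown above rather than a live read of sysfs.

```python
def parse_reserved_regions(text):
    """Parse 'start end type' lines as printed by reserved_regions."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or trimmed lines
        try:
            start, end = int(fields[0], 16), int(fields[1], 16)
        except ValueError:
            continue  # skip anything that is not a pair of hex addresses
        regions.append((start, end, fields[2]))
    return regions


def covered_by(regions, base, size, kind="reserved"):
    """True if [base, base + size) lies inside one region of the given type."""
    last = base + size - 1
    return any(t == kind and s <= base and last <= e for s, e, t in regions)


# Sample taken from the group output above (hypothetical stand-in for sysfs).
SAMPLE = """\
0x0000000008000000 0x00000000080fffff reserved
0x00000000a0000000 0x00000000a00fffff msi
"""

regions = parse_reserved_regions(SAMPLE)
print(covered_by(regions, 0x8000000, 0x100000))               # faulty range
print(covered_by(regions, 0xa0000000, 0x100000, kind="msi"))  # MSI window
```

Feeding it the contents of each /sys/kernel/iommu_groups/*/reserved_regions file would show which groups are still missing the "reserved" entry.
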
Clarifying the issue with MSI and SMMU faults on our platform:

We are encountering SMMU faults when using our userspace tool/driver that
relies on MSI. Specifically, the issue arises when MSI_IOVA_BASE is set
to the current default value of 0x08000000.

The observed fault is as follows:

	arm-smmu 64000000.iommu: Unhandled context fault: fsr=0x402, iova=0x00000040,
	fsynr=0x2f0013, cbfrsynra=0x102, cb=15

Upon investigation, our hardware team confirmed that the memory region
containing 0x08000000 is already mapped for other peripherals, making it
unavailable for MSI usage.

For example, using 0xa0000000 as MSI_IOVA_BASE solves our problem.

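
The conflict can be expressed as a simple range overlap test. This is just an illustrative sketch: overlaps() is a hypothetical helper, the window length of 0x100000 is taken from the 1 MiB msi entries in the reserved_regions output, and the extent of the unusable range is assumed to be the one used in the DT example (the exact extent on our hardware may differ).

```python
MSI_IOVA_LENGTH = 0x100000  # 1 MiB, matching the msi entries shown earlier


def overlaps(a_base, a_len, b_base, b_len):
    """True if [a_base, a_base + a_len) and [b_base, b_base + b_len) intersect."""
    return a_base < b_base + b_len and b_base < a_base + a_len


# Assumption: unusable range as described in the DT example, not a measured value.
BAD_BASE, BAD_LEN = 0x8000000, 0x100000

for msi_base in (0x08000000, 0xa0000000):
    conflict = overlaps(msi_base, MSI_IOVA_LENGTH, BAD_BASE, BAD_LEN)
    print(hex(msi_base), "conflicts" if conflict else "ok")
```
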
Let me know if you have any other questions.

Thanks,
Shyam