[PATCH v2 01/40] iommu: Introduce Shared Virtual Addressing API
jean-philippe.brucker at arm.com
Fri Sep 7 08:45:02 PDT 2018
On 07/09/2018 09:55, Christian König wrote:
> I will take this as an opportunity to summarize some of the requirements
> we have for PASID management from the amdgpu driver point of view:
That's incredibly useful, thanks :)
> 1. We need to be able to allocate PASID between 1 and some maximum. Zero
> is reserved as far as I know, but we don't necessary need a minimum.
Should be fine. The PASID range is restricted by the PCI PASID
capability, firmware description (for non-PCI devices), the IOMMU
capacity, and what the device driver passes to iommu_sva_device_init.
Not all IOMMUs reserve PASID 0 (AMD IOMMU without GIoSup doesn't, if I'm
not mistaken), so the KFD driver will need to pass min_pasid=1 to make
sure that 0 isn't allocated.
> 2. We need to be able to allocate PASIDs without a process address space
> backing it. E.g. our hardware uses PASIDs even without Shared Virtual
> Addressing enabled to distinct clients from each other.
> Would be a pity if we need to still have a separate PASID
> handling because the system wide is only available when IOMMU is turned on.
I'm still not sure about this one. From my point of view we shouldn't
add to the IOMMU subsystem helpers for devices without an IOMMU.
iommu-sva expects everywhere that the device has an iommu_domain, it's
the first thing we check on entry. Bypassing all of this would call
idr_alloc() directly, and wouldn't have any code in common with the
current iommu-sva. So it seems like you need a layer on top of iommu-sva
calling idr_alloc() when an IOMMU isn't present, but I don't think it
should be in drivers/iommu/
> 3. Even after destruction of a process address space we need some grace
> period before a PASID is reused because it can be that the specific
> PASID is still in some hardware queues etc...
> At bare minimum all device drivers using process binding need
> to explicitly note to the core when they are done with a PASID.
Right, much of the horribleness in iommu-sva deals with this:
The process dies, iommu-sva is notified and calls the mm_exit() function
passed by the device driver to iommu_sva_device_init(). In mm_exit() the
device driver needs to clear any reference to the PASID in hardware and
in its own structures. When the device driver returns from mm_exit(), it
effectively tells the core that it has finished using the PASID, and
iommu-sva can reuse the PASID for another process. mm_exit() is allowed
to block, so the device driver has time to clean up and flush the queues.
If the device driver finishes using the PASID before the process exits,
it just calls unbind().
> 4. It would be nice to have to be able to set a "void *" for each
> PASID/device combination while binding to a process which then can be
> queried later on based on the PASID.
> E.g. when you have a per PASID/device structure around anyway,
> just add an extra field.
iommu_sva_bind_device() takes a "drvdata" pointer that is stored
internally for the PASID/device combination (iommu_bond). It is passed
to mm_exit(), but I haven't added anything for the device driver to
query it back.
> 5. It would be nice to have to allocate multiple PASIDs for the same
> process address space.
> E.g. some teams at AMD want to use a separate GPU address space
> for their userspace client library. I'm still trying to avoid that, but
> it is perfectly possible that we are going to need that.
Two PASIDs pointing to the same process pgd? At first glance it seems
feasible, maybe with a flag passed to bind() and a few changes to
internal structures. It will duplicate ATC invalidation commands for
each process address space change (munmap etc) so you might take a
Intel's SVM code has the SVM_FLAG_PRIVATE_PASID which seems similar to
what you describe, but I don't plan to support it in this series (the
io_mm model is already pretty complicated). I think it can be added
without too much effort in a future series, though with a different flag
name since we'd like to use "private PASID" for something else
> Additional to that it is sometimes quite useful for debugging
> to isolate where exactly an incorrect access (segfault) is coming from.
> Let me know if there are some problems with that, especially I want to
> know if there is pushback on #5 so that I can forward that :)
More information about the linux-arm-kernel