[PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
Jason Gunthorpe
jgg at nvidia.com
Wed Sep 7 17:43:29 PDT 2022
On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> > > FWIW, we're now very close to being able to validate dev->iommu against
> > > where the domain came from in core code, and so short-circuit ->attach_dev
> > > entirely if they don't match.
> >
> > I don't think this is a long term direction. We have systems now with
> > a number of SMMU blocks and we really are going to see a need that
> > they share the iommu_domains so we don't have unncessary overheads
> > from duplicated io page table memory.
> >
> > So ultimately I'd expect to pass the iommu_domain to the driver and
> > the driver will decide if the page table memory it represents is
> > compatible or not. Restricting to only the same iommu instance isn't
> > good..
>
> Who said IOMMU instance?
Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
I see.
> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
> already rules out bogus devices getting this far, so all a driver currently
> has to worry about is compatibility of a device that it definitely probed
> with a domain that it definitely allocated. Therefore, from a caller's point
> of view, if attaching to an existing domain returns -EINVAL, try another
> domain; multiple different existing domains can be tried, and may also
> return -EINVAL for the same or different reasons; the final attempt is to
> allocate a fresh domain and attach to that, which should always be nominally
> valid and *never* return -EINVAL. If any attempt returns any other error,
> bail out down the usual "this should have worked but something went wrong"
> path. Even if any driver did have a nonsensical "nothing went wrong, I just
> can't attach my device to any of my domains" case, I don't think it would
> really need distinguishing from any other general error anyway.
The algorithm you described is exactly what this series does, it just
used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
fundamental problem, just a bit more work.
Looking at Nicolin's series there is a bunch of existing errnos that
would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
ENXIO are all returned as codes for 'domain incompatible with device'
in various drivers. So the patch would still look much the same, just
changing them to EINVAL instead of EMEDIUMTYPE.
That leaves the question of the remaining EINVAL's that Nicolin did
not convert to EMEDIUMTYPE.
eg in the AMD driver:
if (!check_device(dev))
return -EINVAL;
iommu = rlookup_amd_iommu(dev);
if (!iommu)
return -EINVAL;
These are all cases of 'something is really wrong with the device or
iommu, everything will fail'. Other drivers are using ENODEV for this
already, so we'd probably have an additional patch changing various
places like that to ENODEV.
This mixture of error codes is the basic reason why a new code was
used, because none of the existing codes are used with any
consistency.
But OK, I'm on board, lets use more common errnos with specific
meaning, that can be documented in a comment someplace:
ENOMEM - out of memory
ENODEV - no domain can attach, device or iommu is messed up
EINVAL - the domain is incompatible with the device
<others> - Same behavior as ENODEV, use is discouraged.
I think achieving consistency of error codes is a generally desirable
goal, it makes the error code actually useful.
Joerg this is a good bit of work, will you be OK with it?
> Thus as long as we can maintain that basic guarantee that attaching
> a group to a newly allocated domain can only ever fail for resource
> allocation reasons and not some spurious "incompatibility", then we
> don't need any obscure trickery, and a single, clear, error code is
> in fact enough to say all that needs to be said.
As above, this is not the case, drivers do seem to have error paths
that are unconditional on the domain. Perhaps they are just protective
assertions and never happen.
Regardless, it doesn't matter. If they return ENODEV or EINVAL the
VFIO side algorithm will continue to work fine, it just does alot more
work if EINVAL is permanently returned.
Thanks,
Jason
More information about the linux-arm-kernel
mailing list