[PATCH 4/5] iommu: Regulate errno in ->attach_dev callback functions

Nicolin Chen nicolinc at nvidia.com
Wed Sep 14 13:55:08 PDT 2022


Hi Robin,

On Wed, Sep 14, 2022 at 08:53:07PM +0100, Robin Murphy wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 2022-09-14 18:58, Nicolin Chen wrote:
> > On Wed, Sep 14, 2022 at 10:49:42AM +0100, Jean-Philippe Brucker wrote:
> > > External email: Use caution opening links or attachments
> > > 
> > > 
> > > On Wed, Sep 14, 2022 at 06:11:06AM -0300, Jason Gunthorpe wrote:
> > > > On Tue, Sep 13, 2022 at 01:27:03PM +0100, Jean-Philippe Brucker wrote:
> > > > > I think in the future it will be too easy to forget about the constrained
> > > > > return value of attach() while modifying some other part of the driver,
> > > > > and let an external helper return EINVAL. So I'd rather not propagate ret
> > > > > from outside of viommu_domain_attach() and finalise().
> > > > 
> > > > Fortunately, if -EINVAL is wrongly returned it only creates an
> > > > inefficiency, not a functional problem. So we do not need to be
> > > > precise here.
> > > 
> > > Ah fair. In that case the attach_dev() documentation should indicate that
> > > EINVAL is a hint, so that callers don't rely on it (currently words "must"
> > > and "exclusively" indicate that returning EINVAL for anything other than
> > > device-domain incompatibility is unacceptable). The virtio-iommu
> > > implementation may well return EINVAL from the virtio stack or from the
> > > host response.
> > 
> > How about this?
> > 
> > + * * EINVAL    - mainly, device and domain are incompatible, or something went
> > + *               wrong with the domain. It's suggested to avoid kernel prints
> > + *               along with this errno. And it's better to convert any EINVAL
> > + *               returned from kAPIs to ENODEV if it is device-specific, or to
> > + *               some other reasonable errno being listed below
> 
> FWIW, I'd say something like:
> 
> "The device and domain are incompatible. If this is due to some previous
> configuration of the domain, drivers should not log an error, since it
> is legitimate for callers to test reuse of an existing domain.
> Otherwise, it may still represent some fundamental problem."

OK. I will use this narrative.

> And then at the public interfaces state it from other angle:
> 
> "The device and domain are incompatible. If the domain has already been
> used or configured in some way, attaching the same device to a different
> domain may be expected to succeed. Otherwise, it may still represent
> some fundamental problem."

I assume this should go to kdocs of iommu_attach_device/group().

> [ and to save another mail, I'm not sure copying the default comment for
> ENOSPC is all that helpful either - what is "space" for something that
> isn't a storage device? I'd guess limited hardware resources in some
> form, but in the IOMMU context, potential confusion with address space
> is maybe a little too close for comfort? ]

How about "non-ENOMEM type of resource allocation failure"?

> > > > > Since we can't guarantee that APIs like virtio or ida won't ever return
> > > > > EINVAL, we should set all return values:
> > > > 
> > > > I dislike this alot, it squashes all return codes to try to optimize
> > > > an obscure failure path :(
> > 
> > Hmm...should I revert all the driver changes back to this version?
> 
> Yeah, I don't think we need to go too mad here. Drivers shouldn't emit
> their *own* -EINVAL unless appropriate, but if it comes back from some
> external API then that implies something's gone unexpectedly wrong
> anyway - maybe it's a transient condition and a subsequent different
> attach might actually work out OK? We can't really say in general.

OK. Then there's even no need to convert EINVAL to ENODEV.

> Besides, if the driver sees an error which implies it's done something
> wrong itself, it probably shouldn't be trusted to try to reason about it
> further. The caller can handle any error as long as we set their
> expectations correctly.

Yea. As Jason remarked, a wrongly returned EINVAL would make things
a bit inefficient: VFIO/IOMMUFD would keep trying attach_dev() with
its existing domain list and a new domain but fail all of them.

I will change things in v2 back to this 2-patch version, and maybe
limit a bit further the changes in the first NODEV patch.

Thanks!
Nic



More information about the linux-arm-kernel mailing list