[PATCH 8/8] iommu: Improve map/unmap sanity checks

Jason Gunthorpe jgg at ziepe.ca
Thu Sep 14 09:48:25 PDT 2023


On Thu, Sep 14, 2023 at 03:23:46PM +0100, Robin Murphy wrote:
> > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > > > index c83f2e4c56f5..391bcae4d02d 100644
> > > > > --- a/drivers/iommu/iommu.c
> > > > > +++ b/drivers/iommu/iommu.c
> > > > > @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > > >    	phys_addr_t orig_paddr = paddr;
> > > > >    	int ret = 0;
> > > > > -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> > > > > -		return -ENODEV;
> > > > > -
> > > > >    	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
> > > > >    		return -EINVAL;
> > > > 
> > > > Why isn't this a WARN_ON? The caller is clearly lost its mind if it
> > > > is calling map on a non paging domain..
> > > 
> > > Sure it's a dumb thing to do, however I don't think we can reasonably say
> > > without question that an external caller being dumb represents an unexpected
> > > and serious loss of internal consistency.
> > 
> > WARN_ON is not just for "serious loss of internal consistency" it
> > should be use in all places where invariant are violated. We can't
> > guess why the caller is using this wrong (most likely it is UAF or
> > memory corruption), but if this fires something definately has gone
> > wrong with the kernel.
> 
> Has it? It's not *functionally* incorrect to obtain a valid domain by
> calling iommu_get_domain_for_dev(), pass that domain to iommu_map(), and
> handle any failure returned. 

This is not a theoretical question - does any in-kernel code actually
do that and expect it to work? I didn't notice any..

It doesn't matter that someone *could*, our task is not to make an
overly general API. We can, and should, have tight invarients because
it allows discovery of more bug classes, and prevents "creative" use
of APIs.

> is calling iommu_iova_to_phys() on non-paging domains, and we have to
> support that, because callers are dumb, and "callers aren't dumb" is not a
> realistic invariant which we can uphold.

Someone does that? :(

> > eg if syzkaller somehow hits this we want the WARN_ON so it reports
> > it.
> 
> But then an "IOMMU bug" is reported to us, and we say "yeah, that's not
> ours, that's some code somewhere down in the middle of that callstack being
> dumb", and then what?

The report gets handed off to the part of the callstack that is making
the problem.

> iommu_map() can already fail for numerous reasons that may or may not
> represent a bug in the caller, like the IOVA or PA being out-of-range, or
> the IOVA already being mapped, or whatever. 

IMHO some of those certainly could be WARN too, depends what you think
the API should be.

eg iommufd and vfio both are designed to never double map/use out of
aperture IOVA/etc, if they do it is a significant bug and a WARN_ON
would be welcomed.

> another reason to fail (in, as far as we're concerned, an entirely safe and
> manageable way). Callers have to be prepared to handle failure, but it's up
> to them how unexpected or serious it is; we can't second-guess them.

We can always convert things into error codes, but that can also hide
bugs if no existing caller needs that functionality. IMHO it is always
better to have tighter invarients, and be noisy when they fail. This
gives us better understandablity, bug discoverability and discourages
people from introducing new code doing crazy things.

Jason



More information about the linux-arm-kernel mailing list