[PATCH v3 3/3] iova: defer maple tree erase on GFP_ATOMIC failure
Jason Gunthorpe
jgg at ziepe.ca
Mon Jun 15 04:56:33 PDT 2026
On Fri, Jun 12, 2026 at 02:44:06PM -0400, Liam R. Howlett wrote:
> > Currently it never returns a failure to the caller. Look at mas_erase():
> >
> > entry = mas_state_walk(mas);
> > if (!entry)
> > return NULL;
> > [..]
> > if (mas_is_err(mas))
> > goto out;
> > [..]
> > out:
> > mas_destroy(mas);
> > return entry;
> >
> > There is no propogation of ENOMEM, it returns success. No caller
> > checks for any error here either.
>
> At one point this was considered to be impossible to fail, and it is
> documented to return the entry or null.
I think that is the right API design..
> I think I should probably change this to return -ENOMEM, fix the docs on
> it, and probably audit the callers (most use external locks or are
> early-boot-so-don't-worry-about-it). Any issue here should also be
> caught by lockdep pretty quickly.
Locking aside, I don't think most of the callers can handle an -ENOMEM
return at all..
> mas_store_gfp() works if you know the range of your entry. You could
> also write the XA_ZERO_ENTRY over the entry so that there is no internal
> node changes - just a value swap. If you do this, you have to be
> careful when reading things back when using the mas_ interface.
That's probably a good option. Try to store NULL, if that ENOMEM's
then drop a ZERO_ENTRY and set a bit someplace so that the next
allocation has to sweep the tree and clean the ZERO's. At least that
shifts the ENOMEM to an alloc side operation where it can be handled.
> I think, in your case, hitting an XA_ZERO_ENTRY would be necessary to
> indicate that we cannot reuse this particular location until it is
> correctly dealt with? Or is the maple tree the only reason it is
> considered unusable?
Yeah, it would be be a maple tree issue only. Defered rebalancing
leave space unavailable.
This is a case where there is no sane way to handle destroy
failure. You can't return an error code from dma_unmap() for
example. So the only reason for this complexity is because maple tree
exposes a failable erase to it's caller..
IDK, could it handle it internally somehow? Or at least document a
reasonable pattern with some helpers?
Eg _iommufd_destroy_mmap() is in trouble too, it cargo culted the
no-check mt_erase.
Jason
More information about the maple-tree
mailing list