[PATCH v3 3/3] iova: defer maple tree erase on GFP_ATOMIC failure

Liam R. Howlett liam at infradead.org
Thu Jun 18 10:27:02 PDT 2026


On 26/06/18 12:24PM, Jason Gunthorpe wrote:
> On Thu, Jun 18, 2026 at 10:50:56AM -0400, Liam R. Howlett wrote:
> 
> > > If that's the case it should be documented like this too :)
> > 
> > Yeah, very much should add some notes here... but also maybe stop them
> > from doing it by adding this to mas_erase():
> > 
> > if (mt_external_lock(mas->tree))
> >         might_alloc(GFP_KERNEL);
> 
> Yeah, and then I'd add a might_alloc() to mtree_erase() as well, it
> can always sleep..

mtree_erase() calls mas_erase(), so I think the one should be enough.

> 
> > Otherwise, I'll just be telling people they didn't read the docs.
> 
> Okay, so to summarize:
> 
> - mas_erase, mtree_erase cannot fail with ENOMEM. The check for ENOMEM
>   inside should be changed to a WARN_ON to document this.
>   Rationale: in a GFP_KERNEL context it will sleep forever until it
>   gets memory ("too small to fail").

The ENOMEM check inside mas_nomem() is still possible because we try
NOWAIT first, so it is possible that we hit a retry.

The return from mas_nomem() without allocations should be impossible.
So if (!mas->sheaf && !mas->alloc) should be a WARN_ON_ONCE().
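
To make the retry concrete, here is a rough userspace model of the
pattern, not the maple tree code.  Every name in it (model_state,
model_alloc, model_nomem, ...) is made up for illustration, and malloc()
stands in for the node allocator.  The first attempt does not block, the
nomem helper then allocates with blocking allowed and asks for a retry,
and coming back from the blocking allocation empty-handed is only
counted as a warning:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for the operation state; NOT the real struct ma_state. */
struct model_state {
	void *node;	/* preallocated memory, if any */
	bool enomem;	/* last attempt ran out of memory */
	int warnings;	/* counts "should be impossible" events */
};

/* Test knob (hypothetical): when true, non-blocking allocations fail. */
static bool model_pressure;

static void *model_alloc(bool may_block)
{
	if (!may_block && model_pressure)
		return NULL;
	return malloc(64);
}

/* One attempt: use preallocated memory, else try without blocking. */
static void model_erase_once(struct model_state *s)
{
	if (!s->node)
		s->node = model_alloc(false);
	if (!s->node) {
		s->enomem = true;
		return;
	}
	free(s->node);		/* the write consumes the node */
	s->node = NULL;
	s->enomem = false;
}

/* Returns true when the caller should retry the operation. */
static bool model_nomem(struct model_state *s)
{
	if (!s->enomem)
		return false;
	s->node = model_alloc(true);	/* blocking allowed this time */
	if (!s->node) {
		s->warnings++;		/* models the WARN_ON_ONCE() spot */
		return false;
	}
	s->enomem = false;
	return true;
}

/* Runs the operation to completion and returns the number of attempts. */
static int model_erase(struct model_state *s)
{
	int attempts = 0;

	do {
		model_erase_once(s);
		attempts++;
	} while (model_nomem(s));

	return attempts;
}
```

With model_pressure set, model_erase() takes two attempts and the
warning counter stays at zero, which is the "retry is possible, failure
is not" behaviour described above.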

I believe this is what you meant anyways?

> 
> - External locks must be sleepable, add a might_alloc() to check for
>   that and document. mtree_erase must be sleepable
> 
> - I like your idea for Rik to try to store NULL to erase, on failure
>   store ZERO_ENTRY, and then set a note on the next alloc to clean the
>   ZERO_ENTRYs?

The ZERO_ENTRY can be found by mas_* functions and will cause issues
with the gap searching.  You also have to be careful to handle these
entries in mas_for_each() loops - you don't want to treat them as valid
entries.

So cleaning them up should be handled sooner rather than later.  I'm
trying to figure out the best place to do this as I do not like the
retry with the timer idea.

Setting a bit to clear them later makes sense, but I am concerned that
leaving reserved space for a prolonged period may cause other failures?
I guess we could ensure that the tree is 'clean of reserved space' prior
to searching for space to put new entries?

We probably don't really need to worry about speed since this is an
error recovery situation that will happen rarely, so correctness is all
we need.
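
A minimal userspace sketch of the "set a bit, clean before searching"
idea, using a flat array instead of a maple tree.  All names here
(toy_tree, RESERVED, toy_erase_atomic, ...) are hypothetical, and it
assumes that overwriting an existing entry in place never needs memory,
which is the premise of falling back to a placeholder store:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 8

/* Hypothetical placeholder standing in for a reserved "zero" entry. */
static int reserved_marker;
#define RESERVED ((void *)&reserved_marker)

struct toy_tree {
	void *slot[SLOTS];
	bool needs_cleanup;	/* set when an erase had to be deferred */
	bool fail_atomic;	/* test knob: atomic allocations fail */
};

/* Models an erase that may need memory; false means simulated ENOMEM. */
static bool toy_store_null(struct toy_tree *t, size_t i, bool can_sleep)
{
	if (!can_sleep && t->fail_atomic)
		return false;
	t->slot[i] = NULL;
	return true;
}

/* Atomic-context erase: on failure leave a placeholder and note it. */
static void toy_erase_atomic(struct toy_tree *t, size_t i)
{
	assert(i < SLOTS);
	if (toy_store_null(t, i, false))
		return;
	t->slot[i] = RESERVED;	/* in-place overwrite, assumed free */
	t->needs_cleanup = true;
}

/* Sleepable context: turn every placeholder into a real erase. */
static void toy_cleanup(struct toy_tree *t)
{
	if (!t->needs_cleanup)
		return;
	for (size_t i = 0; i < SLOTS; i++)
		if (t->slot[i] == RESERVED)
			toy_store_null(t, i, true);
	t->needs_cleanup = false;
}

/* Clean first, then take the first empty slot; -1 when full. */
static int toy_alloc(struct toy_tree *t, void *entry)
{
	toy_cleanup(t);
	for (size_t i = 0; i < SLOTS; i++) {
		if (!t->slot[i]) {
			t->slot[i] = entry;
			return (int)i;
		}
	}
	return -1;
}

/* Iteration must not treat placeholders as valid entries. */
static size_t toy_count_valid(const struct toy_tree *t)
{
	size_t n = 0;

	for (size_t i = 0; i < SLOTS; i++)
		if (t->slot[i] && t->slot[i] != RESERVED)
			n++;
	return n;
}
```

Because toy_alloc() always cleans before it searches, a placeholder
never hides free space from the search, and the walk in
toy_count_valid() shows the extra check every iterator needs while
placeholders can exist.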

Thanks,
Liam


