[LSF/MM/BPF TOPIC] Removing GFP_NOFS
Kent Overstreet
kent.overstreet at linux.dev
Sun Feb 11 18:06:33 PST 2024
On Mon, Feb 12, 2024 at 12:20:32PM +1100, Dave Chinner wrote:
> On Thu, Feb 08, 2024 at 08:55:05PM +0100, Vlastimil Babka (SUSE) wrote:
> > On 2/8/24 18:33, Michal Hocko wrote:
> > > On Thu 08-02-24 17:02:07, Vlastimil Babka (SUSE) wrote:
> > >> On 1/9/24 05:47, Dave Chinner wrote:
> > >> > On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
> > >>
> > >> Your points and Kent's proposal of scoped GFP_NOWAIT [1] suggest to me this
> > >> is no longer an FS-only topic, as this isn't just about converting to the
> > >> scoped APIs, but also about how they should be improved.
> > >
> > > Scoped GFP_NOFAIL context is slightly easier from the semantic POV than
> > > scoped GFP_NOWAIT as it doesn't add a potentially unexpected failure
> > > mode. It is still tricky to deal with GFP_NOWAIT requests inside the
> > > NOFAIL scope because that makes it a non failing busy wait for an
> > > allocation if we need to insist on scope NOFAIL semantic.
> > >
> > > On the other hand we can define the behavior similar to what you
> > > propose with RETRY_MAYFAIL resp. NORETRY. Existing NOWAIT users should
> > > better handle allocation failures regardless of the external allocation
> > > scope.
> > >
> > > Overriding that scoped NOFAIL semantic with RETRY_MAYFAIL or NORETRY
> > > resembles the existing PF_MEMALLOC and GFP_NOMEMALLOC semantic and I do
> > > not see an immediate problem with that.
> > >
> > > Having more NOFAIL allocations is not great but if you need to
> > > emulate those by implementing the nofail semantic outside of the
> > > allocator then it is better to have those retries inside the allocator
> > > IMO.
> >
> > I see potential issues in scoping both NOWAIT and NOFAIL:
> >
> > - NOFAIL - I'm assuming Dave is adding __GFP_NOFAIL to xfs allocations or
> > adjacent layers where he knows they must not fail for his transaction. But
> > could the scope affect also something else underneath that could fail
> > without the failure propagating in a way that it affects xfs?
>
> Memory allocation failures below the filesystem (i.e. in the IO
> path) will fail the IO, and if that happens for a read IO within
> a transaction then it will have the same effect as XFS failing a
> memory allocation. i.e. it will shut down the filesystem.
>
> The key point here is the moment we go below the filesystem we enter
> into a new scoped allocation context with a guaranteed method of
> returning errors: NOIO and bio errors.
Hang on, you're conflating NOIO with something completely different -
NOIO means "don't recurse in reclaim", it does _not_ mean anything about
what happens when the allocation fails, and in particular it definitely
does _not_ mean that failing the allocation is going to result in an IO
error.

That's because in general most code in the IO path knows how to make
effective use of biosets and mempools (which may take some work! you
have to ensure that you're always able to make forward progress when
memory is limited, and in particular that you don't double allocate from
the same mempool if you're blocking the first allocation from
completing/freeing).
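
A minimal userspace sketch of the mempool forward-progress rule described above. This is not the kernel mempool API: `struct pool`, `pool_alloc()`, `pool_free()` and the `fail_malloc` knob are hypothetical stand-ins, and where a real mempool would sleep until an element is freed, this sketch returns NULL so the hazard is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mempool: one preallocated reserve element. */
#define POOL_RESERVE 1

struct pool {
	void *reserve[POOL_RESERVE];
	int nr_free;        /* elements currently sitting in the reserve */
	size_t elem_size;
	int fail_malloc;    /* test knob: simulate memory pressure */
};

static int pool_init(struct pool *p, size_t elem_size)
{
	p->elem_size = elem_size;
	p->fail_malloc = 0;
	p->nr_free = 0;
	for (int i = 0; i < POOL_RESERVE; i++) {
		void *e = malloc(elem_size);
		if (!e) {
			/* Undo the partial fill before reporting failure. */
			while (p->nr_free > 0)
				free(p->reserve[--p->nr_free]);
			return -1;
		}
		p->reserve[p->nr_free++] = e;
	}
	return 0;
}

/*
 * Try the normal allocator first, then fall back to the reserve.
 * NULL here means "reserve exhausted": the point where a sleeping
 * pool would block until someone calls pool_free().
 */
static void *pool_alloc(struct pool *p)
{
	void *e = p->fail_malloc ? NULL : malloc(p->elem_size);

	if (e)
		return e;
	if (p->nr_free > 0)
		return p->reserve[--p->nr_free];
	return NULL;
}

/* Refill the reserve before handing memory back to the allocator. */
static void pool_free(struct pool *p, void *e)
{
	if (p->nr_free < POOL_RESERVE)
		p->reserve[p->nr_free++] = e;
	else
		free(e);
}
```

Under memory pressure, a caller that holds one element and asks the same pool for a second gets nothing until the first is freed. If freeing the first depends on the second allocation succeeding, that is the deadlock to design around.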
> i.e. NOFAIL scopes are not relevant outside the subsystem that sets
> it. Hence we likely need helpers to clear and restore NOFAIL when
> we cross an allocation context boundary, e.g. as we cross from
> filesystem to block layer in the IO stack via submit_bio(). Maybe
> they should be doing something like:
>
> nofail_flags = memalloc_nofail_clear();
NOFAIL is not a scoped thing at all, period; it is very much a
_callsite_ specific thing, and it depends on whether that callsite has a
fallback.

The most obvious example being, as mentioned previously, mempools.
> > - NOWAIT - as said already, we need to make sure we're not turning an
> > allocation that relied on too-small-to-fail into a null pointer exception or
> > BUG_ON(!page).
>
> Agreed. NOWAIT is removing allocation failure constraints and I
> don't think that can be made to work reliably. Error injection
> cannot prove the absence of errors and so we can never be certain
> the code will always operate correctly and not crash when an
> unexpected allocation failure occurs.
You saying we don't know how to test code? Come on, that's just throwing
your hands up and giving up. We can write error injection tests that
cycle through each injection point and test them individually and then
verify that they've been tested with coverage analysis.
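
A rough userspace sketch of what cycling through injection points could look like. Everything here is hypothetical and for illustration only: `test_alloc()` stands in for an allocator with a fault injection hook, `make_pair()` is a toy function under test, and `run_injection_cycle()` fails the 1st, 2nd, ... allocation in turn until a run completes without reaching the injection point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical injection state: fail the Nth allocation (1-based), 0 = never. */
static int inject_fail_at;
static int inject_count;

static void *test_alloc(size_t size)
{
	if (++inject_count == inject_fail_at)
		return NULL;
	return malloc(size);
}

/* Toy code under test: two allocations, must unwind cleanly if either fails. */
static int make_pair(char **a, char **b)
{
	*a = test_alloc(16);
	if (!*a)
		return -1;
	*b = test_alloc(16);
	if (!*b) {
		free(*a);
		*a = NULL;
		return -1;
	}
	return 0;
}

/* Fail each allocation in turn; return how many injection points were hit. */
static int run_injection_cycle(void)
{
	int points = 0;

	for (int n = 1; ; n++) {
		char *a = NULL, *b = NULL;
		int ret;

		inject_fail_at = n;
		inject_count = 0;
		ret = make_pair(&a, &b);

		if (inject_count < n) {
			/* Nth allocation was never reached: every point is covered. */
			assert(ret == 0);
			free(a);
			free(b);
			break;
		}

		/* Injected failure: an error is reported and nothing is left allocated. */
		assert(ret == -1 && a == NULL && b == NULL);
		points++;
	}

	inject_fail_at = 0;
	return points;
}
```

Building this with a coverage tool such as gcov would then show whether each error path was actually executed, which is the second half of the argument.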
Anyways, NOWAIT is no different here from NORETRY/RETRY_MAYFAIL. We need
to be able to handle allocation failures, period...