[LSF/MM/BPF TOPIC] Removing GFP_NOFS
Vlastimil Babka (SUSE)
vbabka at kernel.org
Thu Feb 8 11:55:05 PST 2024
On 2/8/24 18:33, Michal Hocko wrote:
> On Thu 08-02-24 17:02:07, Vlastimil Babka (SUSE) wrote:
>> On 1/9/24 05:47, Dave Chinner wrote:
>> > On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
>>
>> Your points and Kent's proposal of scoped GFP_NOWAIT [1] suggest to me that
>> this is no longer an FS-only topic, as it isn't just about converting to the
>> scoped APIs, but also about how they should be improved.
>
> Scoped GFP_NOFAIL context is slightly easier from the semantic POV than
> scoped GFP_NOWAIT as it doesn't add a potentially unexpected failure
> mode. It is still tricky to deal with GFP_NOWAIT requests inside the
> NOFAIL scope because that makes it a non failing busy wait for an
> allocation if we need to insist on scope NOFAIL semantic.
>
> On the other hand we can define the behavior similar to what you
> propose with RETRY_MAYFAIL resp. NORETRY. Existing NOWAIT users should
> better handle allocation failures regardless of the external allocation
> scope.
>
> Overriding that scoped NOFAIL semantic with RETRY_MAYFAIL or NORETRY
> resembles the existing PF_MEMALLOC and GFP_NOMEMALLOC semantic and I do
> not see an immediate problem with that.
>
> Having more NOFAIL allocations is not great but if you need to
> emulate those by implementing the nofail semantic outside of the
> allocator then it is better to have those retries inside the allocator
> IMO.
I see potential issues in scoping both NOWAIT and NOFAIL:
- NOFAIL - I'm assuming Dave is adding __GFP_NOFAIL to xfs allocations or
adjacent layers where he knows they must not fail for his transaction. But
could the scope also affect something else underneath that could fail
without the failure propagating in a way that it affects xfs? Maybe it's a
high-order allocation with a low-order fallback that really should not be
__GFP_NOFAIL? We would need to hope it has something like RETRY_MAYFAIL or
NORETRY already. But maybe it just relies on >costly order being more likely
to fail implicitly, and those costly orders should be kept excluded from the
scoped NOFAIL? Maybe __GFP_NOWARN should also override the scoped nofail?
- NOWAIT - as said already, we need to make sure we're not turning an
allocation that relied on too-small-to-fail into a null pointer exception or
BUG_ON(!page). It's probably not feasible to audit everything that can be
called underneath when adding a new scoped NOWAIT. Static analysis probably
won't be powerful enough either. Kent suggested fault injection [1]. We
have the framework for a system-wide one but I don't know if anyone is
running it and how successful it is. But maybe we could have a special fault
injection mode (CONFIG_ option or something) for the NOWAIT scoped
allocations only. If everything works as expected, there are no crashes and
the pattern Kent described in [1] has a fallback that's slower but still
functional. If not, we get a report and know which place to fix, and the
testing only focuses on the relevant parts. When a new scoped NOWAIT is
added and bots/CIs running this fault injection config report no issues, we
can be reasonably sure it's fine?
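
To make the fault-injection idea concrete, here is a minimal userspace
sketch. It is only a model under stated assumptions: every sk_* name is
hypothetical and not a kernel API, the global depth counter stands in for a
per-task scope flag, and the boolean stands in for the special CONFIG_
option. The caller shows one reading of the pattern referred to in [1]: try
inside the NOWAIT scope, then fall back to a slower attempt outside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define SK_GFP_DIRECT_RECLAIM 0x01u  /* request is allowed to block */

unsigned int sk_scope_nowait_depth;  /* models a per-task scope flag */
bool sk_fail_scoped_nowait;          /* models the special CONFIG_ option */
unsigned long sk_injected_failures;  /* how many failures were injected */

void sk_nowait_scope_enter(void)
{
    sk_scope_nowait_depth++;
}

void sk_nowait_scope_exit(void)
{
    assert(sk_scope_nowait_depth > 0);  /* catch unbalanced scopes */
    sk_scope_nowait_depth--;
}

void *sk_alloc(size_t size, unsigned int gfp)
{
    if (sk_scope_nowait_depth > 0) {
        /* The scope strips the caller's permission to block. */
        gfp &= ~SK_GFP_DIRECT_RECLAIM;
        /* Fault injection targets only scoped NOWAIT allocations. */
        if (sk_fail_scoped_nowait) {
            sk_injected_failures++;
            return NULL;
        }
    }
    (void)gfp;  /* malloc() has no notion of gfp flags in this model */
    return malloc(size);
}

/* Fast attempt inside the scope, slower fallback outside of it. */
void *sk_get_buffer(size_t size, bool *used_fallback)
{
    void *p;

    sk_nowait_scope_enter();
    p = sk_alloc(size, SK_GFP_DIRECT_RECLAIM);
    sk_nowait_scope_exit();

    *used_fallback = (p == NULL);
    if (!p)
        p = sk_alloc(size, SK_GFP_DIRECT_RECLAIM);  /* may block here */
    return p;
}
```

With sk_fail_scoped_nowait set, a caller written this way still gets its
buffer through the fallback, while code that dereferences the result of the
scoped attempt without checking it would crash at the exact place that
needs fixing.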
[1]
https://lore.kernel.org/all/zup5yilebkgkrypis4g6zkbft7pywqi57k5aztoio2ufi5ujsd@mfnqu4rarort/
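
For the NOFAIL side, one possible set of override rules can be written down
as a small pure function. This is a sketch of the options discussed in this
thread, not a description of existing allocator behavior: the sk_* names and
flag values are hypothetical, SK_COSTLY_ORDER merely mirrors the kernel's
PAGE_ALLOC_COSTLY_ORDER value of 3, and whether costly orders should be
excluded (or whether __GFP_NOWARN should override as well) is still an open
question.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's gfp values. */
#define SK_GFP_DIRECT_RECLAIM 0x01u
#define SK_GFP_NOFAIL         0x02u
#define SK_GFP_NORETRY        0x04u
#define SK_GFP_RETRY_MAYFAIL  0x08u

#define SK_COSTLY_ORDER 3  /* assumption: same threshold as the kernel */

/* Returns the gfp mask an allocation would effectively run with. */
unsigned int sk_apply_nofail_scope(unsigned int gfp, unsigned int order,
                                   bool in_nofail_scope)
{
    if (!in_nofail_scope)
        return gfp;

    /* An explicit "may fail" request overrides the scope. */
    if (gfp & (SK_GFP_NORETRY | SK_GFP_RETRY_MAYFAIL))
        return gfp;

    /* NOWAIT requests must handle failure anyway; avoid a busy wait. */
    if (!(gfp & SK_GFP_DIRECT_RECLAIM))
        return gfp;

    /* Costly orders usually have a low-order fallback; keep them out. */
    if (order > SK_COSTLY_ORDER)
        return gfp;

    return gfp | SK_GFP_NOFAIL;
}
```

Under these rules an order-0 blocking allocation inside the scope gains
NOFAIL, while a NORETRY, RETRY_MAYFAIL, non-blocking, or costly-order
request is passed through unchanged.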
>> [1] http://lkml.kernel.org/r/Zbu_yyChbCO6b2Lj@tiehlicka
>>
>> > We already have memalloc_noreclaim_{save/restore}() for turning off
>> > direct memory reclaim for a given context (i.e. equivalent of
>> > clearing __GFP_DIRECT_RECLAIM), so if we are going to embrace scoped
>> > allocation contexts, then we should be going all in and providing
>> > all the contexts that filesystems actually need....
>> >
>> > -Dave.
>