[LSF/MM/BPF TOPIC] Removing GFP_NOFS
Dave Chinner
david at fromorbit.com
Mon Jan 8 20:47:39 PST 2024
On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
> This is primarily a _FILESYSTEM_ track topic. All the work has already
> been done on the MM side; the FS people need to do their part. It could
> be a joint session, but I'm not sure there's much for the MM people
> to say.
>
> There are situations where we need to allocate memory, but cannot call
> into the filesystem to free memory. Generally this is because we're
> holding a lock or we've started a transaction, and attempting to write
> out dirty folios to reclaim memory would result in a deadlock.
>
> The old way to solve this problem is to specify GFP_NOFS when allocating
> memory. This conveys little information about what is being protected
> against, and so it is hard to know when it might be safe to remove.
> It's also a reflex -- many filesystem authors use GFP_NOFS by default
> even when they could use GFP_KERNEL because there's no risk of deadlock.
>
> The new way is to use the scoped APIs -- memalloc_nofs_save() and
> memalloc_nofs_restore(). These should be called when we start a
> transaction or take a lock that would cause a GFP_KERNEL allocation to
> deadlock. Then just use GFP_KERNEL as normal. The memory allocators
> can see the nofs situation is in effect and will not call back into
> the filesystem.
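
For anyone who hasn't used the scoped API, here is a minimal userspace
model of the save/restore pattern described above. It is a sketch, not
the kernel implementation: every nofs_model_* / NOFS_MODEL_* name and
every bit value is made up for illustration, and a plain global stands
in for the per-task flags word.

```c
#include <assert.h>

/* Stand-in bit values for this model only; not the kernel's definitions. */
#define NOFS_MODEL_GFP_IO      0x1u
#define NOFS_MODEL_GFP_FS      0x2u
#define NOFS_MODEL_GFP_KERNEL  (NOFS_MODEL_GFP_IO | NOFS_MODEL_GFP_FS)
#define NOFS_MODEL_PF_NOFS     0x1u

/* Models the per-task flags word; a single global is enough for a sketch. */
static unsigned int nofs_model_task_flags;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
static unsigned int nofs_model_save(void)
{
	unsigned int old = nofs_model_task_flags & NOFS_MODEL_PF_NOFS;

	nofs_model_task_flags |= NOFS_MODEL_PF_NOFS;
	return old;
}

/* Leave a NOFS scope, restoring whatever the matching save returned. */
static void nofs_model_restore(unsigned int old)
{
	assert((old & ~NOFS_MODEL_PF_NOFS) == 0);
	nofs_model_task_flags =
		(nofs_model_task_flags & ~NOFS_MODEL_PF_NOFS) | old;
}

/* What the allocator side does: strip the FS bit while a scope is active. */
static unsigned int nofs_model_effective_gfp(unsigned int gfp)
{
	if (nofs_model_task_flags & NOFS_MODEL_PF_NOFS)
		gfp &= ~NOFS_MODEL_GFP_FS;
	return gfp;
}
```

The point of returning the old state from save is that an inner
restore does not end an outer scope, so callers keep passing the
GFP_KERNEL equivalent everywhere and never need to know whether they
are nested.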
So in rebasing the XFS kmem.[ch] removal patchset I've been working
on, it is clear that there is another memory allocator behaviour that
we need to be scoped: __GFP_NOFAIL.
All of the allocations done through the existing XFS kmem.[ch]
interfaces (i.e. just about everything) have __GFP_NOFAIL semantics
added except in the explicit cases where we add KM_MAYFAIL to
indicate that the allocation can fail.
The result of this conversion to remove GFP_NOFS is that I'm also
adding *dozens* of __GFP_NOFAIL annotations because the kmem.[ch]
wrappers effectively scope that behaviour for us today.
Hence I think this discussion needs to consider that __GFP_NOFAIL is
also widely used within critical filesystem code that cannot
gracefully recover from memory allocation failures, and that this
would also be useful to scope....
Yeah, I know, mm developers hate __GFP_NOFAIL. We've been using
NOFAIL semantics in XFS for over two decades and the sky hasn't
fallen. So can we get memalloc_nofail_{save,restore}() so that we
can change the default allocation behaviour in certain contexts
(e.g. the same contexts we need NOFS allocations) to be NOFAIL
unless __GFP_RETRY_MAYFAIL or __GFP_NORETRY are set?
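
To make the request concrete, here is a rough userspace model of the
semantics I'm asking for. Nothing here exists in the kernel: the
nofail_model_* helpers, the task flag and all the bit values are
hypothetical and only illustrate the rule "default to NOFAIL inside
the scope unless the caller opted out".

```c
#include <assert.h>

/* Stand-in bit values for this model only; not the kernel's definitions. */
#define NOFAIL_MODEL_GFP_NOFAIL         0x1u
#define NOFAIL_MODEL_GFP_RETRY_MAYFAIL  0x2u
#define NOFAIL_MODEL_GFP_NORETRY        0x4u
#define NOFAIL_MODEL_PF_NOFAIL          0x1u	/* hypothetical task flag */

/* Models the per-task flags word; a single global is enough for a sketch. */
static unsigned int nofail_model_task_flags;

/* Enter a NOFAIL scope; returns the previous state so scopes can nest. */
static unsigned int nofail_model_save(void)
{
	unsigned int old = nofail_model_task_flags & NOFAIL_MODEL_PF_NOFAIL;

	nofail_model_task_flags |= NOFAIL_MODEL_PF_NOFAIL;
	return old;
}

/* Leave a NOFAIL scope, restoring whatever the matching save returned. */
static void nofail_model_restore(unsigned int old)
{
	assert((old & ~NOFAIL_MODEL_PF_NOFAIL) == 0);
	nofail_model_task_flags =
		(nofail_model_task_flags & ~NOFAIL_MODEL_PF_NOFAIL) | old;
}

/*
 * Inside a scope, add NOFAIL by default. A caller that explicitly
 * passes RETRY_MAYFAIL or NORETRY has said it can handle failure,
 * so its mask is left untouched.
 */
static unsigned int nofail_model_effective_gfp(unsigned int gfp)
{
	unsigned int may_fail = NOFAIL_MODEL_GFP_RETRY_MAYFAIL |
				NOFAIL_MODEL_GFP_NORETRY;

	if ((nofail_model_task_flags & NOFAIL_MODEL_PF_NOFAIL) &&
	    !(gfp & may_fail))
		gfp |= NOFAIL_MODEL_GFP_NOFAIL;
	return gfp;
}
```

With something like this, the explicit per-call-site annotations go
away: only the allocations that can tolerate failure need to say so.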
We already have memalloc_noreclaim_{save,restore}() for turning off
direct memory reclaim for a given context (i.e. the equivalent of
clearing __GFP_DIRECT_RECLAIM), so if we are going to embrace scoped
allocation contexts, then we should be going all in and providing
all the contexts that filesystems actually need....
-Dave.
--
Dave Chinner
david at fromorbit.com