[Lsf-pc] [LSF/MM/BPF TOPIC] Removing GFP_NOFS
David Sterba
dsterba at suse.cz
Mon Jan 8 09:39:28 PST 2024
On Mon, Jan 08, 2024 at 11:47:11AM +0000, Johannes Thumshirn wrote:
> On 05.01.24 11:57, Jan Kara wrote:
> > Hello,
> >
> > On Thu 04-01-24 21:17:16, Matthew Wilcox wrote:
> >> This is primarily a _FILESYSTEM_ track topic. All the work has already
> >> been done on the MM side; the FS people need to do their part. It could
> >> be a joint session, but I'm not sure there's much for the MM people
> >> to say.
> >>
> >> There are situations where we need to allocate memory, but cannot call
> >> into the filesystem to free memory. Generally this is because we're
> >> holding a lock or we've started a transaction, and attempting to write
> >> out dirty folios to reclaim memory would result in a deadlock.
> >>
> >> The old way to solve this problem is to specify GFP_NOFS when allocating
> >> memory. This conveys little information about what is being protected
> >> against, and so it is hard to know when it might be safe to remove.
> >> It's also a reflex -- many filesystem authors use GFP_NOFS by default
> >> even when they could use GFP_KERNEL because there's no risk of deadlock.
> >>
> >> The new way is to use the scoped APIs -- memalloc_nofs_save() and
> >> memalloc_nofs_restore(). These should be called when we start a
> >> transaction or take a lock that would cause a GFP_KERNEL allocation to
> >> deadlock. Then just use GFP_KERNEL as normal. The memory allocators
> >> can see the nofs situation is in effect and will not call back into
> >> the filesystem.
> >>
> >> This results in better code within your filesystem as you don't need to
> >> pass around gfp flags as much, and can lead to better performance from
> >> the memory allocators as GFP_NOFS will not be used unnecessarily.
> >>
> >> The memalloc_nofs APIs were introduced in May 2017, but we still have
> >> over 1000 uses of GFP_NOFS in fs/ today (and 200 outside fs/, which is
> >> really sad). This session is for filesystem developers to talk about
> >> what they need to do to fix up their own filesystem, or share stories
> >> about how they made their filesystem better by adopting the new APIs.
> 199 - btrfs
All the easy conversions to scoped nofs allocations have been done; the
rest require saving the nofs state at transaction start, as described
above. I have a WIP series for that, updated every few releases, but
it's intrusive and not yet ready for a testing run. The series has over
100 patches, with each conversion done separately; the other generic
changes are straightforward.
It's possible to do it incrementally. There's one monster patch (300
edited lines) that adds a stub parameter to transaction start:
https://lore.kernel.org/linux-btrfs/20211018173803.18353-1-dsterba@suse.com/ .
There are some counterpoints in the discussion about whether it has to
be done like that, but IIRC doing it any other way is not possible; I
have examples why not.
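
For readers unfamiliar with the pattern, here is a rough userspace sketch
of what "saving the nofs state at transaction start" could look like. It
is not taken from the btrfs series and is not kernel code: every model_*
name, the flag values and the transaction struct are made up for
illustration. It only mimics the save/restore semantics of
memalloc_nofs_save() and memalloc_nofs_restore() as described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative values only; these are NOT the kernel's definitions. */
#define MODEL_GFP_FS            0x1u
#define MODEL_GFP_KERNEL        (MODEL_GFP_FS | 0x2u)
#define MODEL_PF_MEMALLOC_NOFS  0x1u

/* Stands in for the per-task flags word (hypothetical). */
static _Thread_local unsigned int model_task_flags;

/* Enter a nofs scope; return the previous nofs bit so scopes can nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a nofs scope by putting back the bit returned by the save. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* What the allocator would do: drop the FS bit while a scope is active. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* Hypothetical transaction handle carrying the saved nofs state. */
struct model_trans {
	unsigned int nofs_saved;
};

/* Starting a transaction opens the nofs scope. */
static struct model_trans *model_trans_start(void)
{
	struct model_trans *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->nofs_saved = model_nofs_save();
	return t;
}

/* Ending a transaction closes the scope it opened. */
static void model_trans_end(struct model_trans *t)
{
	assert(t != NULL);
	model_nofs_restore(t->nofs_saved);
	free(t);
}
```

Between model_trans_start() and model_trans_end(), callers pass the
plain MODEL_GFP_KERNEL mask and the allocator side strips the FS bit on
its own. Because the handle stores the previous state, a nested
start/end pair leaves the outer scope in effect.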