[LSF/MM/BPF TOPIC] Removing GFP_NOFS

Matthew Wilcox willy at infradead.org
Thu Jan 4 13:17:16 PST 2024


This is primarily a _FILESYSTEM_ track topic.  All the work has already
been done on the MM side; the FS people need to do their part.  It could
be a joint session, but I'm not sure there's much for the MM people
to say.

There are situations where we need to allocate memory, but cannot call
into the filesystem to free memory.  Generally this is because we're
holding a lock or we've started a transaction, and attempting to write
out dirty folios to reclaim memory would result in a deadlock.

The old way to solve this problem is to specify GFP_NOFS when allocating
memory.  This conveys little information about what is being protected
against, and so it is hard to know when it might be safe to remove.
It's also a reflex -- many filesystem authors use GFP_NOFS by default
even when they could use GFP_KERNEL because there's no risk of deadlock.

The new way is to use the scoped APIs -- memalloc_nofs_save() and
memalloc_nofs_restore().  These should be called when we start a
transaction or take a lock that would cause a GFP_KERNEL allocation to
deadlock.  Then just use GFP_KERNEL as normal.  The memory allocators
can see the nofs situation is in effect and will not call back into
the filesystem.
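
To make the scoping concrete, here is a small user-space model of the
mechanism.  It is only a sketch, not kernel code: every toy_/TOY_ name
and every flag value below is made up for illustration.  The real
memalloc_nofs_save() and memalloc_nofs_restore() live in
<linux/sched/mm.h> and operate on current->flags; the model only mirrors
their save/restore shape, including the way nested scopes unwind.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for gfp bits; the values are arbitrary, not the kernel's. */
#define TOY_GFP_IO           0x1u
#define TOY_GFP_FS           0x2u
#define TOY_GFP_KERNEL       (TOY_GFP_IO | TOY_GFP_FS)

/* Toy stand-in for the per-task "nofs scope is active" bit. */
#define TOY_PF_MEMALLOC_NOFS 0x1u

/* Models current->flags.  Per-task in the kernel; one global here. */
static unsigned int toy_task_flags;

/* Enter a nofs scope and return the previous state so scopes can nest. */
static unsigned int toy_memalloc_nofs_save(void)
{
    unsigned int old = toy_task_flags & TOY_PF_MEMALLOC_NOFS;

    toy_task_flags |= TOY_PF_MEMALLOC_NOFS;
    return old;
}

/* Leave a nofs scope by putting back whatever state save() returned. */
static void toy_memalloc_nofs_restore(unsigned int old)
{
    toy_task_flags = (toy_task_flags & ~TOY_PF_MEMALLOC_NOFS) | old;
}

/* What the allocator does: drop the FS bit while a scope is active. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
    if (toy_task_flags & TOY_PF_MEMALLOC_NOFS)
        gfp &= ~TOY_GFP_FS;
    return gfp;
}

/* Would reclaim be allowed to call back into the filesystem? */
static bool toy_may_enter_fs(unsigned int gfp)
{
    return (toy_effective_gfp(gfp) & TOY_GFP_FS) != 0;
}

/* A helper deep in the call chain.  It always asks for GFP_KERNEL. */
static bool toy_helper_alloc_may_enter_fs(void)
{
    return toy_may_enter_fs(TOY_GFP_KERNEL);
}

/* Models "start a transaction, allocate inside it, end the transaction". */
static void toy_transaction(void)
{
    unsigned int nofs = toy_memalloc_nofs_save();

    /* Inside the scope the helper's GFP_KERNEL cannot recurse into the fs. */
    assert(!toy_helper_alloc_may_enter_fs());

    toy_memalloc_nofs_restore(nofs);
}
```

The helper never sees a gfp argument and never mentions NOFS; the scope
opened by its caller is what keeps the allocation from re-entering the
filesystem.  Because save returns the previous state, an inner scope
that ends does not accidentally end the outer one.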

This results in better code within your filesystem as you don't need to
pass around gfp flags as much, and can lead to better performance from
the memory allocators as GFP_NOFS will not be used unnecessarily.

The memalloc_nofs APIs were introduced in May 2017, but we still have
over 1000 uses of GFP_NOFS in fs/ today (and 200 outside fs/, which is
really sad).  This session is for filesystem developers to talk about
what they need to do to fix up their own filesystem, or share stories
about how they made their filesystem better by adopting the new APIs.

My interest in this is that I'd like to get rid of the FGP_NOFS flag.
It'd also be good to get rid of the __GFP_FS flag since there's always
demand for more GFP flags.  I have a git branch with some work in this
area, so there's a certain amount of conference-driven development going
on here too.

We could do the same, mutatis mutandis, for GFP_NOIO,
memalloc_noio_save/restore, __GFP_IO, etc, so maybe the block people are
also interested.  I haven't looked into that in any detail though.  I guess
we'll see what interest this topic gains.


