[PATCH v8 04/23] slab: add sheaf support for batching kfree_rcu() operations
Harry Yoo
harry.yoo at oracle.com
Thu Nov 27 04:48:41 PST 2025
On Thu, Nov 27, 2025 at 09:33:46PM +0900, Harry Yoo wrote:
> On Thu, Nov 27, 2025 at 11:38:49AM +0000, Jon Hunter wrote:
> >
> >
> > On 31/10/2025 21:32, Daniel Gomez wrote:
> > >
> > >
> > > On 10/09/2025 10.01, Vlastimil Babka wrote:
> > > > Extend the sheaf infrastructure for more efficient kfree_rcu() handling.
> > > > For caches with sheaves, on each cpu maintain an rcu_free sheaf in
> > > > addition to main and spare sheaves.
> > > >
> > > > kfree_rcu() operations will try to put objects on this sheaf. Once full,
> > > > the sheaf is detached and submitted to call_rcu() with a handler that
> > > > will try to put it in the barn, or flush to slab pages using bulk free,
> > > > when the barn is full. Then a new empty sheaf must be obtained to put
> > > > more objects there.
> > > >
> > > > It's possible that no free sheaves are available to use for a new
> > > > rcu_free sheaf, and the allocation in kfree_rcu() context can only use
> > > > GFP_NOWAIT and thus may fail. In that case, fall back to the existing
> > > > kfree_rcu() implementation.
> > > >
> > > > Expected advantages:
> > > > - batching the kfree_rcu() operations, that could eventually replace the
> > > > existing batching
> > > > - sheaves can be reused for allocations via barn instead of being
> > > > flushed to slabs, which is more efficient
> > > > - this includes cases where only some cpus are allowed to process rcu
> > > > callbacks (Android)
> > > >
> > > > Possible disadvantage:
> > > > - objects might be waiting for more than their grace period (it is
> > > > determined by the last object freed into the sheaf), increasing memory
> > > > usage - but the existing batching does that too.
> > > >
> > > > Only implement this for CONFIG_KVFREE_RCU_BATCHED as the tiny
> > > > implementation favors smaller memory footprint over performance.
> > > >
> > > > Also for now skip the usage of rcu sheaf for CONFIG_PREEMPT_RT as the
> > > > contexts where kfree_rcu() is called might not be compatible with taking
> > > > a barn spinlock or a GFP_NOWAIT allocation of a new sheaf taking a
> > > > spinlock - the current kfree_rcu() implementation avoids doing that.
> > > >
> > > > Teach kvfree_rcu_barrier() to flush all rcu_free sheaves from all caches
> > > > that have them. This is not a cheap operation, but the barrier usage is
> > > > rare - currently kmem_cache_destroy() or on module unload.
> > > >
> > > > Add CONFIG_SLUB_STATS counters free_rcu_sheaf and free_rcu_sheaf_fail to
> > > > count how many kfree_rcu() used the rcu_free sheaf successfully and how
> > > > many had to fall back to the existing implementation.
> > > >
> > > > Signed-off-by: Vlastimil Babka <vbabka at suse.cz>
> > >
> > > Hi Vlastimil,
> > >
> > > This patch increases kmod selftest (stress module loader) runtime by
> > > about 50-60%, from ~200s to ~300s total execution time. My tested kernel
> > > has CONFIG_KVFREE_RCU_BATCHED enabled. Any ideas or suggestions on what
> > > might be causing this, or how to address it?
> > >
> >
> > I have been looking into a regression in Linux v6.18-rc where the time taken
> > to run some internal graphics tests on our Tegra234 device has increased by
> > around 35%, causing the tests to time out. Bisect is pointing to this commit
> > and I also see we have CONFIG_KVFREE_RCU_BATCHED=y.
>
> Thanks for reporting! Uh, this has been put aside while I was busy working
> on other stuff... but now that we have two people complaining about this,
> I'll allocate some time to investigate and improve it.
>
> It'll take some time though :)
By the way, how many CPUs do you have on your system, and does your
kernel have CONFIG_CODE_TAGGING enabled?
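
For readers following the thread, the flow described in the quoted commit
message (fill the rcu_free sheaf, detach it when full, hand it to the barn
or flush it after the grace period, fall back when no empty sheaf is
available) can be modeled in userspace roughly as below. This is only a
sketch under stated assumptions, not the kernel code: it models a single
CPU, and struct model, model_init(), model_kfree_rcu(),
model_grace_period(), SHEAF_CAPACITY, BARN_MAX_FULL and MAX_SHEAVES are
all hypothetical names and values chosen for illustration. The grace
period is simulated by an explicit function call rather than call_rcu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHEAF_CAPACITY 4  /* hypothetical; objects per sheaf */
#define BARN_MAX_FULL  2  /* hypothetical; full sheaves the barn accepts */
#define MAX_SHEAVES    8  /* hypothetical; upper bound for this model */

struct sheaf {
	size_t size;
	void *objects[SHEAF_CAPACITY];
};

struct model {
	struct sheaf *rcu_free;             /* sheaf currently being filled */
	struct sheaf *empty[MAX_SHEAVES];   /* empty sheaves available now */
	size_t nr_empty;
	struct sheaf *pending[MAX_SHEAVES]; /* detached, awaiting grace period */
	size_t nr_pending;
	size_t barn_full;                   /* full sheaves kept for reuse */
	size_t flushed_objects;             /* objects bulk-freed to slabs */
	size_t stat_free_rcu_sheaf;         /* mirrors free_rcu_sheaf */
	size_t stat_free_rcu_sheaf_fail;    /* mirrors free_rcu_sheaf_fail */
};

void model_init(struct model *m, struct sheaf *sheaves, size_t n)
{
	assert(n <= MAX_SHEAVES);
	*m = (struct model){0};
	for (size_t i = 0; i < n; i++) {
		sheaves[i].size = 0;
		m->empty[m->nr_empty++] = &sheaves[i];
	}
}

/*
 * Returns true if obj was batched on the rcu_free sheaf, false if the
 * caller would have to fall back to the existing kfree_rcu() path.
 */
bool model_kfree_rcu(struct model *m, void *obj)
{
	if (!m->rcu_free) {
		/* Stands in for a non-blocking attempt that may fail. */
		if (m->nr_empty == 0) {
			m->stat_free_rcu_sheaf_fail++;
			return false;
		}
		m->rcu_free = m->empty[--m->nr_empty];
	}

	struct sheaf *s = m->rcu_free;

	s->objects[s->size++] = obj;
	m->stat_free_rcu_sheaf++;

	if (s->size == SHEAF_CAPACITY) {
		/* Detach the full sheaf; the real code submits it to call_rcu(). */
		m->pending[m->nr_pending++] = s;
		m->rcu_free = NULL;
	}
	return true;
}

/* Simulated grace period end: run the handler for every pending sheaf. */
void model_grace_period(struct model *m)
{
	for (size_t i = 0; i < m->nr_pending; i++) {
		struct sheaf *s = m->pending[i];

		if (m->barn_full < BARN_MAX_FULL) {
			/* Barn has room: keep the objects for later allocations. */
			m->barn_full++;
		} else {
			/* Barn is full: bulk-free, then reuse the empty sheaf
			 * (a simplification made by this model). */
			m->flushed_objects += s->size;
			s->size = 0;
			m->empty[m->nr_empty++] = s;
		}
	}
	m->nr_pending = 0;
}
```

In this model no object leaves a sheaf until model_grace_period() runs,
so every object in a sheaf waits for the last one freed into it, which is
the disadvantage the commit message mentions. It does not model
kvfree_rcu_barrier() or multiple CPUs, so it says nothing about the
reported slowdown.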
--
Cheers,
Harry / Hyeonggon