[PATCH 1/1] sched/debug: fix dentry leak in update_sched_domain_debugfs
Kuyo Chang
kuyo.chang at mediatek.com
Thu Sep 1 23:40:59 PDT 2022
On Fri, 2022-09-02 at 07:26 +0200, Greg Kroah-Hartman wrote:
> On Fri, Sep 02, 2022 at 11:15:15AM +0800, Kuyo Chang wrote:
> > From: kuyo chang <kuyo.chang at mediatek.com>
> >
> > [Symptom]
> > The lowmemorykiller is triggered during a CPU hotplug stress test
> > that repeatedly runs the command below:
> > echo [0/1] > /sys/devices/system/cpu/cpu${index}/online
> >
> > [Root cause]
> > After a 4-hour hotplug stress test, the call trace of the slab owner
> > and its usage (below) show a dentry leak in
> > update_sched_domain_debugfs().
> >
> > Total size : 322000KB
> > <prep_new_page+44>:
> > <get_page_from_freelist+672>:
> > <__alloc_pages+304>:
> > <allocate_slab+144>:
> > <___slab_alloc+404>:
> > <__slab_alloc+60>:
> > <kmem_cache_alloc+1204>:
> > <alloc_inode+100>:
> > <new_inode+40>:
> > <__debugfs_create_file+172>:
> > <update_sched_domain_debugfs+824>:
> > <partition_sched_domains_locked+1292>:
> > <rebuild_sched_domains_locked+576>:
> > <cpuset_hotplug_workfn+1052>:
> > <process_one_work+584>:
> > <worker_thread+1008>:
> >
> > [Solution]
> > Provided by Major Chen <major.chen at samsung.com> at the link below.
> >
> > https://lore.kernel.org/lkml/20220711030341epcms5p173848e98b13c09eb2fcdf2fd7287526a@epcms5p1/
> > update_sched_domain_debugfs() uses debugfs_lookup() to find the
> > wanted dentry (which was created earlier by debugfs_create_dir()),
> > but does not call dput() to return that dentry. This results in a
> > dentry leak even though debugfs_remove() is called.
> >
> > [Test result]
> > Use the commands below to check for inode_cache and dentry leaks:
> > cat /proc/slabinfo | grep -w inode_cache
> > cat /proc/slabinfo | grep -w dentry
> >
> > With the patch, the inode_cache and dentry counts stay stable,
> > so the lowmemorykiller is no longer triggered.
> >
> > Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs")
> >
> > Signed-off-by: Major Chen <major.chen at samsung.com>
> > Signed-off-by: kuyo chang <kuyo.chang at mediatek.com>
> > Tested-by: kuyo chang <kuyo.chang at mediatek.com>
> >
> > ---
> > kernel/sched/debug.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > index bb3d63bdf4ae..4ffea2dc01da 100644
> > --- a/kernel/sched/debug.c
> > +++ b/kernel/sched/debug.c
> > @@ -412,11 +412,14 @@ void update_sched_domain_debugfs(void)
> >
> >  	for_each_cpu(cpu, sd_sysctl_cpus) {
> >  		struct sched_domain *sd;
> > -		struct dentry *d_cpu;
> > +		struct dentry *d_cpu, *d_lookup;
> >  		char buf[32];
> >
> >  		snprintf(buf, sizeof(buf), "cpu%d", cpu);
> > -		debugfs_remove(debugfs_lookup(buf, sd_dentry));
> > +		d_lookup = debugfs_lookup(buf, sd_dentry);
> > +		debugfs_remove(d_lookup);
> > +		if (!IS_ERR_OR_NULL(d_lookup))
> > +			dput(d_lookup);
>
> That's odd, and means that something else is removing this file right
> after we looked it up? Is there a missing lock here that should be
> used instead?
>
> thanks,
>
> greg k-h
During CPU hotplug, cpu_active_mask changes, so
update_sched_domain_debugfs() has to be called.
The original design recreates the entries under sd_dentry, so it calls
debugfs_remove() and then debugfs_create_dir().
However, according to the documented usage of debugfs_lookup(), the
returned dentry must be passed to dput() when it is no longer needed;
otherwise the dentry leaks.
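
To illustrate the reference counting involved, here is a minimal
userspace sketch. It is a toy model only, not kernel code: every toy_*
name is hypothetical, and it assumes that a lookup takes one extra
reference while a remove drops only the reference held by the parent
directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	int refcount;	/* one ref held by the parent dir while linked, plus one per lookup */
	bool linked;	/* still reachable from the parent directory */
	bool freed;	/* set once the last reference is dropped */
};

/* Models dput(): drop one reference, free on the last one. */
static void toy_put(struct toy_dentry *d)
{
	if (!d)
		return;
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->freed = true;
}

/* Models debugfs_create_dir(): the directory tree holds one reference. */
static void toy_create(struct toy_dentry *d)
{
	d->refcount = 1;
	d->linked = true;
	d->freed = false;
}

/* Models debugfs_lookup(): returns the entry with an extra reference. */
static struct toy_dentry *toy_lookup(struct toy_dentry *d)
{
	if (!d->linked)
		return NULL;
	d->refcount++;
	return d;
}

/* Models debugfs_remove(): unlinks and drops only the tree's reference. */
static void toy_remove(struct toy_dentry *d)
{
	if (!d || !d->linked)
		return;
	d->linked = false;
	toy_put(d);
}

/* Pattern before the patch: the lookup reference is never returned. */
static bool toy_remove_leaky(struct toy_dentry *d)
{
	toy_remove(toy_lookup(d));
	return d->freed;
}

/* Pattern after the patch: the lookup reference is returned with a put. */
static bool toy_remove_fixed(struct toy_dentry *d)
{
	struct toy_dentry *found = toy_lookup(d);

	toy_remove(found);
	toy_put(found);
	return d->freed;
}
```

In this model, toy_remove_leaky() leaves the entry with one outstanding
reference, so it is never freed, while toy_remove_fixed() drops the
count to zero. Repeating the leaky pattern on every hotplug event would
accumulate entries in the same way the slab trace above suggests.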