[PATCH 1/1] sched/debug: fix dentry leak in update_sched_domain_debugfs

Greg Kroah-Hartman gregkh at linuxfoundation.org
Thu Sep 1 22:26:31 PDT 2022


On Fri, Sep 02, 2022 at 11:15:15AM +0800, Kuyo Chang wrote:
> From: kuyo chang <kuyo.chang at mediatek.com>
> 
> [Syndrome]
> The low memory killer is triggered while running a CPU hotplug stress test with the following command:
> echo [0/1] > /sys/devices/system/cpu/cpu${index}/online
> 
> [Root cause]
> The call trace of the slab owner and its usage after the hotplug stress
> test (4 hours) is shown below.
> There is a dentry leak in update_sched_domain_debugfs().
> 
> Total size : 322000KB
> <prep_new_page+44>:
> <get_page_from_freelist+672>:
> <__alloc_pages+304>:
> <allocate_slab+144>:
> <___slab_alloc+404>:
> <__slab_alloc+60>:
> <kmem_cache_alloc+1204>:
> <alloc_inode+100>:
> <new_inode+40>:
> <__debugfs_create_file+172>:
> <update_sched_domain_debugfs+824>:
> <partition_sched_domains_locked+1292>:
> <rebuild_sched_domains_locked+576>:
> <cpuset_hotplug_workfn+1052>:
> <process_one_work+584>:
> <worker_thread+1008>:
> 
> [Solution]
> Provided by Major Chen <major.chen at samsung.com>; see the link below.
> https://lore.kernel.org/lkml/20220711030341epcms5p173848e98b13c09eb2fcdf2fd7287526a@epcms5p1/
> update_sched_domain_debugfs() uses debugfs_lookup() to find the wanted dentry (which was
> created earlier by debugfs_create_dir()), but does not call dput() to release the
> reference. This results in a dentry leak even though debugfs_remove() is called.
> 
> [Test result]
> Use the commands below to check for inode_cache and dentry leaks.
> cat /proc/slabinfo | grep -w inode_cache
> cat /proc/slabinfo | grep -w dentry
> 
> With the patch, the inode_cache and dentry slab counts stay stable,
> so the low memory killer is no longer triggered.
> 
> Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs")
> 
> Signed-off-by: Major Chen <major.chen at samsung.com>
> Signed-off-by: kuyo chang <kuyo.chang at mediatek.com>
> Tested-by: kuyo chang <kuyo.chang at mediatek.com>
> 
> ---
>  kernel/sched/debug.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index bb3d63bdf4ae..4ffea2dc01da 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -412,11 +412,14 @@ void update_sched_domain_debugfs(void)
>  
>  	for_each_cpu(cpu, sd_sysctl_cpus) {
>  		struct sched_domain *sd;
> -		struct dentry *d_cpu;
> +		struct dentry *d_cpu, *d_lookup;
>  		char buf[32];
>  
>  		snprintf(buf, sizeof(buf), "cpu%d", cpu);
> -		debugfs_remove(debugfs_lookup(buf, sd_dentry));
> +		d_lookup = debugfs_lookup(buf, sd_dentry);
> +		debugfs_remove(d_lookup);
> +		if (!IS_ERR_OR_NULL(d_lookup))
> +			dput(d_lookup);

That's odd, and means that something else is removing this file right
after we looked it up?  Is there a missing lock here that should be used
instead?
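
For reference, the reference counting that the changelog describes can be
modelled in user space. This is only a rough sketch and not kernel code: every
toy_* name is hypothetical, and it assumes (as the changelog states) that a
successful lookup hands the caller its own reference, while removal only drops
the reference taken when the entry was created.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry: only the reference count matters here. */
struct toy_dentry {
	int refcount;	/* the object counts as freed once this reaches 0 */
	bool hashed;	/* still reachable by a name lookup */
};

/* Models creation: the filesystem pins the new entry with one reference. */
static void toy_create(struct toy_dentry *d)
{
	d->refcount = 1;
	d->hashed = true;
}

/* Models a lookup: on success the caller gets an extra reference. */
static struct toy_dentry *toy_lookup(struct toy_dentry *d)
{
	if (!d->hashed)
		return NULL;
	d->refcount++;
	return d;
}

/* Models removal: unhash and drop only the creation-time reference. */
static void toy_remove(struct toy_dentry *d)
{
	if (!d || !d->hashed)
		return;
	d->hashed = false;
	d->refcount--;
}

/* Models dropping the reference a lookup handed out. */
static void toy_put(struct toy_dentry *d)
{
	if (d)
		d->refcount--;
}

/* Returns the references left after lookup + remove, with an optional put. */
static int toy_refs_after_remove(bool put_after_lookup)
{
	struct toy_dentry d;
	struct toy_dentry *found;

	toy_create(&d);
	found = toy_lookup(&d);
	toy_remove(found);
	if (put_after_lookup)
		toy_put(found);
	return d.refcount;
}
```

In this model toy_refs_after_remove(false) leaves one reference outstanding,
which corresponds to the leak in the report, while toy_refs_after_remove(true)
drops the count to zero.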

thanks,

greg k-h


