PROBLEM: LTP cfs_bandwidth01 test triggers SCHED_WARN_ON after de-selecting CONFIG_SMP

Fri Aug 18 02:35:14 PDT 2023

Hi all,

We are using upstream buildroot (master branch) to reproduce this problem.
The defconfig we're using is qemu_riscv64_virt_defconfig and the kernel version is 6.4.

After a little bit of git bisecting, we believe the following commit causes the issue,
and that reverting it fixes the problem.

We are not familiar with the CFS code, so we are wondering whether reverting this patch is the
right thing to do, or whether we should just keep CONFIG_SMP enabled.

Does anybody have any comments?

================= This commit is somewhere between v5.18 and v5.19-rc1 =======================
commit 0a00a354644ee1800d31c47cf5927b9b50272fac
Author: Chengming Zhou <zhouchengming at bytedance.com>
Date:   Fri Apr 8 19:53:09 2022 +0800

    sched/fair: Delete useless condition in tg_unthrottle_up()

    We have tested cfs_rq->load.weight in cfs_rq_is_decayed(),
    the first condition "!cfs_rq_is_decayed(cfs_rq)" is enough
    to cover the second condition "cfs_rq->nr_running".

    Signed-off-by: Chengming Zhou <zhouchengming at bytedance.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz at infradead.org>
    Reviewed-by: Ben Segall <bsegall at google.com>
    Reviewed-by: Vincent Guittot <vincent.guittot at linaro.org>
    Link: https://lore.kernel.org/r/20220408115309.81603-2-zhouchengming@bytedance.com

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f74b34080c9a..3eba0dcc4962 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4850,7 +4850,7 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
                                             cfs_rq->throttled_clock_pelt;

                /* Add cfs_rq with load or one or more already running entities to the list */
-               if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
+               if (!cfs_rq_is_decayed(cfs_rq))
                        list_add_leaf_cfs_rq(cfs_rq);
        }
===============================================================================================
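
To illustrate why the two conditions in the diff above might not be equivalent
without CONFIG_SMP, here is a minimal user-space sketch. It is not kernel code.
It ASSUMES that on a !CONFIG_SMP build cfs_rq_is_decayed() is a stub that always
returns true; we have not confirmed this for every kernel version. The names
toy_cfs_rq, toy_cfs_rq_is_decayed, should_add_leaf_old and should_add_leaf_new
are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Toy stand-in for struct cfs_rq; only the field the condition reads. */
struct toy_cfs_rq {
	unsigned int nr_running;
};

/*
 * ASSUMPTION: models a !CONFIG_SMP build in which cfs_rq_is_decayed()
 * is a stub that always reports "decayed". Not copied from the kernel.
 */
bool toy_cfs_rq_is_decayed(const struct toy_cfs_rq *cfs_rq)
{
	(void)cfs_rq;
	return true;
}

/* Condition in tg_unthrottle_up() before the commit quoted above. */
bool should_add_leaf_old(const struct toy_cfs_rq *cfs_rq)
{
	return !toy_cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running;
}

/* Condition in tg_unthrottle_up() after the commit quoted above. */
bool should_add_leaf_new(const struct toy_cfs_rq *cfs_rq)
{
	return !toy_cfs_rq_is_decayed(cfs_rq);
}
```

Under that assumption, a cfs_rq with nr_running = 1 passes the old condition
but not the new one, so list_add_leaf_cfs_rq() would no longer be called for it.
That would be consistent with the warning below, but it is only a guess on our side.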

The steps to reproduce are as follows:
$ cd buildroot
$ make qemu_riscv64_virt_defconfig
$ make menuconfig					## choose 6.4 kernel and choose LTP testsuite
$ make
$ make linux-menuconfig				## de-select CONFIG_SMP
$ make linux-rebuild
$ ./output/images/start-qemu.sh

...
Welcome to Buildroot
buildroot login: root
# /usr/lib/ltp-testsuite/testcases/bin/cfs_bandwidth01
tst_kconfig.c:87: TINFO: Parsing kernel config '/proc/config.gz'
tst_cgroup.c:679: TINFO: Mounted V2 CGroups on /tmp/cgroup_unified
tst_cgroup.c:737: TINFO: Mounted V1 cpu CGroup on /tmp/cgroup_cpu
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 50s
cfs_bandwidth01.c:54: TINFO: Set 'worker1/cpu.max' = '3000 10000'
cfs_bandwidth01.c:54: TINFO: Set 'worker2/cpu.max' = '2000 10000'
cfs_bandwidth01.c:54: TINFO: Set 'worker3/cpu.max' = '3000 10000'
cfs_bandwidth01.c:117: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:54: TINFO: Set 'level2/cpu.max' = '5000 10000'
[   16.625892] ------------[ cut here ]------------
[   16.626169] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
[   16.626337] WARNING: CPU: 0 PID: 0 at kernel/sched/fair.c:437 unthrottle_cfs_rq+0x3b4/0x3b8
[   16.626781] Modules linked in:
[   16.626988] CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0 #2
[   16.627205] Hardware name: riscv-virtio,qemu (DT)
[   16.627368] epc : unthrottle_cfs_rq+0x3b4/0x3b8
[   16.627511]  ra : unthrottle_cfs_rq+0x3b4/0x3b8
[   16.627640] epc : ffffffff80031f3e ra : ffffffff80031f3e sp : ffffffff81003b10
[   16.627816]  gp : ffffffff810e1078 tp : ffffffff8100d5c0 t0 : ffffffff8101a960
[   16.627989]  t1 : 0720072007200720 t2 : 2d2d2d2d2d2d2d2d s0 : ffffffff81003b90
[   16.628162]  s1 : 0000000000000000 a0 : 000000000000002d a1 : ffffffff810872b8
[   16.628328]  a2 : 0000000000000010 a3 : 0000000000000001 a4 : 0000000000000000
[   16.628498]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 000000000000002d
[   16.628667]  s2 : ffffffff81016170 s3 : ff60000002232c00 s4 : 0000000000000000
[   16.628853]  s5 : ffffffff81016140 s6 : 0000000000000002 s7 : 0000000000000001
[   16.629021]  s8 : 0000000000000002 s9 : 0000000000000001 s10: 0000000000113833
[   16.629190]  s11: 0000000000989680 t3 : ff60000001218f00 t4 : ff60000001218f00
[   16.629367]  t5 : ff60000001218000 t6 : ffffffff810038f8
[   16.629493] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
[   16.629756] [<ffffffff80032030>] distribute_cfs_runtime+0xee/0x12a
[   16.629946] [<ffffffff800321ee>] sched_cfs_period_timer+0xdc/0x1e6
[   16.630101] [<ffffffff80055432>] __hrtimer_run_queues.constprop.0+0x12a/0x1b0
[   16.630272] [<ffffffff80055e6e>] hrtimer_interrupt+0xe0/0x1f2
[   16.630411] [<ffffffff804c59d2>] riscv_timer_interrupt+0x1c/0x26
[   16.630557] [<ffffffff80044a62>] handle_percpu_devid_irq+0x50/0xd6
[   16.630703] [<ffffffff800409c8>] generic_handle_domain_irq+0x1c/0x2a
[   16.630855] [<ffffffff8030ac0a>] riscv_intc_irq+0x2e/0x46
[   16.630990] [<ffffffff80634dba>] generic_handle_arch_irq+0x34/0x4e
[   16.631139] [<ffffffff80003280>] ret_from_exception+0x0/0xc
[   16.631343] ---[ end trace 0000000000000000 ]---
cfs_bandwidth01.c:129: TPASS: Workers exited
tst_test.c:1601: TFAIL: Kernel is now tainted.

Best regards,
Leo